[PATCH 00/10] Tweaks to follow guidelines in style guide

* [PATCH 00/10] Tweaks to follow guidelines in style guide
@ 2017-10-05 15:47 Akira Yokosawa
  2017-10-05 15:48 ` [PATCH 01/10] debugging: Insert narrow space in front of percent symbol Akira Yokosawa
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Akira Yokosawa @ 2017-10-05 15:47 UTC
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

From 2890e0069882321553c16aac213d4bb8d0a06fb7 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Wed, 5 Oct 2017 22:57:46 +0900
Subject: [PATCH 00/10] Tweaks to follow guidelines in style guide

Hi Paul,

This patch set consists of minor tweaks that apply suggestions which
have been in the style guide for a while.

Patches #1 -- #5 are trivial changes.

Patch #6 attempts to improve consistency in referring to POWER-series
CPUs by defining a macro "\Power{}".

Patch #7 substitutes "GCC" for "gcc". There are a few exceptions, as
mentioned in the commit log.

Patch #8 substitutes "IRQ" for "irq" in the same way. You might like
to skip this one, as I see "irq" more often than "IRQ" in Linux
documentation.

Patch #9 is somewhat invasive. It switches the Times font to that of
the "newtxtext" and "newtxmath" packages. The reason for the change
is to have access to both upright and slanted glyphs of Greek letters.
Recent versions of these font packages give better-looking results,
especially in math mode. As noted in the commit log, newtxmath in
TeX Live 2013/Debian has a few issues which have been fixed in later
versions. It also switches the font choice for the experimental target
"1csf".

Patch #10 updates the style guide to reflect the changes made in this
patch set.

        Thanks, Akira
--
Akira Yokosawa (10):
  debugging: Insert narrow space in front of percent symbol
  debugging: Use upright font for Euler's number
  future/QC: Insert narrow space in front of percent symbol
  future/QC: Use non-breakable hyphen for axis names
  treewide: Insert narrow space in front of percent symbol
  treewide: Use \Power{} macro for POWER CPU family
  treewide: Call GNU C compiler as "GCC"
  treewide: Use "IRQ" instead of "irq" used as abbreviation
  future/QC: Use upright glyph for math constant and descriptive suffix
  styleguide: Reflect recent style improvements

 FAQ-BUILD.txt                      |   4 +-
 Makefile                           |   2 +-
 SMPdesign/SMPdesign.tex            |   2 +-
 SMPdesign/beyond.tex               |  14 ++---
 advsync/advsync.tex                |   2 +-
 appendix/styleguide/styleguide.tex |  64 ++++++++---------------
 appendix/toyrcu/toyrcu.tex         |  26 +++++-----
 count/count.tex                    |  28 +++++-----
 cpu/hwfreelunch.tex                |   4 +-
 datastruct/datastruct.tex          |   2 +-
 debugging/debugging.tex            | 104 ++++++++++++++++++-------------------
 defer/rcuapi.tex                   |   6 +--
 defer/rcuusage.tex                 |   4 +-
 formal/dyntickrcu.tex              |  52 +++++++++----------
 formal/formal.tex                  |   2 +-
 formal/spinhint.tex                |   2 +-
 future/QC.tex                      |  92 ++++++++++++++++----------------
 future/htm.tex                     |   2 +-
 future/tm.tex                      |   4 +-
 intro/intro.tex                    |   8 +--
 memorder/memorder.tex              |  32 ++++++------
 perfbook.tex                       |  20 +++++--
 rt/rt.tex                          |  22 ++++----
 toolsoftrade/toolsoftrade.tex      |  24 ++++-----
 24 files changed, 257 insertions(+), 265 deletions(-)

-- 
2.7.4




* [PATCH 01/10] debugging: Insert narrow space in front of percent symbol
  2017-10-05 15:47 [PATCH 00/10] Tweaks to follow guidelines in style guide Akira Yokosawa
@ 2017-10-05 15:48 ` Akira Yokosawa
  2017-10-05 15:49 ` [PATCH 02/10] debugging: Use upright font for Euler's number Akira Yokosawa
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Akira Yokosawa @ 2017-10-05 15:48 UTC
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

From b67d6df0f0621907e81f419784f6b63b09619e9a Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sat, 30 Sep 2017 16:20:34 +0900
Subject: [PATCH 01/10] debugging: Insert narrow space in front of percent symbol

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
 debugging/debugging.tex | 82 ++++++++++++++++++++++++-------------------------
 1 file changed, 41 insertions(+), 41 deletions(-)

diff --git a/debugging/debugging.tex b/debugging/debugging.tex
index 0199720..5747656 100644
--- a/debugging/debugging.tex
+++ b/debugging/debugging.tex
@@ -1025,19 +1025,19 @@ We therefore start with discrete tests.
 \subsection{Statistics for Discrete Testing}
 \label{sec:debugging:Statistics for Discrete Testing}

-Suppose that the bug had a 10\% chance of occurring in
+Suppose that the bug had a 10\,\% chance of occurring in
 a given run and that we do five runs.
 How do we compute that probability of at least one run failing?
 One way is as follows:

 \begin{enumerate}
-\item	Compute the probability of a given run succeeding, which is 90\%.
+\item	Compute the probability of a given run succeeding, which is 90\,\%.
 \item	Compute the probability of all five runs succeeding, which
-	is 0.9 raised to the fifth power, or about 59\%.
+	is 0.9 raised to the fifth power, or about 59\,\%.
 \item	There are only two possibilities: either all five runs succeed,
 	or at least one fails.
 	Therefore, the probability of at least one failure is
-	59\% taken away from 100\%, or 41\%.
+	59\,\% taken away from 100\,\%, or 41\,\%.
 \end{enumerate}

 However, many people find it easier to work with a formula than a series
@@ -1060,7 +1060,7 @@ The probability of failure is $1-S_n$, or:
 \QuickQuiz{}
 	Say what???
 	When I plug the earlier example of five tests each with a
-	10\% failure rate into the formula, I get 59,050\% and that
+	10\,\% failure rate into the formula, I get 59,050\,\% and that
 	just doesn't make sense!!!
 \QuickQuizAnswer{
 	You are right, that makes no sense at all.
@@ -1068,27 +1068,27 @@ The probability of failure is $1-S_n$, or:
 	Remember that a probability is a number between zero and one,
 	so that you need to divide a percentage by 100 to get a
 	probability.
-	So 10\% is a probability of 0.1, which gets a probability
-	of 0.4095, which rounds to 41\%, which quite sensibly
+	So 10\,\% is a probability of 0.1, which gets a probability
+	of 0.4095, which rounds to 41\,\%, which quite sensibly
 	matches the earlier result.
 } \QuickQuizEnd

-So suppose that a given test has been failing 10\% of the time.
-How many times do you have to run the test to be 99\% sure that
+So suppose that a given test has been failing 10\,\% of the time.
+How many times do you have to run the test to be 99\,\% sure that
 your supposed fix has actually improved matters?

 Another way to ask this question is ``How many times would we need
-to run the test to cause the probability of failure to rise above 99\%?''
+to run the test to cause the probability of failure to rise above 99\,\%?''
 After all, if we were to run the test enough times that the probability
-of seeing at least one failure becomes 99\%, if there are no failures,
-there is only 1\% probability of this being due to dumb luck.
+of seeing at least one failure becomes 99\,\%, if there are no failures,
+there is only 1\,\% probability of this being due to dumb luck.
 And if we plug $f=0.1$ into
 Equation~\ref{eq:debugging:Binomial Failure Rate} and vary $n$,
-we find that 43 runs gives us a 98.92\% chance of at least one test failing
-given the original 10\% per-test failure rate,
-while 44 runs gives us a 99.03\% chance of at least one test failing.
+we find that 43 runs gives us a 98.92\,\% chance of at least one test failing
+given the original 10\,\% per-test failure rate,
+while 44 runs gives us a 99.03\,\% chance of at least one test failing.
 So if we run the test on our fix 44 times and see no failures, there
-is a 99\% probability that our fix was actually a real improvement.
+is a 99\,\% probability that our fix was actually a real improvement.

 But repeatedly plugging numbers into
 Equation~\ref{eq:debugging:Binomial Failure Rate}
@@ -1110,7 +1110,7 @@ Finally the number of tests required is given by:
 Plugging $f=0.1$ and $F_n=0.99$ into
 Equation~\ref{eq:debugging:Binomial Number of Tests Required}
 gives 43.7, meaning that we need 44 consecutive successful test
-runs to be 99\% certain that our fix was a real improvement.
+runs to be 99\,\% certain that our fix was a real improvement.
 This matches the number obtained by the previous method, which
 is reassuring.

@@ -1135,9 +1135,9 @@ is reassuring.
 Figure~\ref{fig:debugging:Number of Tests Required for 99 Percent Confidence Given Failure Rate}
 shows a plot of this function.
 Not surprisingly, the less frequently each test run fails, the more
-test runs are required to be 99\% confident that the bug has been
+test runs are required to be 99\,\% confident that the bug has been
 fixed.
-If the bug caused the test to fail only 1\% of the time, then a
+If the bug caused the test to fail only 1\,\% of the time, then a
 mind-boggling 458 test runs are required.
 As the failure probability decreases, the number of test runs required
 increases, going to infinity as the failure probability goes to zero.
@@ -1145,18 +1145,18 @@ increases, going to infinity as the failure probability goes to zero.
 The moral of this story is that when you have found a rarely occurring
 bug, your testing job will be much easier if you can come up with
 a carefully targeted test with a much higher failure rate.
-For example, if your targeted test raised the failure rate from 1\%
-to 30\%, then the number of runs required for 99\% confidence
+For example, if your targeted test raised the failure rate from 1\,\%
+to 30\,\%, then the number of runs required for 99\,\% confidence
 would drop from 458 test runs to a mere thirteen test runs.

-But these thirteen test runs would only give you 99\% confidence that
+But these thirteen test runs would only give you 99\,\% confidence that
 your fix had produced ``some improvement''.
-Suppose you instead want to have 99\% confidence that your fix reduced
+Suppose you instead want to have 99\,\% confidence that your fix reduced
 the failure rate by an order of magnitude.
 How many failure-free test runs are required?

-An order of magnitude improvement from a 30\% failure rate would be
-a 3\% failure rate.
+An order of magnitude improvement from a 30\,\% failure rate would be
+a 3\,\% failure rate.
 Plugging these numbers into
 Equation~\ref{eq:debugging:Binomial Number of Tests Required} yields:

@@ -1178,14 +1178,14 @@ Section~\ref{sec:debugging:Hunting Heisenbugs}.
 But suppose that you have a continuous test that fails about three
 times every ten hours, and that you fix the bug that you believe was
 causing the failure.
-How long do you have to run this test without failure to be 99\% certain
+How long do you have to run this test without failure to be 99\,\% certain
 that you reduced the probability of failure?

 Without doing excessive violence to statistics, we could simply
-redefine a one-hour run to be a discrete test that has a 30\%
+redefine a one-hour run to be a discrete test that has a 30\,\%
 probability of failure.
 Then the results of in the previous section tell us that if the test
-runs for 13 hours without failure, there is a 99\% probability that
+runs for 13 hours without failure, there is a 99\,\% probability that
 our fix actually improved the program's reliability.

 A dogmatic statistician might not approve of this approach, but the
@@ -1216,10 +1216,10 @@ this book~\cite{McKenney2014ParallelProgramming-e1}.
 Let's try reworking the example from
 Section~\ref{sec:debugging:Abusing Statistics for Discrete Testing}
 using the Poisson distribution.
-Recall that this example involved a test with a 30\% failure rate per
+Recall that this example involved a test with a 30\,\% failure rate per
 hour, and that the question was how long the test would need to run
 error-free
-on a alleged fix to be 99\% certain that the fix actually reduced the
+on a alleged fix to be 99\,\% certain that the fix actually reduced the
 failure rate.
 In this case, $\lambda$ is zero, so that
 Equation~\ref{eq:debugging:Poisson Probability} reduces to:
@@ -1236,17 +1236,17 @@ to 0.01 and solving for $\lambda$, resulting in:
 \end{equation}

 Because we get $0.3$ failures per hour, the number of hours required
-is $4.6/0.3 = 14.3$, which is within 10\% of the 13 hours
+is $4.6/0.3 = 14.3$, which is within 10\,\% of the 13 hours
 calculated using the method in
 Section~\ref{sec:debugging:Abusing Statistics for Discrete Testing}.
-Given that you normally won't know your failure rate to within 10\%,
+Given that you normally won't know your failure rate to within 10\,\%,
 this indicates that the method in
 Section~\ref{sec:debugging:Abusing Statistics for Discrete Testing}
 is a good and sufficient substitute for the Poisson distribution in
 a great many situations.

 More generally, if we have $n$ failures per unit time, and we want to
-be P\% certain that a fix reduced the failure rate, we can use the
+be P\,\% certain that a fix reduced the failure rate, we can use the
 following formula:

 \begin{equation}
@@ -1257,7 +1257,7 @@ following formula:
 \QuickQuiz{}
 	Suppose that a bug causes a test failure three times per hour
 	on average.
-	How long must the test run error-free to provide 99.9\%
+	How long must the test run error-free to provide 99.9\,\%
 	confidence that the fix significantly reduced the probability
 	of failure?
 \QuickQuizAnswer{
@@ -1268,7 +1268,7 @@ following formula:
 		T = - \frac{1}{3} \log \frac{100 - 99.9}{100} = 2.3
 	\end{equation}

-	If the test runs without failure for 2.3 hours, we can be 99.9\%
+	If the test runs without failure for 2.3 hours, we can be 99.9\,\%
 	certain that the fix reduced the probability of failure.
 } \QuickQuizEnd

@@ -1616,7 +1616,7 @@ delay might be counted as a near miss.\footnote{
 For example, a low-probability bug in RCU priority boosting occurred
 roughly once every hundred hours of focused rcutorture testing.
 Because it would take almost 500 hours of failure-free testing to be
-99\% certain that the bug's probability had been significantly reduced,
+99\,\% certain that the bug's probability had been significantly reduced,
 the \co{git bisect} process
 to find the failure would be painfully slow---or would require an extremely
 large test farm.
@@ -1782,12 +1782,12 @@ much a bug as is incorrectness.
 	Although I do heartily salute your spirit and aspirations,
 	you are forgetting that there may be high costs due to delays
 	in the program's completion.
-	For an extreme example, suppose that a 40\% performance shortfall
+	For an extreme example, suppose that a 40\,\% performance shortfall
 	from a single-threaded application is causing one person to die
 	each day.
 	Suppose further that in a day you could hack together a
 	quick and dirty
-	parallel program that ran 50\% faster on an eight-CPU system
+	parallel program that ran 50\,\% faster on an eight-CPU system
 	than the sequential version, but that an optimal parallel
 	program would require four months of painstaking design, coding,
 	debugging, and tuning.
@@ -2265,7 +2265,7 @@ This script takes three optional arguments as follows:
 \item	[\tco{--relerr}\nf{:}] Relative measurement error.  The script assumes
 	that values that differ by less than this error are for all
 	intents and purposes equal.
-	This defaults to 0.01, which is equivalent to 1\%.
+	This defaults to 0.01, which is equivalent to 1\,\%.
 \item	[\tco{--trendbreak}\nf{:}] Ratio of inter-element spacing constituting
 	a break in the trend of the data.
 	For example, if the average spacing in the data accepted so far
@@ -2322,7 +2322,7 @@ Lines~44-52 then compute and print the statistics for the data set.
 \QuickQuizAnswer{
 	Because mean and standard deviation were not designed to do this job.
 	To see this, try applying mean and standard deviation to the
-	following data set, given a 1\% relative error in measurement:
+	following data set, given a 1\,\% relative error in measurement:

 	\begin{quote}
 		49,548.4 49,549.4 49,550.2 49,550.9 49,550.9 49,551.0
@@ -2452,7 +2452,7 @@ about a billion instances throughout the world?
 In that case, a bug that would be encountered once every million years
 will be encountered almost three times per day across the installed
 base.
-A test with a 50\% chance of encountering this bug in a one-hour run
+A test with a 50\,\% chance of encountering this bug in a one-hour run
 would need to increase that bug's probability of occurrence by more
 than nine orders of magnitude, which poses a severe challenge to
 today's testing methodologies.
-- 
2.7.4
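
For readers unfamiliar with the convention applied in this patch, the
following is a minimal sketch of the "\,\%" form. The document class
and the sentence wording are illustrative assumptions rather than
perfbook source; the arithmetic is the five-run example from the
patched text.

```latex
\documentclass{article}
\begin{document}
% A thin space (\,) separates the number from the percent sign.
% \, inserts a kern, so no line break can occur between the two.
Suppose that the bug had a 10\,\% chance of occurring in a given run
and that we do five runs.
The probability of at least one failure is then
\begin{equation}
  1 - (1 - 0.1)^5 = 1 - 0.59049 \approx 0.41 ,
\end{equation}
or about 41\,\%.
\end{document}
```

Writing "10\%" instead would typeset the percent sign directly
against the number, which is the form this patch replaces.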




* [PATCH 02/10] debugging: Use upright font for Euler's number
  2017-10-05 15:47 [PATCH 00/10] Tweaks to follow guidelines in style guide Akira Yokosawa
  2017-10-05 15:48 ` [PATCH 01/10] debugging: Insert narrow space in front of percent symbol Akira Yokosawa
@ 2017-10-05 15:49 ` Akira Yokosawa
  2017-10-05 15:51 ` [PATCH 03/10] future/QC: Insert narrow space in front of percent symbol Akira Yokosawa
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Akira Yokosawa @ 2017-10-05 15:49 UTC
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

From 69881c2d7792c59d8dfbed3a799186a95ef835fb Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sat, 30 Sep 2017 17:37:45 +0900
Subject: [PATCH 02/10] debugging: Use upright font for Euler's number

Also use \ln for natural logarithm.

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
 debugging/debugging.tex | 24 ++++++++++++------------
 perfbook.tex            |  2 ++
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/debugging/debugging.tex b/debugging/debugging.tex
index 5747656..7a5f71d 100644
--- a/debugging/debugging.tex
+++ b/debugging/debugging.tex
@@ -1116,7 +1116,7 @@ is reassuring.

 \QuickQuiz{}
 	In Equation~\ref{eq:debugging:Binomial Number of Tests Required},
-	are the logarithms base-10, base-2, or base-$e$?
+	are the logarithms base-10, base-2, or base-$\euler$?
 \QuickQuizAnswer{
 	It does not matter.
 	You will get the same answer no matter what base of logarithms
@@ -1201,7 +1201,7 @@ The fundamental formula for failure probabilities is the Poisson
 distribution:

 \begin{equation}
-	F_m = \frac{\lambda^m}{m!} e^{-\lambda}
+	F_m = \frac{\lambda^m}{m!} \euler^{-\lambda}
 \label{eq:debugging:Poisson Probability}
 \end{equation}

@@ -1225,14 +1225,14 @@ In this case, $\lambda$ is zero, so that
 Equation~\ref{eq:debugging:Poisson Probability} reduces to:

 \begin{equation}
-	F_0 =  e^{-\lambda}
+	F_0 =  \euler^{-\lambda}
 \end{equation}

 Solving this requires setting $F_0$
 to 0.01 and solving for $\lambda$, resulting in:

 \begin{equation}
-	\lambda = - \log 0.01 = 4.6
+	\lambda = - \ln 0.01 = 4.6
 \end{equation}

 Because we get $0.3$ failures per hour, the number of hours required
@@ -1246,11 +1246,11 @@ is a good and sufficient substitute for the Poisson distribution in
 a great many situations.

 More generally, if we have $n$ failures per unit time, and we want to
-be P\,\% certain that a fix reduced the failure rate, we can use the
+be $P$\,\% certain that a fix reduced the failure rate, we can use the
 following formula:

 \begin{equation}
-	T = - \frac{1}{n} \log \frac{100 - P}{100}
+	T = - \frac{1}{n} \ln \frac{100 - P}{100}
 \label{eq:debugging:Error-Free Test Duration}
 \end{equation}

@@ -1287,14 +1287,14 @@ Equation~\ref{eq:debugging:Poisson Probability} as follows:

 \begin{equation}
 	F_0 + F_1 + \dots + F_{m - 1} + F_m =
-		\sum_{i=0}^m \frac{\lambda^i}{i!} e^{-\lambda}
+		\sum_{i=0}^m \frac{\lambda^i}{i!} \euler^{-\lambda}
 \end{equation}

 This is the Poisson cumulative distribution function, which can be
 written more compactly as:

 \begin{equation}
-	F_{i \le m} = \sum_{i=0}^m \frac{\lambda^i}{i!} e^{-\lambda}
+	F_{i \le m} = \sum_{i=0}^m \frac{\lambda^i}{i!} \euler^{-\lambda}
 \label{eq:debugging:Possion CDF}
 \end{equation}

@@ -1341,18 +1341,18 @@ that the fix actually had some relationship to the bug.\footnote{
 	Indeed it should.
 	And it does.

-	To see this, note that $e^{-\lambda}$ does not depend on $i$,
+	To see this, note that $\euler^{-\lambda}$ does not depend on $i$,
 	which means that it can be pulled out of the summation as follows:

 	\begin{equation}
-		e^{-\lambda} \sum_{i=0}^\infty \frac{\lambda^i}{i!}
+		\euler^{-\lambda} \sum_{i=0}^\infty \frac{\lambda^i}{i!}
 	\end{equation}

 	The remaining summation is exactly the Taylor series for
-	$e^\lambda$, yielding:
+	$\euler^\lambda$, yielding:

 	\begin{equation}
-		e^{-\lambda} e^\lambda
+		\euler^{-\lambda} \euler^\lambda
 	\end{equation}

 	The two exponentials are reciprocals, and therefore cancel,
diff --git a/perfbook.tex b/perfbook.tex
index 84e48eb..da9cfa8 100644
--- a/perfbook.tex
+++ b/perfbook.tex
@@ -137,6 +137,8 @@
 \newcommand{\nf}[1]{\textnormal{#1}} % to return to normal font
 \newcommand{\qop}[1]{{\sffamily #1}} % QC operator such as H, T, S, etc.

+\DeclareRobustCommand{\euler}{\ensuremath{\mathrm{e}}}
+
 \newcommand{\Epigraph}[2]{\epigraphhead[65]{\rmfamily\epigraph{#1}{#2}}}

 \input{ushyphex} % Hyphenation exceptions for US English from hyphenex package
-- 
2.7.4
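
As a quick way to try the macro outside the book, here is a minimal
sketch of a standalone document. The \euler definition is copied from
the perfbook.tex hunk above; the document class and the example
sentences are assumptions made for illustration only.

```latex
\documentclass{article}
% Same definition as the one this patch adds to perfbook.tex.
\DeclareRobustCommand{\euler}{\ensuremath{\mathrm{e}}}
\begin{document}
% In math mode, \euler yields an upright "e" rather than an italic one.
Setting $F_0 = \euler^{-\lambda}$ to 0.01 and solving for $\lambda$ gives
\begin{equation}
  \lambda = - \ln 0.01 \approx 4.6 .
\end{equation}
% \ensuremath also lets the macro be used in running text.
Euler's number \euler{} is the base of the natural logarithm.
\end{document}
```

Because the macro wraps its body in \ensuremath, the same spelling
works both inside and outside math mode, and \ln makes the base of
the logarithm explicit where \log left it ambiguous.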




* [PATCH 03/10] future/QC: Insert narrow space in front of percent symbol
  2017-10-05 15:47 [PATCH 00/10] Tweaks to follow guidelines in style guide Akira Yokosawa
  2017-10-05 15:48 ` [PATCH 01/10] debugging: Insert narrow space in front of percent symbol Akira Yokosawa
  2017-10-05 15:49 ` [PATCH 02/10] debugging: Use upright font for Euler's number Akira Yokosawa
@ 2017-10-05 15:51 ` Akira Yokosawa
  2017-10-05 15:52 ` [PATCH 04/10] future/QC: Use non-breakable hyphen for axis names Akira Yokosawa
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Akira Yokosawa @ 2017-10-05 15:51 UTC
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

From 85760945ceceff79e0d43156ad428043141c38b0 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sat, 30 Sep 2017 18:01:14 +0900
Subject: [PATCH 03/10] future/QC: Insert narrow space in front of percent symbol

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
 future/QC.tex | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/future/QC.tex b/future/QC.tex
index e5ff74e..437a4d4 100644
--- a/future/QC.tex
+++ b/future/QC.tex
@@ -309,10 +309,10 @@ A qubit is said to:
 \item	Collapse to a zero ($\ket{0}$) or a one ($\ket{1}$) if measured,
 	with probability being a function of the relative distance from
 	$\ket{0}$ and $\ket{1}$, but projected onto the Z-axis.
-	Thus, a qubit on the equator of the Bloch sphere has a 50\%
+	Thus, a qubit on the equator of the Bloch sphere has a 50\,\%
 	probability of being measured as a one or as a zero, while
 	a qubit on the 45\textdegree-north latitude would have
-	a 14\% chance of being measured as one and 86\% chance
+	a 14\,\% chance of being measured as one and 86\,\% chance
 	of being measured as zero.
 	This situation naturally causes developers to prefer a line
 	segment---or a classic-computing bit---over a sphere.
@@ -335,7 +335,7 @@ are as follows:
 	positive X-axis intersects the Bloch sphere, and rotates $\ket{1}$
 	to the point at which the negative X-axis intersects the Bloch
 	sphere.
-	Either way, we get a qubit that is 50\% one and 50\% zero.
+	Either way, we get a qubit that is 50\,\% one and 50\,\% zero.
 \item[\qop{S}\,:]
 	Rotate 90\degree{} ($\frac{\pi}{2}$ radians) about the
 	Bloch-sphere Z-axis, which has no effect on qubits in the
@@ -1260,9 +1260,9 @@ be extremely valuable in reducing costs (and environmental impacts)
 of logistics, but current classic heuristics can find near-optimal
 solutions for hundreds of cities~\cite{Martin:1992:LMC:2307953.2308141}
 and polynomial-time algorithms that are guaranteed to find routes
-that are no more than 40\% longer than optimal for arbitrarily
+that are no more than 40\,\% longer than optimal for arbitrarily
 large numbers of cities~\cite{Sebo:2014:STN:2688265.2688281},
-improving on the 50\% bound located a few decades
+improving on the 50\,\% bound located a few decades
 earlier~\cite{NicosChristofides1976TSP-FiftyPercent}.
 As of 2006 TSP solvers were finding optimal solutions to
 85,900-city problems~\cite{DLApplegate2007TSPtextbook}.
-- 
2.7.4




* [PATCH 04/10] future/QC: Use non-breakable hyphen for axis names
  2017-10-05 15:47 [PATCH 00/10] Tweaks to follow guidelines in style guide Akira Yokosawa
                   ` (2 preceding siblings ...)
  2017-10-05 15:51 ` [PATCH 03/10] future/QC: Insert narrow space in front of percent symbol Akira Yokosawa
@ 2017-10-05 15:52 ` Akira Yokosawa
  2017-10-05 15:53 ` [PATCH 05/10] treewide: Insert narrow space in front of percent symbol Akira Yokosawa
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Akira Yokosawa @ 2017-10-05 15:52 UTC
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

From 827ccdeef99cec23f937b92720992f38b492e502 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sat, 30 Sep 2017 18:08:03 +0900
Subject: [PATCH 04/10] future/QC: Use non-breakable hyphen for axis names

The shortcut "\=/" is provided by the "extdash" package,
as described in the style guide.

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
 future/QC.tex | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/future/QC.tex b/future/QC.tex
index 437a4d4..daf0086 100644
--- a/future/QC.tex
+++ b/future/QC.tex
@@ -308,7 +308,7 @@ A qubit is said to:
 	Figure~\ref{fig:future:Qubit as Bloch Sphere}.
 \item	Collapse to a zero ($\ket{0}$) or a one ($\ket{1}$) if measured,
 	with probability being a function of the relative distance from
-	$\ket{0}$ and $\ket{1}$, but projected onto the Z-axis.
+	$\ket{0}$ and $\ket{1}$, but projected onto the Z\=/axis.
 	Thus, a qubit on the equator of the Bloch sphere has a 50\,\%
 	probability of being measured as a one or as a zero, while
 	a qubit on the 45\textdegree-north latitude would have
@@ -332,37 +332,37 @@ are as follows:
 	Rotate 180\degree{} ($\pi$ radians) about the Bloch-sphere
 	X-Z axis, that is, about the 45\degree{} line on the
 	X-Z plane.  This rotates $\ket{0}$ to the point at which the
-	positive X-axis intersects the Bloch sphere, and rotates $\ket{1}$
-	to the point at which the negative X-axis intersects the Bloch
+	positive X\=/axis intersects the Bloch sphere, and rotates $\ket{1}$
+	to the point at which the negative X\=/axis intersects the Bloch
 	sphere.
 	Either way, we get a qubit that is 50\,\% one and 50\,\% zero.
 \item[\qop{S}\,:]
 	Rotate 90\degree{} ($\frac{\pi}{2}$ radians) about the
-	Bloch-sphere Z-axis, which has no effect on qubits in the
+	Bloch-sphere Z\=/axis, which has no effect on qubits in the
 	$\ket{0}$ or $\ket{1}$ states.
 \item[\qop{S}$^{\bm{\dagger}}$:]
 	Rotate $-90\degree$ ($-\frac{\pi}{2}$ radians) about the
-	Bloch-sphere Z-axis, which has no effect on qubits in the
+	Bloch-sphere Z\=/axis, which has no effect on qubits in the
 	$\ket{0}$ or $\ket{1}$ states.
 	This operator is the inverse of \qop{S}.
 \item[\qop{T}\,:]
 	Rotate 45\degree{} ($\frac{\pi}{4}$ radians) about the
-	Bloch-sphere Z-axis, which has no effect on qubits in the
+	Bloch-sphere Z\=/axis, which has no effect on qubits in the
 	$\ket{0}$ or $\ket{1}$ states.
 \item[\qop{T}$^{\bm{\dagger}}$:]
 	Rotate $-45\degree$ ($-\frac{\pi}{4}$ radians) about the
-	Bloch-sphere Z-axis, which has no effect on qubits in the
+	Bloch-sphere Z\=/axis, which has no effect on qubits in the
 	$\ket{0}$ or $\ket{1}$ states.
 	This operator is the inverse of \qop{T}.
 \item[\qop{X}\,:]
 	Rotate 180\degree{} ($\pi$ radians) about the Bloch-sphere
-	X-axis, which takes $\ket{0}$ to $\ket{1}$ and vice versa.
+	X\=/axis, which takes $\ket{0}$ to $\ket{1}$ and vice versa.
 \item[\qop{Y}\,:]
 	Rotate 180\degree{} ($\pi$ radians) about the Bloch-sphere
-	Y-axis, which also takes $\ket{0}$ to $\ket{1}$ and vice versa.
+	Y\=/axis, which also takes $\ket{0}$ to $\ket{1}$ and vice versa.
 \item[\qop{Z}\,:]
 	Rotate 180\degree{} ($\pi$ radians) about the Bloch-sphere
-	Z-axis, which has no effect on qubits in the $\ket{0}$ or
+	Z\=/axis, which has no effect on qubits in the $\ket{0}$ or
 	$\ket{1}$ states.
 \end{description}

-- 
2.7.4
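
The shortcut used in this patch can be tried in isolation with a
minimal sketch such as the one below. It assumes that extdash is
loaded with its "shortcuts" option, which is what enables the "\=/"
and "\-/" forms; the document class and the sample sentences are
illustrative, not perfbook source.

```latex
\documentclass{article}
% The "shortcuts" option enables the short forms \=/ and \-/.
\usepackage[shortcuts]{extdash}
\begin{document}
% \=/ typesets a hyphen after which no line break is allowed,
% so "Z-axis" is never split across two lines.
Rotate 90 degrees about the Bloch-sphere Z\=/axis.

% \-/ typesets a hyphen that still permits a line break and keeps
% the rest of the compound word hyphenatable.
The plot uses 90\=/percent\-/confidence error bars.
\end{document}
```

A plain "-" in "Z-axis" would let TeX break the line right after the
hyphen, leaving a lone "Z-" at the end of a line.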




* [PATCH 05/10] treewide: Insert narrow space in front of percent symbol
  2017-10-05 15:47 [PATCH 00/10] Tweaks to follow guidelines in style guide Akira Yokosawa
                   ` (3 preceding siblings ...)
  2017-10-05 15:52 ` [PATCH 04/10] future/QC: Use non-breakable hyphen for axis names Akira Yokosawa
@ 2017-10-05 15:53 ` Akira Yokosawa
  2017-10-05 15:54 ` [PATCH 06/10] treewide: Use \Power{} macro for POWER CPU family Akira Yokosawa
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Akira Yokosawa @ 2017-10-05 15:53 UTC
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

From ffbf7756c160eaa59e8a93c1bdd09c1497dfe449 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 1 Oct 2017 12:17:43 +0900
Subject: [PATCH 05/10] treewide: Insert narrow space in front of percent symbol

In SMPdesign/beyond.tex, there are two cases where "percent" is
spelled out in compound words.

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
 SMPdesign/SMPdesign.tex |  2 +-
 SMPdesign/beyond.tex    | 14 +++++++-------
 advsync/advsync.tex     |  2 +-
 count/count.tex         |  4 ++--
 cpu/hwfreelunch.tex     |  4 ++--
 defer/rcuusage.tex      |  4 ++--
 formal/dyntickrcu.tex   |  2 +-
 formal/spinhint.tex     |  2 +-
 future/htm.tex          |  2 +-
 future/tm.tex           |  4 ++--
 intro/intro.tex         |  6 +++---
 rt/rt.tex               |  8 ++++----
 12 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/SMPdesign/SMPdesign.tex b/SMPdesign/SMPdesign.tex
index 1936d27..81219cb 100644
--- a/SMPdesign/SMPdesign.tex
+++ b/SMPdesign/SMPdesign.tex
@@ -1186,7 +1186,7 @@ which fortunately is usually quite easy to do in actual
 practice~\cite{McKenney01e}, especially given today's large memories.
 For example, in most systems, it is quite reasonable to set
 \co{TARGET_POOL_SIZE} to 100, in which case allocations and frees
-are guaranteed to be confined to per-thread pools at least 99\% of
+are guaranteed to be confined to per-thread pools at least 99\,\% of
 the time.

 As can be seen from the figure, the situations where the common-case
diff --git a/SMPdesign/beyond.tex b/SMPdesign/beyond.tex
index 7ba351e..1fb2a6b 100644
--- a/SMPdesign/beyond.tex
+++ b/SMPdesign/beyond.tex
@@ -401,8 +401,8 @@ large algorithmic superlinear speedups.
 \end{figure}

 Further investigation showed that
-PART sometimes visited fewer than 2\% of the maze's cells,
-while SEQ and PWQ never visited fewer than about 9\%.
+PART sometimes visited fewer than 2\,\% of the maze's cells,
+while SEQ and PWQ never visited fewer than about 9\,\%.
 The reason for this difference is shown by
 Figure~\ref{fig:SMPdesign:Reason for Small Visit Percentages}.
 If the thread traversing the solution from the upper left reaches
@@ -473,11 +473,11 @@ optimizations are quite attractive.
 Cache alignment and padding often improves performance by reducing
 false sharing.
 However, for these maze-solution algorithms, aligning and padding the
-maze-cell array \emph{degrades} performance by up to 42\% for 1000x1000 mazes.
+maze-cell array \emph{degrades} performance by up to 42\,\% for 1000x1000 mazes.
 Cache locality is more important than avoiding
 false sharing, especially for large mazes.
 For smaller 20-by-20 or 50-by-50 mazes, aligning and padding can produce
-up to a 40\% performance improvement for PART,
+up to a 40\,\% performance improvement for PART,
 but for these small sizes, SEQ performs better anyway because there
 is insufficient time for PART to make up for the overhead of
 thread creation and destruction.
@@ -508,7 +508,7 @@ context-switch overhead and visit percentage.
 As can be seen in
 Figure~\ref{fig:SMPdesign:Partitioned Coroutines},
 this coroutine algorithm (COPART) is quite effective, with the performance
-on one thread being within about 30\% of PART on two threads
+on one thread being within about 30\,\% of PART on two threads
 (\path{maze_2seq.c}).

 \subsection{Performance Comparison II}
@@ -532,7 +532,7 @@ Figures~\ref{fig:SMPdesign:Varying Maze Size vs. SEQ}
 and~\ref{fig:SMPdesign:Varying Maze Size vs. COPART}
 show the effects of varying maze size, comparing both PWQ and PART
 running on two threads
-against either SEQ or COPART, respectively, with 90\%-confidence
+against either SEQ or COPART, respectively, with 90\=/percent\-/confidence
 error bars.
 PART shows superlinear scalability against SEQ and modest scalability
 against COPART for 100-by-100 and larger mazes.
@@ -565,7 +565,7 @@ a thread is connected to both beginning and end).
 PWQ performs quite poorly, but
 PART hits breakeven at two threads and again at five threads, achieving
 modest speedups beyond five threads.
-Theoretical energy efficiency breakeven is within the 90\% confidence
+Theoretical energy efficiency breakeven is within the 90\=/percent\-/confidence
 interval for seven and eight threads.
 The reasons for the peak at two threads are (1) the lower complexity
 of termination detection in the two-thread case and (2) the fact that
diff --git a/advsync/advsync.tex b/advsync/advsync.tex
index 98e6986..adf1dc9 100644
--- a/advsync/advsync.tex
+++ b/advsync/advsync.tex
@@ -85,7 +85,7 @@ basis of real-time programming:
 	bound.
 \item	Real-time forward-progress guarantees are sometimes
 	probabilistic, as in the soft-real-time guarantee that
-	``at least 99.9\% of the time, scheduling latency must
+	``at least 99.9\,\% of the time, scheduling latency must
 	be less than 100 microseconds.''
 	In contrast, NBS's forward-progress
 	guarantees have traditionally been unconditional.
diff --git a/count/count.tex b/count/count.tex
index f1645ee..73b6866 100644
--- a/count/count.tex
+++ b/count/count.tex
@@ -55,7 +55,7 @@ counting.
 	whatever ``true value'' might mean in this context.
 	However, the value read out should maintain roughly the same
 	absolute error over time.
-	For example, a 1\% error might be just fine when the count
+	For example, a 1\,\% error might be just fine when the count
 	is on the order of a million or so, but might be absolutely
 	unacceptable once the count reaches a trillion.
 	See Section~\ref{sec:count:Statistical Counters}.
@@ -204,7 +204,7 @@ On my dual-core laptop, a short run invoked \co{inc_count()}
 100,014,000 times, but the final value of the counter was only
 52,909,118.
 Although approximate values do have their place in computing,
-accuracies far greater than 50\% are almost always necessary.
+accuracies far greater than 50\,\% are almost always necessary.

 \QuickQuiz{}
 	But doesn't the \co{++} operator produce an x86 add-to-memory
diff --git a/cpu/hwfreelunch.tex b/cpu/hwfreelunch.tex
index b449ba2..152f691 100644
--- a/cpu/hwfreelunch.tex
+++ b/cpu/hwfreelunch.tex
@@ -193,13 +193,13 @@ excellent bragging rights, if nothing else!
 Although the speed of light would be a hard limit, the fact is that
 semiconductor devices are limited by the speed of electricity rather
 than that of light, given that electric waves in semiconductor materials
-move at between 3\% and 30\% of the speed of light in a vacuum.
+move at between 3\,\% and 30\,\% of the speed of light in a vacuum.
 The use of copper connections on silicon devices is one way to increase
 the speed of electricity, and it is quite possible that additional
 advances will push closer still to the actual speed of light.
 In addition, there have been some experiments with tiny optical fibers
 as interconnects within and between chips, based on the fact that
-the speed of light in glass is more than 60\% of the speed of light
+the speed of light in glass is more than 60\,\% of the speed of light
 in a vacuum.
 One obstacle to such optical fibers is the inefficiency conversion
 between electricity and light and vice versa, resulting in both
diff --git a/defer/rcuusage.tex b/defer/rcuusage.tex
index af4faff..74be9fc 100644
--- a/defer/rcuusage.tex
+++ b/defer/rcuusage.tex
@@ -193,7 +193,7 @@ ideal synchronization-free workload, as desired.
 	each search is taking on average about 13~nanoseconds,
 	which is short enough for small differences in code
 	generation to make their presence felt.
-	The difference ranges from about 1.5\% to about 11.1\%, which is
+	The difference ranges from about 1.5\,\% to about 11.1\,\%, which is
 	quite small when you consider that the RCU QSBR code can handle
 	concurrent updates and the ``ideal'' code cannot.

@@ -775,7 +775,7 @@ again showing data taken on a 16-CPU 3\,GHz Intel x86 system.
 	Most likely NUMA effects.
 	However, there is substantial variance in the values measured for the
 	refcnt line, as can be seen by the error bars.
-	In fact, standard deviations range in excess of 10\% of measured
+	In fact, standard deviations range in excess of 10\,\% of measured
 	values in some cases.
 	The dip in overhead therefore might well be a statistical aberration.
 } \QuickQuizEnd
diff --git a/formal/dyntickrcu.tex b/formal/dyntickrcu.tex
index 80fa3e7..ec3c78c 100644
--- a/formal/dyntickrcu.tex
+++ b/formal/dyntickrcu.tex
@@ -1748,7 +1748,7 @@ states, passing without errors.
 	\end{quote}

 	This means that any attempt to optimize the production of code should
-	place at least 66\% of its emphasis on optimizing the debugging process,
+	place at least 66\,\% of its emphasis on optimizing the debugging process,
 	even at the expense of increasing the time and effort spent coding.
 	Incremental coding and testing is one way to optimize the debugging
 	process, at the expense of some increase in coding effort.
diff --git a/formal/spinhint.tex b/formal/spinhint.tex
index a40d2c3..27df639 100644
--- a/formal/spinhint.tex
+++ b/formal/spinhint.tex
@@ -416,7 +416,7 @@ Given a source file \path{qrcu.spin}, one can use the following commands:
 	run \co{top} in one window and \co{./pan} in another.  Keep the
 	focus on the \co{./pan} window so that you can quickly kill
 	execution if need be.  As soon as CPU time drops much below
-	100\%, kill \co{./pan}.  If you have removed focus from the
+	100\,\%, kill \co{./pan}.  If you have removed focus from the
 	window running \co{./pan}, you may wait a long time for the
 	windowing system to grab enough memory to do anything for
 	you.
diff --git a/future/htm.tex b/future/htm.tex
index e26ee2a..0c3801d 100644
--- a/future/htm.tex
+++ b/future/htm.tex
@@ -1185,7 +1185,7 @@ by Siakavaras et al.~\cite{Siakavaras2017CombiningHA},
 is to use RCU for read-only traversals and HTM
 only for the actual updates themselves.
 This combination outperformed other transactional-memory techniques by
-up to 220\%, a speedup similar to that observed by
+up to 220\,\%, a speedup similar to that observed by
 Howard and Walpole~\cite{PhilHoward2011RCUTMRBTree}
 when they combined RCU with STM.
 In both cases, the weak atomicity is implemented in software rather than
diff --git a/future/tm.tex b/future/tm.tex
index ec5373d..8420331 100644
--- a/future/tm.tex
+++ b/future/tm.tex
@@ -711,8 +711,8 @@ representing the lock as part of the transaction, and everything works
 out perfectly.
 In practice, a number of non-obvious complications~\cite{Volos2008TRANSACT}
 can arise, depending on implementation details of the TM system.
-These complications can be resolved, but at the cost of a 45\% increase in
-overhead for locks acquired outside of transactions and a 300\% increase
+These complications can be resolved, but at the cost of a 45\,\% increase in
+overhead for locks acquired outside of transactions and a 300\,\% increase
 in overhead for locks acquired within transactions.
 Although these overheads might be acceptable for transactional
 programs containing small amounts of locking, they are often completely
diff --git a/intro/intro.tex b/intro/intro.tex
index ca991bd..8bed518 100644
--- a/intro/intro.tex
+++ b/intro/intro.tex
@@ -414,7 +414,7 @@ To see this, consider that the price of early computers was tens
 of millions of dollars at
 a time when engineering salaries were but a few thousand dollars a year.
 If dedicating a team of ten engineers to such a machine would improve
-its performance, even by only 10\%, then their salaries
+its performance, even by only 10\,\%, then their salaries
 would be repaid many times over.

 One such machine was the CSIRAC, the oldest still-intact stored-program
@@ -863,11 +863,11 @@ been extremely narrowly focused, and hence unable to demonstrate any
 general results.
 Furthermore, given that the normal range of programmer productivity
 spans more than an order of magnitude, it is unrealistic to expect
-an affordable study to be capable of detecting (say) a 10\% difference
+an affordable study to be capable of detecting (say) a 10\,\% difference
 in productivity.
 Although the multiple-order-of-magnitude differences that such studies
 \emph{can} reliably detect are extremely valuable, the most impressive
-improvements tend to be based on a long series of 10\% improvements.
+improvements tend to be based on a long series of 10\,\% improvements.

 We must therefore take a different approach.

diff --git a/rt/rt.tex b/rt/rt.tex
index 2f5d4fe..21e7117 100644
--- a/rt/rt.tex
+++ b/rt/rt.tex
@@ -48,7 +48,7 @@ are clearly required.
 We might therefore say that a given soft real-time application must meet
 its response-time requirements at least some fraction of the time, for
 example, we might say that it must execute in less than 20 microseconds
-99.9\% of the time.
+99.9\,\% of the time.

 This of course raises the question of what is to be done when the application
 fails to meet its response-time requirements.
@@ -267,7 +267,7 @@ or even avoiding interrupts altogether in favor of polling.

 Overloading can also degrade response times due to queueing effects,
 so it is not unusual for real-time systems to overprovision CPU bandwidth,
-so that a running system has (say) 80\% idle time.
+so that a running system has (say) 80\,\% idle time.
 This approach also applies to storage and networking devices.
 In some cases, separate storage and networking hardware might be reserved
 for the sole use of high-priority portions of the real-time application.
@@ -351,7 +351,7 @@ on the hardware and software implementing those operations.
 For each such operation, these constraints might include a maximum
 response time (and possibly also a minimum response time) and a
 probability of meeting that response time.
-A probability of 100\% indicates that the corresponding operation
+A probability of 100\,\% indicates that the corresponding operation
 must provide hard real-time service.

 In some cases, both the response times and the required probabilities of
@@ -1583,7 +1583,7 @@ These constraints include:
 	latencies are provided only to the highest-priority threads.
 \item	Sufficient bandwidth to support the workload.
 	An implementation rule supporting this constraint might be
-	``There will be at least 50\% idle time on all CPUs
+	``There will be at least 50\,\% idle time on all CPUs
 	during normal operation,''
 	or, more formally, ``The offered load will be sufficiently low
 	to allow the workload to be schedulable at all times.''
-- 
2.7.4




* [PATCH 06/10] treewide: Use \Power{} macro for POWER CPU family
  2017-10-05 15:47 [PATCH 00/10] Tweaks to follow guidelines in style guide Akira Yokosawa
                   ` (4 preceding siblings ...)
  2017-10-05 15:53 ` [PATCH 05/10] treewide: Insert narrow space in front of percent symbol Akira Yokosawa
@ 2017-10-05 15:54 ` Akira Yokosawa
  2017-10-05 15:55 ` [PATCH 07/10] treewide: Call GNU C compiler as "GCC" Akira Yokosawa
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Akira Yokosawa @ 2017-10-05 15:54 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

From c7255fa8b6fc7835c0eb6ab524aed3349cea1dca Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 1 Oct 2017 12:40:18 +0900
Subject: [PATCH 06/10] treewide: Use \Power{} macro for POWER CPU family

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
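For illustration, a minimal standalone sketch of how the macro added to
perfbook.tex by this patch expands. The definition is copied from the
perfbook.tex hunk; the sample sentences are shortened from the hunks
below.

```latex
\documentclass{article}
% Definition as added to perfbook.tex by this patch.
\newcommand{\Power}[1]{POWER#1}
\begin{document}
% Empty argument: names the CPU family as a whole.
IBM introduced simultaneous multi-threading into its \Power{} family.

% Non-empty argument: names a specific generation (POWER5).
Read-side overhead is about 100~nanoseconds on a single \Power{5} CPU.

% The closing brace ends the macro call, so a possessive can follow directly.
This has an effect similar to that of \Power{}'s cumulativity.
\end{document}
```

Because the argument is mandatory, the empty braces in "\Power{}" are
required whenever no generation number follows.
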
 appendix/toyrcu/toyrcu.tex    | 26 +++++++++++++-------------
 count/count.tex               |  6 +++---
 intro/intro.tex               |  2 +-
 memorder/memorder.tex         | 30 +++++++++++++++---------------
 perfbook.tex                  |  1 +
 toolsoftrade/toolsoftrade.tex |  4 ++--
 6 files changed, 35 insertions(+), 34 deletions(-)

diff --git a/appendix/toyrcu/toyrcu.tex b/appendix/toyrcu/toyrcu.tex
index db45fad..2c65f74 100644
--- a/appendix/toyrcu/toyrcu.tex
+++ b/appendix/toyrcu/toyrcu.tex
@@ -73,7 +73,7 @@ Of course, only one RCU reader may be in its read-side critical section
 at a time, which almost entirely defeats the purpose of RCU.
 In addition, the lock operations in \co{rcu_read_lock()} and
 \co{rcu_read_unlock()} are extremely heavyweight,
-with read-side overhead ranging from about 100~nanoseconds on a single Power5
+with read-side overhead ranging from about 100~nanoseconds on a single \Power{5}
 CPU up to more than 17~\emph{microseconds} on a 64-CPU system.
 Worse yet,
 these same lock operations permit \co{rcu_read_lock()}
@@ -216,7 +216,7 @@ with a single global lock.
 Furthermore, the read-side overhead, though high at roughly 140 nanoseconds,
 remains at about 140 nanoseconds regardless of the number of CPUs.
 However, the update-side overhead ranges from about 600 nanoseconds
-on a single Power5 CPU
+on a single \Power{5} CPU
 up to more than 100 \emph{microseconds} on 64 CPUs.

 \QuickQuiz{}
@@ -368,7 +368,7 @@ However, this implementations still has some serious shortcomings.
 First, the atomic operations in \co{rcu_read_lock()} and
 \co{rcu_read_unlock()} are still quite  heavyweight,
 with read-side overhead ranging from about 100~nanoseconds on
-a single Power5 CPU up to almost 40~\emph{microseconds}
+a single \Power{5} CPU up to almost 40~\emph{microseconds}
 on a 64-CPU system.
 This means that the RCU read-side critical sections
 have to be extremely long in order to get any real
@@ -718,9 +718,9 @@ In fact, they are more complex than those
 of the single-counter variant shown in
 Figure~\ref{fig:app:toyrcu:RCU Implementation Using Single Global Reference Counter},
 with the read-side primitives consuming about 150~nanoseconds on a single
-Power5 CPU and almost 40~\emph{microseconds} on a 64-CPU system.
+\Power{5} CPU and almost 40~\emph{microseconds} on a 64-CPU system.
 The update-side \co{synchronize_rcu()} primitive is more costly as
-well, ranging from about 200~nanoseconds on a single Power5 CPU to
+well, ranging from about 200~nanoseconds on a single \Power{5} CPU to
 more than 40~\emph{microseconds} on a 64-CPU system.
 This means that the RCU read-side critical sections
 have to be extremely long in order to get any real
@@ -963,9 +963,9 @@ environments.

 That said, the read-side primitives scale very nicely, requiring about
 115~nanoseconds regardless of whether running on a single-CPU or a 64-CPU
-Power5 system.
+\Power{5} system.
 As noted above, the \co{synchronize_rcu()} primitive does not scale,
-ranging in overhead from almost a microsecond on a single Power5 CPU
+ranging in overhead from almost a microsecond on a single \Power{5} CPU
 up to almost 200~microseconds on a 64-CPU system.
 This implementation could conceivably form the basis for a
 production-quality user-level RCU implementation.
@@ -1340,9 +1340,9 @@ destruction will not be reordered into the preceding loop.

 This approach achieves much better read-side performance, incurring
 roughly 63~nanoseconds of overhead regardless of the number of
-Power5 CPUs.
+\Power{5} CPUs.
 Updates incur more overhead, ranging from about 500~nanoseconds on
-a single Power5 CPU to more than 100~\emph{microseconds} on 64
+a single \Power{5} CPU to more than 100~\emph{microseconds} on 64
 such CPUs.

 \QuickQuiz{}
@@ -1542,9 +1542,9 @@ This approach achieves read-side performance almost equal to that
 shown in
 Section~\ref{sec:app:toyrcu:RCU Based on Free-Running Counter}, incurring
 roughly 65~nanoseconds of overhead regardless of the number of
-Power5 CPUs.
+\Power{5} CPUs.
 Updates again incur more overhead, ranging from about 600~nanoseconds on
-a single Power5 CPU to more than 100~\emph{microseconds} on 64
+a single \Power{5} CPU to more than 100~\emph{microseconds} on 64
 such CPUs.

 \QuickQuiz{}
@@ -1866,11 +1866,11 @@ This implementation has blazingly fast read-side primitives, with
 an \co{rcu_read_lock()}-\co{rcu_read_unlock()} round trip incurring
 an overhead of roughly 50~\emph{picoseconds}.
 The \co{synchronize_rcu()} overhead ranges from about 600~nanoseconds
-on a single-CPU Power5 system up to more than 100~microseconds on
+on a single-CPU \Power{5} system up to more than 100~microseconds on
 a 64-CPU system.

 \QuickQuiz{}
-	To be sure, the clock frequencies of Power
+	To be sure, the clock frequencies of \Power{}
 	systems in 2008 were quite high, but even a 5\,GHz clock
 	frequency is insufficient to allow
 	loops to be executed in 50~picoseconds!
diff --git a/count/count.tex b/count/count.tex
index 73b6866..a38aba1 100644
--- a/count/count.tex
+++ b/count/count.tex
@@ -3330,7 +3330,7 @@ will expand on these lessons.
 	\path{count_end_rcu.c} & \ref{sec:together:RCU and Per-Thread-Variable-Based Statistical Counters} &
 		5.7 ns & 354 ns & 501 ns \\
 \end{tabular}
-\caption{Statistical Counter Performance on Power-6}
+\caption{Statistical Counter Performance on \Power{6}}
 \label{tab:count:Statistical Counter Performance on Power-6}
 \end{table*}

@@ -3410,14 +3410,14 @@ courtesy of eventual consistency.
 	\path{count_lim_sig.c} & \ref{sec:count:Signal-Theft Limit Counter Implementation} &
 		Y & 10.2 ns & 370 ns & 54,000 ns \\
 \end{tabular}
-\caption{Limit Counter Performance on Power-6}
+\caption{Limit Counter Performance on \Power{6}}
 \label{tab:count:Limit Counter Performance on Power-6}
 \end{table*}

 Figure~\ref{tab:count:Limit Counter Performance on Power-6}
 shows the performance of the parallel limit-counting algorithms.
 Exact enforcement of the limits incurs a substantial performance
-penalty, although on this 4.7\,GHz Power-6 system that penalty can be reduced
+penalty, although on this 4.7\,GHz \Power{6} system that penalty can be reduced
 by substituting signals for atomic operations.
 All of these implementations suffer from read-side lock contention
 in the face of concurrent readers.
diff --git a/intro/intro.tex b/intro/intro.tex
index 8bed518..293a02f 100644
--- a/intro/intro.tex
+++ b/intro/intro.tex
@@ -77,7 +77,7 @@ that of a bicycle, courtesy of Moore's Law.
 Papers calling out the advantages of multicore CPUs were published
 as early as 1996~\cite{Olukotun96}.
 IBM introduced simultaneous multi-threading
-into its high-end POWER family in 2000, and multicore in 2001.
+into its high-end \Power{} family in 2000, and multicore in 2001.
 Intel introduced hyperthreading into its commodity Pentium line in
 November 2000, and both AMD and Intel introduced
 dual-core CPUs in 2005.
diff --git a/memorder/memorder.tex b/memorder/memorder.tex
index 7dc3fb4..944c17a 100644
--- a/memorder/memorder.tex
+++ b/memorder/memorder.tex
@@ -314,7 +314,7 @@ synchronization primitives (such as locking and RCU)
 that are responsible for maintaining the illusion of ordering through use of
 \emph{memory barriers} (for example, \co{smp_mb()} in the Linux kernel).
 These memory barriers can be explicit instructions, as they are on
-ARM, POWER, Itanium, and Alpha, or they can be implied by other instructions,
+ARM, \Power{}, Itanium, and Alpha, or they can be implied by other instructions,
 as they often are on x86.
 Since these standard synchronization primitives preserve the illusion of
 ordering, your path of least resistance is to simply use these primitives,
@@ -827,7 +827,7 @@ if the shared variable had changed before entry into the loop.
 This allows us to plot each CPU's view of the value of \co{state.variable}
 over a 532-nanosecond time period, as shown in
 Figure~\ref{fig:memorder:A Variable With Multiple Simultaneous Values}.
-This data was collected in 2006 on 1.5\,GHz POWER5 system with 8 cores,
+This data was collected in 2006 on 1.5\,GHz \Power{5} system with 8 cores,
 each containing a pair of hardware threads.
 CPUs~1, 2, 3, and~4 recorded the values, while CPU~0 controlled the test.
 The timebase counter period was about 5.32\,ns, sufficiently fine-grained
@@ -2043,7 +2043,7 @@ communicated to \co{P1()} long before it was communicated to \co{P2()}.
 \QuickQuizAnswer{
 	You need to face the fact that it really can trigger.
 	Akira Yokosawa used the \co{litmus7} tool to run this litmus test
-	on a Power8 system.
+	on a \Power{8} system.
 	Out of 1,000,000,000 runs, 4 triggered the \co{exists} clause.
 	Thus, triggering the \co{exists} clause is not merely a one-in-a-million
 	occurrence, but rather a one-in-a-hundred-million occurrence.
@@ -3707,7 +3707,7 @@ dependencies.
 		\rotatebox{90}{PA-RISC CPUs}
 	  \end{picture}
 	& \begin{picture}(6,60)(0,0)
-		\rotatebox{90}{POWER}
+		\rotatebox{90}{\Power{}}
 	  \end{picture}
 	& \begin{picture}(6,60)(0,0)
 		\rotatebox{90}{SPARC TSO}
@@ -4134,7 +4134,7 @@ For more on Alpha, see its reference manual~\cite{ALPHA2002}.

 The ARM family of CPUs is extremely popular in embedded applications,
 particularly for power-constrained applications such as cellphones.
-Its memory model is similar to that of Power
+Its memory model is similar to that of \Power{}
 (see Section~\ref{sec:memorder:POWER / PowerPC}, but ARM uses a
 different set of memory-barrier instructions~\cite{ARMv7A:2010}:

@@ -4144,7 +4144,7 @@ different set of memory-barrier instructions~\cite{ARMv7A:2010}:
 	subsequent operations of the same type.
 	The ``type'' of operations can be all operations or can be
 	restricted to only writes (similar to the Alpha \co{wmb}
-	and the POWER \co{eieio} instructions).
+	and the \Power{} \co{eieio} instructions).
 	In addition, ARM allows cache coherence to have one of three
 	scopes: single processor, a subset of the processors
 	(``inner'') and global (``outer'').
@@ -4168,7 +4168,7 @@ None of these instructions exactly match the semantics of Linux's
 \co{DMB}.
 The \co{DMB} and \co{DSB} instructions have a recursive definition
 of accesses ordered before and after the barrier, which has an effect
-similar to that of POWER's cumulativity.
+similar to that of \Power{}'s cumulativity.

 ARM also implements control dependencies, so that if a conditional
 branch depends on a load, then any store executed after that conditional
@@ -4292,7 +4292,7 @@ memory barriers.
 \subsection{MIPS}

 The MIPS memory model~\cite[Table 6.6]{MIPSvII-A-2015}
-appears to resemble that of ARM, Itanium, and Power,
+appears to resemble that of ARM, Itanium, and \Power{},
 being weakly ordered by default, but respecting dependencies.
 MIPS has a wide variety of memory-barrier instructions, but ties them
 not to hardware considerations, but rather to the use cases provided
@@ -4325,7 +4325,7 @@ in a manner similar to the ARM64 additions:

 Informal discussions with MIPS architects indicates that MIPS has a
 definition of transitivity or cumulativity similar to that of
-ARM and Power.
+ARM and \Power{}.
 However, it appears that different MIPS implementations can have
 different memory-ordering properties, so it is important to consult
 the documentation for the specific MIPS implementation you are using.
@@ -4339,10 +4339,10 @@ no code, however, they do use the gcc {\tt memory} attribute to disable
 compiler optimizations that would reorder code across the memory
 barrier.

-\subsection{POWER / PowerPC}
+\subsection{\Power{} / PowerPC}
 \label{sec:memorder:POWER / PowerPC}

-The POWER and PowerPC\textsuperscript{\textregistered}
+The \Power{} and PowerPC\textsuperscript{\textregistered}
 CPU families have a wide variety of memory-barrier
 instructions~\cite{PowerPC94,MichaelLyons05a}:
 \begin{description}
@@ -4388,7 +4388,7 @@ The \co{smp_mb()} instruction is also defined to be the {\tt sync}
 instruction, but both \co{smp_rmb()} and \co{rmb()} are defined to
 be the lighter-weight {\tt lwsync} instruction.

-Power features ``cumulativity'', which can be used to obtain
+\Power{} features ``cumulativity'', which can be used to obtain
 transitivity.
 When used properly, any code seeing the results of an earlier
 code fragment will also see the accesses that this earlier code
@@ -4396,11 +4396,11 @@ fragment itself saw.
 Much more detail is available from
 McKenney and Silvera~\cite{PaulEMcKenneyN2745r2009}.

-Power respects control dependencies in much the same way that ARM
-does, with the exception that the Power \co{isync} instruction
+\Power{} respects control dependencies in much the same way that ARM
+does, with the exception that the \Power{} \co{isync} instruction
 is substituted for the ARM \co{ISB} instruction.

-Many members of the POWER architecture have incoherent instruction
+Many members of the \Power{} architecture have incoherent instruction
 caches, so that a store to memory will not necessarily be reflected
 in the instruction cache.
 Thankfully, few people write self-modifying code these days, but JITs
diff --git a/perfbook.tex b/perfbook.tex
index da9cfa8..cc4f4b0 100644
--- a/perfbook.tex
+++ b/perfbook.tex
@@ -138,6 +138,7 @@
 \newcommand{\qop}[1]{{\sffamily #1}} % QC operator such as H, T, S, etc.

 \DeclareRobustCommand{\euler}{\ensuremath{\mathrm{e}}}
+\newcommand{\Power}[1]{POWER#1}

 \newcommand{\Epigraph}[2]{\epigraphhead[65]{\rmfamily\epigraph{#1}{#2}}}

diff --git a/toolsoftrade/toolsoftrade.tex b/toolsoftrade/toolsoftrade.tex
index 9cf3312..97a37d3 100644
--- a/toolsoftrade/toolsoftrade.tex
+++ b/toolsoftrade/toolsoftrade.tex
@@ -1038,7 +1038,7 @@ Line~39 moves the lock-acquisition count to this thread's element of the
 \end{figure}

 Figure~\ref{fig:toolsoftrade:Reader-Writer Lock Scalability}
-shows the results of running this test on a 64-core Power-5 system
+shows the results of running this test on a 64-core \Power{5} system
 with two hardware threads per core for a total of 128 software-visible
 CPUs.
 The \co{thinktime} parameter was zero for all these tests, and the
@@ -1137,7 +1137,7 @@ This situation will only get worse as you add CPUs.
 } \QuickQuizEnd

 \QuickQuiz{}
-	Power-5 is several years old, and new hardware should
+	\Power{5} is several years old, and new hardware should
 	be faster.
 	So why should anyone worry about reader-writer locks being slow?
 \QuickQuizAnswer{
-- 
2.7.4




* [PATCH 07/10] treewide: Call GNU C compiler as "GCC"
  2017-10-05 15:47 [PATCH 00/10] Tweaks to follow guidelines in style guide Akira Yokosawa
                   ` (5 preceding siblings ...)
  2017-10-05 15:54 ` [PATCH 06/10] treewide: Use \Power{} macro for POWER CPU family Akira Yokosawa
@ 2017-10-05 15:55 ` Akira Yokosawa
  2017-10-05 15:56 ` [PATCH 08/10] treewide: Use "IRQ" instead of "irq" used as abbreviation Akira Yokosawa
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Akira Yokosawa @ 2017-10-05 15:55 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

From 051dc90e73bbd57412c054f482d6ad401f3b1228 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 1 Oct 2017 16:29:14 +0900
Subject: [PATCH 07/10] treewide: Call GNU C compiler as "GCC"

Exceptions to simple substitution:

   The gcc compiler -> The GNU C compiler
   the gcc xxxx facility -> GCC's xxxx facility
   gcc extensions -> GNU extensions

"GNU C" and "GCC" are defined as macros "\GNUC" and "\GCC", respectively.

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
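For illustration, a minimal standalone sketch of how the two macros are
used in running text. The definitions shown here are assumptions made
for this sketch only; the perfbook.tex hunk of this patch is
authoritative.

```latex
\documentclass{article}
% Assumed definitions for this sketch; see the perfbook.tex hunk
% of this patch for the actual ones.
\newcommand{\GNUC}{GNU~C}
\newcommand{\GCC}{GCC}
\begin{document}
% TeX discards spaces after a control word, so "\ " restores the space.
And indeed, \GCC\ often chooses to load the value to a register.

% No control space is needed when punctuation follows the macro.
One alternative uses \GCC's \texttt{\_\_thread} facility.

The \GNUC\ compiler is kind enough to make use of this capability.
\end{document}
```

The "\GCC\ " form seen throughout the hunks below follows from that
space-gobbling rule; "\GCC{}" followed by a normal space would be an
equivalent spelling.
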
 count/count.tex               | 18 +++++++++---------
 datastruct/datastruct.tex     |  2 +-
 formal/formal.tex             |  2 +-
 memorder/memorder.tex         |  2 +-
 perfbook.tex                  |  3 +++
 toolsoftrade/toolsoftrade.tex | 20 ++++++++++----------
 6 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/count/count.tex b/count/count.tex
index a38aba1..a213558 100644
--- a/count/count.tex
+++ b/count/count.tex
@@ -213,7 +213,7 @@ accuracies far greater than 50\,\% are almost always necessary.
 \QuickQuizAnswer{
 	Although the \co{++} operator \emph{could} be atomic, there
 	is no requirement that it be so.
-	And indeed, \co{gcc} often
+	And indeed, \GCC\ often
 	chooses to load the value to a register, increment
 	the register, then store the value to memory, which is
 	decidedly non-atomic.
@@ -486,7 +486,7 @@ thread (presumably cache aligned and padded to avoid false sharing).
 	It can, and in this toy implementation, it does.
 	But it is not that hard to come up with an alternative
 	implementation that permits an arbitrary number of threads,
-	for example, using the \co{gcc} \co{__thread} facility,
+	for example, using \GCC's \co{__thread} facility,
 	as shown in
 	Section~\ref{sec:count:Per-Thread-Variable-Based Implementation}.
 } \QuickQuizEnd
@@ -535,11 +535,11 @@ using the \co{for_each_thread()} primitive to iterate over the list of
 currently running threads, and using the \co{per_thread()} primitive
 to fetch the specified thread's counter.
 Because the hardware can fetch and store a properly aligned \co{long}
-atomically, and because gcc is kind enough to make use of this capability,
+atomically, and because \GCC\ is kind enough to make use of this capability,
 normal loads suffice, and no special atomic instructions are required.

 \QuickQuiz{}
-	What other choice does gcc have, anyway???
+	What other choice does \GCC\ have, anyway???
 \QuickQuizAnswer{
 	According to the C standard, the effects of fetching a variable
 	that might be concurrently modified by some other thread are
@@ -548,7 +548,7 @@ normal loads suffice, and no special atomic instructions are required.
 	given that C must support (for example) eight-bit architectures
 	which are incapable of atomically loading a \co{long}.
 	An upcoming version of the C standard aims to fill this gap,
-	but until then, we depend on the kindness of the gcc developers.
+	but until then, we depend on the kindness of the \GCC\ developers.

 	Alternatively, use of volatile accesses such as those provided
 	by \co{ACCESS_ONCE()}~\cite{JonCorbet2012ACCESS:ONCE}
@@ -987,7 +987,7 @@ comes at the cost of the additional thread running \co{eventual()}.
 \label{fig:count:Per-Thread Statistical Counters}
 \end{figure}

-Fortunately, gcc provides an \co{__thread} storage class that provides
+Fortunately, \GCC\ provides an \co{__thread} storage class that provides
 per-thread storage.
 This can be used as shown in
 Figure~\ref{fig:count:Per-Thread Statistical Counters} (\path{count_end.c})
@@ -1005,13 +1005,13 @@ value of the counter and exiting threads.
 \QuickQuiz{}
 	Why do we need an explicit array to find the other threads'
 	counters?
-	Why doesn't gcc provide a \co{per_thread()} interface, similar
+	Why doesn't \GCC\ provide a \co{per_thread()} interface, similar
 	to the Linux kernel's \co{per_cpu()} primitive, to allow
 	threads to more easily access each others' per-thread variables?
 \QuickQuizAnswer{
 	Why indeed?

-	To be fair, gcc faces some challenges that the Linux kernel
+	To be fair, \GCC\ faces some challenges that the Linux kernel
 	gets to ignore.
 	When a user-level thread exits, its per-thread variables all
 	disappear, which complicates the problem of per-thread-variable
@@ -2862,7 +2862,7 @@ line~33 sends the thread a signal.
 \QuickQuiz{}
 	The code in
 	Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions},
-	works with gcc and POSIX.
+	works with \GCC\ and POSIX.
 	What would be required to make it also conform to the ISO C standard?
 \QuickQuizAnswer{
 	The \co{theft} variable must be of type \co{sig_atomic_t}
diff --git a/datastruct/datastruct.tex b/datastruct/datastruct.tex
index fad7668..8b8dd0a 100644
--- a/datastruct/datastruct.tex
+++ b/datastruct/datastruct.tex
@@ -2086,7 +2086,7 @@ performance and scalability.

 One way to solve this problem on systems with 64-byte cache line is shown in
 Figure~\ref{fig:datastruct:Alignment for 64-Byte Cache Lines}.
-Here a gcc \co{aligned} attribute is used to force the \co{->counter}
+Here \GCC's \co{aligned} attribute is used to force the \co{->counter}
 and the \co{ht_elem} structure into separate cache lines.
 This would allow CPUs to traverse the hash bucket list at full speed
 despite the frequent incrementing.
diff --git a/formal/formal.tex b/formal/formal.tex
index f629190..e4bf3bd 100644
--- a/formal/formal.tex
+++ b/formal/formal.tex
@@ -127,7 +127,7 @@ The larger overarching software construct is of course validated by testing.
 	Furthermore, although the L4 microkernel is a large software
 	artifact from the viewpoint of formal verification, it is tiny
 	compared to a great number of projects, including LLVM,
-	gcc, the Linux kernel, Hadoop, MongoDB, and a great many others.
+	\GCC, the Linux kernel, Hadoop, MongoDB, and a great many others.

 	Although formal verification is finally starting to show some
 	promise, including more-recent L4 verifications involving greater
diff --git a/memorder/memorder.tex b/memorder/memorder.tex
index 944c17a..ba54fee 100644
--- a/memorder/memorder.tex
+++ b/memorder/memorder.tex
@@ -4335,7 +4335,7 @@ the documentation for the specific MIPS implementation you are using.
 Although the PA-RISC architecture permits full reordering of loads and
 stores, actual CPUs run fully ordered~\cite{GerryKane96a}.
 This means that the Linux kernel's memory-ordering primitives generate
-no code, however, they do use the gcc {\tt memory} attribute to disable
+no code, however, they do use \GCC's {\tt memory} attribute to disable
 compiler optimizations that would reorder code across the memory
 barrier.

diff --git a/perfbook.tex b/perfbook.tex
index cc4f4b0..dc28079 100644
--- a/perfbook.tex
+++ b/perfbook.tex
@@ -139,6 +139,9 @@

 \DeclareRobustCommand{\euler}{\ensuremath{\mathrm{e}}}
 \newcommand{\Power}[1]{POWER#1}
+\newcommand{\GNUC}{GNU~C}
+\newcommand{\GCC}{GCC}
+%\newcommand{\GCC}{\co{gcc}} % For those who prefer "gcc"

 \newcommand{\Epigraph}[2]{\epigraphhead[65]{\rmfamily\epigraph{#1}{#2}}}

diff --git a/toolsoftrade/toolsoftrade.tex b/toolsoftrade/toolsoftrade.tex
index 97a37d3..bd43879 100644
--- a/toolsoftrade/toolsoftrade.tex
+++ b/toolsoftrade/toolsoftrade.tex
@@ -481,7 +481,7 @@ in the following section.
 	broken???
 \QuickQuizAnswer{
 	Ah, but the Linux kernel is written in a carefully selected
-	superset of the C language that includes special gcc
+	superset of the C language that includes special GNU
 	extensions, such as asms, that permit safe execution even
 	in presence of data races.
 	In addition, the Linux kernel does not run on a number of
@@ -1001,7 +1001,7 @@ rights to assume that the value of \co{goflag} would never change.
 \QuickQuiz{}
 	Would it ever be necessary to use \co{READ_ONCE()} when accessing
 	a per-thread variable, for example, a variable declared using
-	the \co{gcc} \co{__thread} storage class?
+	\GCC's \co{__thread} storage class?
 \QuickQuizAnswer{
 	It depends.
 	If the per-thread variable was accessed only from its thread,
@@ -1156,7 +1156,7 @@ cases, for example when the readers must do high-latency file or network I/O.
 There are alternatives, some of which will be presented in
 Chapters~\ref{chp:Counting} and \ref{chp:Deferred Processing}.

-\subsection{Atomic Operations (gcc Classic)}
+\subsection{Atomic Operations (\GCC\ Classic)}
 \label{sec:toolsoftrade:Atomic Operations (gcc Classic)}

 Given that
@@ -1175,7 +1175,7 @@ If a pair of threads concurrently execute \co{__sync_fetch_and_add()} on
 the same variable, the resulting value of the variable will include
 the result of both additions.

-The {\sf gcc} compiler offers a number of additional atomic operations,
+The \GNUC\ compiler offers a number of additional atomic operations,
 including \co{__sync_fetch_and_sub()},
 \co{__sync_fetch_and_or()},
 \co{__sync_fetch_and_and()},
@@ -1250,7 +1250,7 @@ avoids optimizing away a given memory read, in which case the
 Figure~\ref{fig:toolsoftrade:Demonstration of Exclusive Locks}.
 Similarly, the \co{WRITE_ONCE()} primitive may be used to prevent the
 compiler from optimizing away a given memory write.
-These last three primitives are not provided directly by gcc,
+These last three primitives are not provided directly by \GCC,
 but may be implemented straightforwardly as follows:

 \vspace{5pt}
@@ -1307,7 +1307,7 @@ is vaguely similar to the Linux kernel's ``\co{READ_ONCE()}''.\footnote{

 One restriction of the C11 atomics is that they apply only to special
 atomic types, which can be problematic.
-The gcc compiler therefore provides atomic intrinsics, including
+The \GNUC\ compiler therefore provides atomic intrinsics, including
 \co{__atomic_load()},
 \co{__atomic_load_n()},
 \co{__atomic_store()},
@@ -1339,14 +1339,14 @@ to key,
 variable corresponding to the specified key,
 and \co{pthread_getspecific()} to return that value.

-A number of compilers (including gcc) provide a \co{__thread} specifier
+A number of compilers (including \GCC) provide a \co{__thread} specifier
 that may be used in a variable definition to designate that variable
 as being per-thread.
 The name of the variable may then be used normally to access the
 value of the current thread's instance of that variable.
 Of course, \co{__thread} is much easier to use than the POSIX
 thead-specific data, and so \co{__thread} is usually preferred for
-code that is to be built only with gcc or other compilers supporting
+code that is to be built only with \GCC\ or other compilers supporting
 \co{__thread}.

 Fortunately, the C11 standard introduced a \co{_Thread_local} keyword
@@ -1365,7 +1365,7 @@ are supported.
 It is still quite common to find these operations implemented in
 assembly language, either for historical reasons or to obtain better
 performance in specialized circumstances.
-For example, the gcc \co{__sync_} family of primitives all provide full
+For example, \GCC's \co{__sync_} family of primitives all provide full
 memory-ordering semantics, which in the past motivated many developers
 to create their own implementations for situations where the full memory
 ordering semantics are not required.
@@ -1380,7 +1380,7 @@ code, the code samples in this book start with a call to \co{smp_init()},
 which initializes a mapping from \co{pthread_t} to consecutive integers.
 The userspace RCU library similarly requires a call to \co{rcu_init()}.
 Although these calls can be hidden in environments (such as that of
-gcc) that support constructors,
+\GCC) that support constructors,
 most of the RCU flavors supported by the userspace RCU library
 also require each thread invoke \co{rcu_register_thread()} upon thread
 creation and \co{rcu_unregister_thread()} before thread exit.
-- 
2.7.4




* [PATCH 08/10] treewide: Use "IRQ" instead of "irq" used as abbreviation
  2017-10-05 15:47 [PATCH 00/10] Tweaks to follow guidelines in style guide Akira Yokosawa
                   ` (6 preceding siblings ...)
  2017-10-05 15:55 ` [PATCH 07/10] treewide: Call GNU C compiler as "GCC" Akira Yokosawa
@ 2017-10-05 15:56 ` Akira Yokosawa
  2017-10-05 15:59 ` [PATCH 09/10] future/QC: Use upright glyph for math constant and descriptive suffix Akira Yokosawa
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Akira Yokosawa @ 2017-10-05 15:56 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

From 07346d744227b79263c044034a03fc56c032dd0b Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 1 Oct 2017 18:44:13 +0900
Subject: [PATCH 08/10] treewide: Use "IRQ" instead of "irq" used as abbreviation

"IRQ" is defined as a macro "\IRQ" in preamble for ease of customization.

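A sketch of the intended usage, assembled from the hunks in this patch:
the trailing "\ " keeps the inter-word space after the macro, and the
plural is written as "\IRQ s", which typesets as "IRQs".

    \newcommand{\IRQ}{IRQ}
    ... NMI handlers, \IRQ\ handlers, \co{softirq} handlers ...
    ... both \IRQ s and NMIs use the same code path ...
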
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
 defer/rcuapi.tex      |  6 +++---
 formal/dyntickrcu.tex | 50 +++++++++++++++++++++++++-------------------------
 perfbook.tex          |  2 ++
 rt/rt.tex             | 14 +++++++-------
 4 files changed, 37 insertions(+), 35 deletions(-)

diff --git a/defer/rcuapi.tex b/defer/rcuapi.tex
index 158ebff..d60a0dd 100644
--- a/defer/rcuapi.tex
+++ b/defer/rcuapi.tex
@@ -93,7 +93,7 @@ Read side overhead &
     Preempt disable/enable (free on non-\tco{PREEMPT}) &
 	BH disable/enable &
 	    Preempt disable/enable (free on non-\tco{PREEMPT}) &
-	        Simple instructions, irq disable/enable &
+	        Simple instructions, \IRQ\ disable/enable &
 		    Simple instructions, preempt disable/enable, memory barriers \\
 \hline
 Asynchronous update-side overhead &
@@ -420,12 +420,12 @@ and also by their scope, as follows:
 	(\co{softirq}) handlers.
 	RCU BH is global in scope.
 \item	RCU Sched: read-side critical sections must guarantee forward
-	progress against everything except for NMI and irq handlers,
+	progress against everything except for NMI and \IRQ\ handlers,
 	including \co{softirq} handlers.
 	RCU Sched is global in scope.
 \item	RCU (both classic and real-time): read-side critical sections
 	must guarantee forward progress against everything except for
-	NMI handlers, irq handlers, \co{softirq} handlers, and (in the
+	NMI handlers, \IRQ\ handlers, \co{softirq} handlers, and (in the
 	real-time case) higher-priority real-time tasks.
 	RCU is global in scope.
 \item	SRCU: read-side critical sections need not guarantee
diff --git a/formal/dyntickrcu.tex b/formal/dyntickrcu.tex
index ec3c78c..2ae41ae 100644
--- a/formal/dyntickrcu.tex
+++ b/formal/dyntickrcu.tex
@@ -1829,7 +1829,7 @@ This effort provided some lessons (re)learned:
 	is buggy.
 \item	{\bf Use of atomic instructions can simplify verification.}
 	Unfortunately, use of the \co{cmpxchg} atomic instruction
-	would also slow down the critical irq fastpath, so they
+	would also slow down the critical \IRQ\ fastpath, so they
 	are not appropriate in this case.
 \item	{\bf The need for complex formal verification often indicates
 	a need to re-think your design.}
@@ -1842,10 +1842,10 @@ the dynticks problem, which is presented in the next section.
 \label{sec:formal:Simplicity Avoids Formal Verification}

 The complexity of the dynticks interface for preemptible RCU is primarily
-due to the fact that both irqs and NMIs use the same code path and the
+due to the fact that both \IRQ s and NMIs use the same code path and the
 same state variables.
 This leads to the notion of providing separate code paths and variables
-for irqs and NMIs, as has been done for
+for \IRQ s and NMIs, as has been done for
 hierarchical RCU~\cite{PaulEMcKenney2008HierarchicalRCU}
 as indirectly suggested by
 Manfred Spraul~\cite{ManfredSpraul2008StateMachineRCU}.
@@ -1884,7 +1884,7 @@ and efficiently share dynticks state.
 In what follows, they can be thought of as independent per-CPU variables.

 The \co{dynticks_nesting}, \co{dynticks}, and \co{dynticks_snap} variables
-are for the irq code paths, and the \co{dynticks_nmi} and
+are for the \IRQ\ code paths, and the \co{dynticks_nmi} and
 \co{dynticks_nmi_snap} variables are for the NMI code paths, although
 the NMI code path will also reference (but not modify) the
 \co{dynticks_nesting} variable.
@@ -1895,18 +1895,18 @@ These variables are used as follows:
 	This counts the number of reasons that the corresponding
 	CPU should be monitored for RCU read-side critical sections.
 	If the CPU is in dynticks-idle mode, then this counts the
-	irq nesting level, otherwise it is one greater than the
-	irq nesting level.
+	\IRQ\ nesting level, otherwise it is one greater than the
+	\IRQ\ nesting level.
 \item	[\tco{dynticks}]
 	This counter's value is even if the corresponding CPU is
-	in dynticks-idle mode and there are no irq handlers currently
+	in dynticks-idle mode and there are no \IRQ\ handlers currently
 	running on that CPU, otherwise the counter's value is odd.
 	In other words, if this counter's value is odd, then the
 	corresponding CPU might be in an RCU read-side critical section.
 \item	[\tco{dynticks_nmi}]
 	This counter's value is odd if the corresponding CPU is
 	in an NMI handler, but only if the NMI arrived while this
-	CPU was in dyntick-idle mode with no irq handlers running.
+	CPU was in dyntick-idle mode with no \IRQ\ handlers running.
 	Otherwise, the counter's value will be even.
 \item	[\tco{dynticks_snap}]
 	This will be a snapshot of the \co{dynticks} counter, but
@@ -1924,11 +1924,11 @@ passed through a quiescent state during that interval.

 \QuickQuiz{}
 	But what happens if an NMI handler starts running before
-	an irq handler completes, and if that NMI handler continues
-	running until a second irq handler starts?
+	an \IRQ\ handler completes, and if that NMI handler continues
+	running until a second \IRQ\ handler starts?
 \QuickQuizAnswer{
 	This cannot happen within the confines of a single CPU.
-	The first irq handler cannot complete until the NMI handler
+	The first \IRQ\ handler cannot complete until the NMI handler
 	returns.
 	Therefore, if each of the \co{dynticks} and \co{dynticks_nmi}
 	variables have taken on an even value during a given time
@@ -1985,7 +1985,7 @@ These two functions are invoked from process context.
 Line~6 ensures that any prior memory accesses (which might
 include accesses from RCU read-side critical sections) are seen
 by other CPUs before those marking entry to dynticks-idle mode.
-Lines~7 and~12 disable and reenable irqs.
+Lines~7 and~12 disable and reenable \IRQ s.
 Line~8 acquires a pointer to the current CPU's \co{rcu_dynticks}
 structure, and
 line~9 increments the current CPU's \co{dynticks} counter, which
@@ -2038,7 +2038,7 @@ Figure~\ref{fig:formal:NMIs From Dynticks-Idle Mode}
 shows the \co{rcu_nmi_enter()} and \co{rcu_nmi_exit()} functions,
 which inform RCU of NMI entry and exit, respectively, from dynticks-idle
 mode.
-However, if the NMI arrives during an irq handler, then RCU will already
+However, if the NMI arrives during an \IRQ\ handler, then RCU will already
 be on the lookout for RCU read-side critical sections from this CPU,
 so lines~6 and~7 of \co{rcu_nmi_enter()} and lines~18 and~19
 of \co{rcu_nmi_exit()} silently return if \co{dynticks} is odd.
@@ -2091,7 +2091,7 @@ respectively.

 Figure~\ref{fig:formal:Interrupts From Dynticks-Idle Mode}
 shows \co{rcu_irq_enter()} and \co{rcu_irq_exit()}, which
-inform RCU of entry to and exit from, respectively, irq context.
+inform RCU of entry to and exit from, respectively, \IRQ\ context.
 Line~6 of \co{rcu_irq_enter()} increments \co{dynticks_nesting},
 and if this variable was already non-zero, line~7 silently returns.
 Otherwise, line~8 increments \co{dynticks}, which will then have
@@ -2099,18 +2099,18 @@ an odd value, consistent with the fact that this CPU can now
 execute RCU read-side critical sections.
 Line~10 therefore executes a memory barrier to ensure that
 the increment of \co{dynticks} is seen before any
-RCU read-side critical sections that the subsequent irq handler
+RCU read-side critical sections that the subsequent \IRQ\ handler
 might execute.

 Line~18 of \co{rcu_irq_exit()} decrements \co{dynticks_nesting}, and
 if the result is non-zero, line~19 silently returns.
 Otherwise, line~20 executes a memory barrier to ensure that the
 increment of \co{dynticks} on line~21 is seen after any RCU
-read-side critical sections that the prior irq handler might have executed.
+read-side critical sections that the prior \IRQ\ handler might have executed.
 Line~22 verifies that \co{dynticks} is now even, consistent with
 the fact that no RCU read-side critical sections may appear in
 dynticks-idle mode.
-Lines~23-25 check to see if the prior irq handlers enqueued any
+Lines~23-25 check to see if the prior \IRQ\ handlers enqueued any
 RCU callbacks, forcing this CPU out of dynticks-idle mode via
 a reschedule API if so.

@@ -2159,7 +2159,7 @@ Figures~\ref{fig:formal:Entering and Exiting Dynticks-Idle Mode},
 Lines~11 and~12 record the snapshots for later calls to
 \co{rcu_implicit_dynticks_qs()},
 and lines~13 and~14 check to see if the CPU is in dynticks-idle mode with
-neither irqs nor NMIs in progress (in other words, both snapshots
+neither \IRQ s nor NMIs in progress (in other words, both snapshots
 have even values), hence in an extended quiescent state.
 If so, lines~15 and~16 count this event, and line~17 returns
 true if the CPU was in a quiescent state.
@@ -2225,15 +2225,15 @@ waiting for a CPU that is offline.
 	This is still pretty complicated.
 	Why not just have a \co{cpumask_t} that has a bit set for
 	each CPU that is in dyntick-idle mode, clearing the bit
-	when entering an irq or NMI handler, and setting it upon
+	when entering an \IRQ\ or NMI handler, and setting it upon
 	exit?
 \QuickQuizAnswer{
 	Although this approach would be functionally correct, it
-	would result in excessive irq entry/exit overhead on
+	would result in excessive \IRQ\ entry/exit overhead on
 	large machines.
 	In contrast, the approach laid out in this section allows
-	each CPU to touch only per-CPU data on irq and NMI entry/exit,
-	resulting in much lower irq entry/exit overhead, especially
+	each CPU to touch only per-CPU data on \IRQ\ and NMI entry/exit,
+	resulting in much lower \IRQ\ entry/exit overhead, especially
 	on large machines.
 } \QuickQuizEnd

@@ -2243,9 +2243,9 @@ waiting for a CPU that is offline.
 A slight shift in viewpoint resulted in a substantial simplification
 of the dynticks interface for RCU.
 The key change leading to this simplification was minimizing of
-sharing between irq and NMI contexts.
+sharing between \IRQ\ and NMI contexts.
 The only sharing in this simplified interface is references from NMI
-context to irq variables (the \co{dynticks} variable).
+context to \IRQ\ variables (the \co{dynticks} variable).
 This type of sharing is benign, because the NMI functions never update
 this variable, so that its value remains constant through the lifetime
 of the NMI handler.
@@ -2254,6 +2254,6 @@ understood one at a time, in happy contrast to the situation
 described in
 Section~\ref{sec:formal:Promela Parable: dynticks and Preemptible RCU},
 where an NMI might change shared state at any point during execution of
-the irq functions.
+the \IRQ\ functions.

 Verification can be a good thing, but simplicity is even better.
diff --git a/perfbook.tex b/perfbook.tex
index dc28079..906d71b 100644
--- a/perfbook.tex
+++ b/perfbook.tex
@@ -142,6 +142,8 @@
 \newcommand{\GNUC}{GNU~C}
 \newcommand{\GCC}{GCC}
 %\newcommand{\GCC}{\co{gcc}} % For those who prefer "gcc"
+\newcommand{\IRQ}{IRQ}
+%\newcommand{\IRQ}{irq}      % For those who prefer "irq"

 \newcommand{\Epigraph}[2]{\epigraphhead[65]{\rmfamily\epigraph{#1}{#2}}}

diff --git a/rt/rt.tex b/rt/rt.tex
index 21e7117..f1c0ae1 100644
--- a/rt/rt.tex
+++ b/rt/rt.tex
@@ -995,20 +995,20 @@ indefinitely, thus indefinitely degrading real-time latencies.

 One way of addressing this problem is the use of threaded interrupts shown in
 Figure~\ref{fig:rt:Threaded Interrupt Handler}.
-Interrupt handlers run in the context of a preemptible IRQ thread,
+Interrupt handlers run in the context of a preemptible \IRQ\ thread,
 which runs at a configurable priority.
 The device interrupt handler then runs for only a short time, just
-long enough to make the IRQ thread aware of the new event.
+long enough to make the \IRQ\ thread aware of the new event.
 As shown in the figure, threaded interrupts can greatly improve
 real-time latencies, in part because interrupt handlers running in
-the context of the IRQ thread may be preempted by high-priority real-time
+the context of the \IRQ\ thread may be preempted by high-priority real-time
 threads.

 However, there is no such thing as a free lunch, and there are downsides
 to threaded interrupts.
 One downside is increased interrupt latency.
 Instead of immediately running the interrupt handler, the handler's execution
-is deferred until the IRQ thread gets around to running it.
+is deferred until the \IRQ\ thread gets around to running it.
 Of course, this is not a problem unless the device generating the interrupt
 is on the real-time application's critical path.

@@ -1025,16 +1025,16 @@ which can be caused by, among other things, locks acquired by
 preemptible interrupt handlers~\cite{LuiSha1990PriorityInheritance}.
 Suppose that a low-priority thread holds a lock, but is preempted by
 a group of medium-priority threads, at least one such thread per CPU.
-If an interrupt occurs, a high-priority IRQ thread will preempt one
+If an interrupt occurs, a high-priority \IRQ\ thread will preempt one
 of the medium-priority threads, but only until it decides to acquire
 the lock held by the low-priority thread.
 Unfortunately, the low-priority thread cannot release the lock until
 it starts running, which the medium-priority threads prevent it from
 doing.
-So the high-priority IRQ thread cannot acquire the lock until after one
+So the high-priority \IRQ\ thread cannot acquire the lock until after one
 of the medium-priority threads releases its CPU.
 In short, the medium-priority threads are indirectly blocking the
-high-priority IRQ threads, a classic case of priority inversion.
+high-priority \IRQ\ threads, a classic case of priority inversion.

 Note that this priority inversion could not happen with non-threaded
 interrupts because the low-priority thread would have to disable interrupts
-- 
2.7.4




* [PATCH 09/10] future/QC: Use upright glyph for math constant and descriptive suffix
  2017-10-05 15:47 [PATCH 00/10] Tweaks to follow guidelines in style guide Akira Yokosawa
                   ` (7 preceding siblings ...)
  2017-10-05 15:56 ` [PATCH 08/10] treewide: Use "IRQ" instead of "irq" used as abbreviation Akira Yokosawa
@ 2017-10-05 15:59 ` Akira Yokosawa
  2017-10-05 16:00 ` [PATCH 10/10] styleguide: Reflect recent style improvements Akira Yokosawa
  2017-10-05 20:48 ` [PATCH 00/10] Tweaks to follow guidelines in style guide Paul E. McKenney
  10 siblings, 0 replies; 12+ messages in thread
From: Akira Yokosawa @ 2017-10-05 15:59 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

From 22146e245ef551489d65834ac4f766176f53bfaf Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sat, 30 Sep 2017 20:22:13 +0900
Subject: [PATCH 09/10] future/QC: Use upright glyph for math constant and descriptive suffix

To have access to a larger set of Greek glyphs and improved math mode
typesetting, substitute "newtxtext" and "newtxmath" packages for
the "mathptmx" package.

Uppercase Greek letters are now slanted by default.

To specify upright Greek letters, you can use commands such as
"\upDelta" and "\uppi" provided by the newtxmath package. In QC.tex,
"pi" is used to represent the circular constant and "Delta" is used to
represent the difference operator. In these cases upright glyphs should be
used.

The NIST style guide also recommends that descriptive suffixes be
upright. To avoid repetitive use of the \mathrm{} command, the macros
"\TLo", "\THi", and "\CPf" are defined locally in QC.tex.

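As a sketch assembled from the QC.tex hunks below, the upright forms
look like this in the source:

    \newcommand{\TLo}{T_\mathrm{L}}
    \newcommand{\THi}{T_\mathrm{H}}
    \newcommand{\CPf}{C_\mathrm{P}}

    $\frac{\uppi}{2}$ radians
    $\upDelta E \geq \frac{\hbar}{2 \upDelta t}$
    $\CPf = \frac{\TLo}{\THi - \TLo}$
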
Also use the mathcal font for Big O.[1]

NOTE 1: For target "1csf", we now use the "newtxsf" package, which
also provides a larger set of Greek glyphs. However, it is not available
on TeX Live 2013/Debian. Furthermore, it uses a different upright font in
math mode than in text mode. You can distinguish math mode figures from
text mode figures in this target, but the difference looks acceptable.
The font choice for this target can be changed should a better font
combination be found.

NOTE 2: On TeX Live 2013/Debian, newtxmath has a few spacing issues.
They are fixed on TeX Live 2015/Debian, which is available on
Ubuntu Xenial. Both newtxtext and newtxmath are actively updated.
See https://www.ctan.org/pkg/newtx.

[1]: https://texblog.org/2014/06/24/big-o-and-related-notations-in-latex/

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
 FAQ-BUILD.txt                      |  4 +--
 Makefile                           |  2 +-
 appendix/styleguide/styleguide.tex |  6 ++--
 future/QC.tex                      | 62 ++++++++++++++++++++------------------
 perfbook.tex                       | 12 +++++---
 5 files changed, 46 insertions(+), 40 deletions(-)

diff --git a/FAQ-BUILD.txt b/FAQ-BUILD.txt
index 1fa581d..e277d95 100644
--- a/FAQ-BUILD.txt
+++ b/FAQ-BUILD.txt
@@ -114,9 +114,9 @@
 		directory.

 		Following is a list of links to optional packages as of
-		March 2017:
+		October 2017:

 			https://www.ctan.org/pkg/newtxtt
 			https://www.ctan.org/pkg/nimbus15
 			https://www.ctan.org/pkg/inconsolata
-			https://www.ctan.org/pkg/mathastext
+			https://www.ctan.org/pkg/newtxsf
diff --git a/Makefile b/Makefile
index 8799313..d8d23c7 100644
--- a/Makefile
+++ b/Makefile
@@ -134,7 +134,7 @@ perfbook-msnt.tex: perfbook.tex
 perfbook-1csf.tex: perfbook-1c.tex
 	sed -e 's/setboolean{sansserif}{false}/setboolean{sansserif}{true}/' \
 	    -e 's/%msfontstub/\\usepackage[var0]{inconsolata}[2013\/07\/17]/' < $< > $@
-	@echo "## This target requires recent version (>= 1.3i) of mathastext. ##"
+	@echo "## This target requires math font package newtxsf. ##"

 # Rules related to perfbook_html are removed as of May, 2016

diff --git a/appendix/styleguide/styleguide.tex b/appendix/styleguide/styleguide.tex
index bb100a9..bec700c 100644
--- a/appendix/styleguide/styleguide.tex
+++ b/appendix/styleguide/styleguide.tex
@@ -940,7 +940,7 @@ with the help of ``booktabs'' and ``xcolor'' packages.
 \begin{tabular}{lrrr}\toprule
 Situation
 	& $T$ (K)
-		& $C_P$	& \parbox[b]{.75in}{\raggedleft Power per watt\par waste heat (W)} \\
+		& $\CPf$ & \parbox[b]{.75in}{\raggedleft Power per watt\par waste heat (W)} \\
 \midrule
 Dry Ice
 	& $195$
@@ -1320,7 +1320,7 @@ with dashed horizontal and vertical rules of the arydshln package.
 \begin{tabular}{l:r:r:r}\toprule
 Situation
 	& $T$ (K)
-		& $C_P$	& \parbox[b]{.75in}{\raggedleft Power per watt\par waste heat (W)} \\
+		& $\CPf$ & \parbox[b]{.75in}{\raggedleft Power per watt\par waste heat (W)} \\
 \hline
 Dry Ice
 	& $195$
@@ -1356,7 +1356,7 @@ Table~\ref{tab:app:styleguide:Refrigeration Power Consumption (arydshln-2)}.
 \begin{tabular}{lrrr}\toprule
 Situation
 	& $T$ (K)
-		& $C_P$	& \parbox[b]{.75in}{\raggedleft Power per watt\par waste heat (W)} \\
+		& $\CPf$ & \parbox[b]{.75in}{\raggedleft Power per watt\par waste heat (W)} \\
 \midrule
 Dry Ice
 	& $195$
diff --git a/future/QC.tex b/future/QC.tex
index daf0086..349c4ed 100644
--- a/future/QC.tex
+++ b/future/QC.tex
@@ -329,7 +329,7 @@ are as follows:

 \begin{description}
 \item[\qop{H}\,:]
-	Rotate 180\degree{} ($\pi$ radians) about the Bloch-sphere
+	Rotate 180\degree{} ($\uppi$ radians) about the Bloch-sphere
 	X-Z axis, that is, about the 45\degree{} line on the
 	X-Z plane.  This rotates $\ket{0}$ to the point at which the
 	positive X\=/axis intersects the Bloch sphere, and rotates $\ket{1}$
@@ -337,31 +337,31 @@ are as follows:
 	sphere.
 	Either way, we get a qubit that is 50\,\% one and 50\,\% zero.
 \item[\qop{S}\,:]
-	Rotate 90\degree{} ($\frac{\pi}{2}$ radians) about the
+	Rotate 90\degree{} ($\frac{\uppi}{2}$ radians) about the
 	Bloch-sphere Z\=/axis, which has no effect on qubits in the
 	$\ket{0}$ or $\ket{1}$ states.
 \item[\qop{S}$^{\bm{\dagger}}$:]
-	Rotate $-90\degree$ ($-\frac{\pi}{2}$ radians) about the
+	Rotate $-90\degree$ ($-\frac{\uppi}{2}$ radians) about the
 	Bloch-sphere Z\=/axis, which has no effect on qubits in the
 	$\ket{0}$ or $\ket{1}$ states.
 	This operator is the inverse of \qop{S}.
 \item[\qop{T}\,:]
-	Rotate 45\degree{} ($\frac{\pi}{4}$ radians) about the
+	Rotate 45\degree{} ($\frac{\uppi}{4}$ radians) about the
 	Bloch-sphere Z\=/axis, which has no effect on qubits in the
 	$\ket{0}$ or $\ket{1}$ states.
 \item[\qop{T}$^{\bm{\dagger}}$:]
-	Rotate $-45\degree$ ($-\frac{\pi}{4}$ radians) about the
+	Rotate $-45\degree$ ($-\frac{\uppi}{4}$ radians) about the
 	Bloch-sphere Z\=/axis, which has no effect on qubits in the
 	$\ket{0}$ or $\ket{1}$ states.
 	This operator is the inverse of \qop{T}.
 \item[\qop{X}\,:]
-	Rotate 180\degree{} ($\pi$ radians) about the Bloch-sphere
+	Rotate 180\degree{} ($\uppi$ radians) about the Bloch-sphere
 	X\=/axis, which takes $\ket{0}$ to $\ket{1}$ and vice versa.
 \item[\qop{Y}\,:]
-	Rotate 180\degree{} ($\pi$ radians) about the Bloch-sphere
+	Rotate 180\degree{} ($\uppi$ radians) about the Bloch-sphere
 	Y\=/axis, which also takes $\ket{0}$ to $\ket{1}$ and vice versa.
 \item[\qop{Z}\,:]
-	Rotate 180\degree{} ($\pi$ radians) about the Bloch-sphere
+	Rotate 180\degree{} ($\uppi$ radians) about the Bloch-sphere
 	Z\=/axis, which has no effect on qubits in the $\ket{0}$ or
 	$\ket{1}$ states.
 \end{description}
@@ -606,11 +606,11 @@ However, because of its thermodynamic reversibiltiy,
 QC is governed by an even lower limit:

 \begin{equation}
-	\Delta E \geq \frac{\hbar}{2 \Delta t}
+	\upDelta E \geq \frac{\hbar}{2 \upDelta t}
 \end{equation}

-Here $\Delta E$ is the energy required to change the qubit in Joules,
-$\Delta t$ is the time taken to change the qubit in seconds, and
+Here $\upDelta E$ is the energy required to change the qubit in Joules,
+$\upDelta t$ is the time taken to change the qubit in seconds, and
 $\hbar$ is Planck's constant, which is $6.62 \times 10^{-34}$\,J$\cdot$s.
 For the 50-nanosecond switching times of IBM's Quantum Experience
 hardware, this limit is $5.52 \times 10^{-27}$\,J, more than an order
@@ -631,12 +631,16 @@ program.
 Unfortunately, it is not just the amount of heat generated that is
 important, but also the temperature at which this heat is generated.

+\newcommand{\TLo}{T_\mathrm{L}}
+\newcommand{\THi}{T_\mathrm{H}}
+\newcommand{\CPf}{C_\mathrm{P}}
+
 The thermodynamic theoretical limit on the ability of a refrigerator
-to transport heat from a low temperature ($T_L$) to a high temperature
-($T_H$) is given by the coefficient of performance ($C_P$):
+to transport heat from a low temperature ($\TLo$) to a high temperature
+($\THi$) is given by the coefficient of performance ($\CPf$):

 \begin{equation}
-	C_P = \frac{T_L}{T_H - T_L}
+	\CPf = \frac{\TLo}{\THi - \TLo}
 \end{equation}

 \begin{table}
@@ -664,9 +668,9 @@ fancifully illustrated in
 Table~\ref{tab:future:The Three Laws of Thermodynamics}.

 The nominal temperature for IBM~Q is 15~millikelvins, which certainly
-qualifies as a low $T_L$.
-Let's assume $T_H$ is 293\,K (room temperature),
-in which case $C_P$ is $0.000051$.
+qualifies as a low $\TLo$.
+Let's assume $\THi$ is 293\,K (room temperature),
+in which case $\CPf$ is $0.000051$.
 This in turn means that it requires \emph{at least} one watt of
 power into the refrigeration unit to transport $0.000051$~watts
 of waste heat from the 15~millikelvin IBM~Q out to room temperature.
@@ -692,7 +696,7 @@ at low temperatures.\footnote{
 	&	&	& Power per watt \\
 Situation
 	& $T$ (K)
-		& $C_P$	& waste heat (W) \\
+		& $\CPf$ & waste heat (W) \\
 \hline
 \hline
 Dry Ice
@@ -996,27 +1000,27 @@ it to be not too early to start thinking in terms of replacing RSA.
 \label{sec:future:Grover's Search Algorithm}

 Grover's algorithm searches an unordered list of $N$ items
-in $O(\sqrt N)$ time.
+in $\O{\sqrt N}$ time.
 This is mainly intended for implicit search for solutions as opposed
 to searching through data.
 To see why, keep in mind that before any data can be searched,
 that data list must be downloaded into the QC system, and that
-this download will have computational complexity $O(n)$, where
+this download will have computational complexity $\O{n}$, where
 $n$ is the number of data items.
 The competing classical system can use this time to sort the data
 or to construct any desired index over the data, and the computational
-complexity of these operations can be considered to be $O(n \log_2 n)$,
+complexity of these operations can be considered to be $\O{n \log_2 n}$,
 after which the classical
-system can carry out the search in $O(\log N)$ time, which
-is much faster than the $O(\sqrt N)$ time promised by
+system can carry out the search in $\O{\log N}$ time, which
+is much faster than the $\O{\sqrt N}$ time promised by
 Grover's algorithm.

 \QuickQuiz{}
-	What do you mean $O(n)$ for classic-computing sorting/indexing
-	and $O(n \log_2 n)$ for classic-computing search?
-	Hash tables do $O(n)$ and $O(1)$ respectively!!!
+	What do you mean $\O{n}$ for classic-computing sorting/indexing
+	and $\O{n \log_2 n}$ for classic-computing search?
+	Hash tables do $\O{n}$ and $\O{1}$ respectively!!!
 \QuickQuizAnswer{
-	Fixed-size hash table lookups are $O(n)$, not $O(1)$.
+	Fixed-size hash table lookups are $\O{n}$, not $\O{1}$.
 	And for a resizing hash table, fairness dictates that the overhead
 	of resizing be properly accounted for.

@@ -1049,7 +1053,7 @@ computing:

 Of course, one can pick $n$ and $m$ to favor either approach.
 It makes little sense to choose small $m$ because the winner of that
-race is a simple $O(n)$ sequential scan.
+race is a simple $\O{n}$ sequential scan.
 More interesting scenarios use larger values of $m$.

 The first scenario looks at
@@ -1185,7 +1189,7 @@ That said, this analysis has some limitations:
 \item	Explicit lists are assumed.
 	Implicit lists might well favor quantum computing.
 \item	Traditional sorting and indexing is assumed to result in
-	the traditional $O(\log N)$ computational complexity for
+	the traditional $\O{\log N}$ computational complexity for
 	classic-computing search.
 \item	Quantum computing is assumed to be capable of handling
 	very large data sets.
diff --git a/perfbook.tex b/perfbook.tex
index 906d71b..f36b7ca 100644
--- a/perfbook.tex
+++ b/perfbook.tex
@@ -7,9 +7,9 @@
 % A more pleasant font
 \usepackage{lmodern}
 \usepackage[T1]{fontenc} % use postscript type 1 fonts
+\usepackage[defaultsups]{newtxtext} % use nice, standard fonts for roman
 \usepackage{textcomp} % use symbols in TS1 encoding
-\usepackage{mathptmx} % use nice, standard fonts for roman
-\usepackage[scaled=.92]{helvet} % and sans serif
+\renewcommand*\ttdefault{lmtt}
 %msfontstub

 % Improves the text layout
@@ -85,9 +85,10 @@
 \IfSansSerif{
 \renewcommand{\familydefault}{\sfdefault}
 \normalfont
-\usepackage[italic]{mathastext}[2016/01/06]
-\renewcommand{\path}[1]{\nolinkurl{#1}} % workaround of interference with mathastext
-}{}
+\usepackage[slantedGreek,scaled=.96]{newtxsf}
+}{
+\usepackage[slantedGreek]{newtxmath} % math package to be used with newtxtext
+}

 \newcommand{\LstLineNo}{\makebox[5ex][r]{\arabic{VerbboxLineNo}\hspace{2ex}}}

@@ -138,6 +139,7 @@
 \newcommand{\qop}[1]{{\sffamily #1}} % QC operator such as H, T, S, etc.

 \DeclareRobustCommand{\euler}{\ensuremath{\mathrm{e}}}
+\DeclareRobustCommand{\O}[1]{\ensuremath{\mathcal{O}(#1)}}
 \newcommand{\Power}[1]{POWER#1}
 \newcommand{\GNUC}{GNU~C}
 \newcommand{\GCC}{GCC}
-- 
2.7.4
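
For readers who want to try the macros from this patch outside the book's build, the following is a minimal sketch, assuming a TeX Live installation recent enough to ship newtxtext and newtxmath. The standalone preamble is an assumption for illustration and is not perfbook.tex itself. The macro definitions are copied from the hunks above, and the arithmetic uses the 15 mK and 293 K figures from the QC text.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[defaultsups]{newtxtext}   % Times-like text font, as in the patch
\usepackage[slantedGreek]{newtxmath}  % matching math font with upright Greek available

% Definitions copied from the patch
\DeclareRobustCommand{\O}[1]{\ensuremath{\mathcal{O}(#1)}}
\newcommand{\TLo}{T_\mathrm{L}}
\newcommand{\THi}{T_\mathrm{H}}
\newcommand{\CPf}{C_\mathrm{P}}

\begin{document}

% Upright Delta marks "change in", italic E and t are variables
\begin{equation}
  \upDelta E \geq \frac{\hbar}{2 \upDelta t}
\end{equation}

% Coefficient of performance with T_L = 15 mK and T_H = 293 K
\begin{equation}
  \CPf = \frac{\TLo}{\THi - \TLo}
       = \frac{0.015\,\mathrm{K}}{293\,\mathrm{K} - 0.015\,\mathrm{K}}
       \approx 5.1 \times 10^{-5}
\end{equation}

% Order-of notation through the new macro
Grover's algorithm searches $N$ items in $\O{\sqrt N}$ time.

\end{document}
```

Note that \O is normally the text command for the letter "Ø", so a document that needs that glyph would have to pick a different macro name.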




* [PATCH 10/10] styleguide: Reflect recent style improvements
  2017-10-05 15:47 [PATCH 00/10] Tweaks to follow guidelines in style guide Akira Yokosawa
                   ` (8 preceding siblings ...)
  2017-10-05 15:59 ` [PATCH 09/10] future/QC: Use upright glyph for math constant and descriptive suffix Akira Yokosawa
@ 2017-10-05 16:00 ` Akira Yokosawa
  2017-10-05 20:48 ` [PATCH 00/10] Tweaks to follow guidelines in style guide Paul E. McKenney
  10 siblings, 0 replies; 12+ messages in thread
From: Akira Yokosawa @ 2017-10-05 16:00 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

From 2890e0069882321553c16aac213d4bb8d0a06fb7 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Wed, 4 Oct 2017 08:18:50 +0900
Subject: [PATCH 10/10] styleguide: Reflect recent style improvements

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
 appendix/styleguide/styleguide.tex | 58 ++++++++++++--------------------------
 1 file changed, 18 insertions(+), 40 deletions(-)

diff --git a/appendix/styleguide/styleguide.tex b/appendix/styleguide/styleguide.tex
index bec700c..d7ee97d 100644
--- a/appendix/styleguide/styleguide.tex
+++ b/appendix/styleguide/styleguide.tex
@@ -181,29 +181,14 @@ Example:
   $45\degree$, rather than $45\,\degree$.
 \end{quote}

-\subsection{NIST Guide Yet To Be Followed}
-\label{sec:app:styleguide:NIST Guides Yet To Be Followed}
-
-There are a few cases where NIST style guide is not followed.
-Other English conventions are followed in such cases.
-NIST rules of
-Sections~\ref{sec:app:styleguide:Percent Symbol}
-and~\ref{sec:app:styleguide:Font Style}
-are deemed acceptable by the editor.
-Contributions in those areas should be welcome.
-
 \subsubsection{Percent Symbol}
 \label{sec:app:styleguide:Percent Symbol}

 NIST style guide treats the percent symbol (\%) as the same as SI unit
 symbols.
-In this textbook, no space is placed in front of a percent symbol.

 \begin{quote}
-\begin{tabular}{ll}
-  NIST guide:& 50\,\% possibility\\
-  Current convention:& 50\% possibility\\
-\end{tabular}
+  50\,\% possibility, rather than 50\% possibility.
 \end{quote}

 \subsubsection{Font Style}
@@ -235,12 +220,15 @@ For example,
   $\mathrm{e}^x$
 \end{quote}

-In this textbook, this rule is not much considered as of this writing.
-Most letters in math mode are italic regardless of what they
-represent. Exceptions are uppercase Greek letters, which are upright
-in math mode by default.\footnote{
-  See \url{https://tex.stackexchange.com/questions/119248/}
-  for the historical reason.}
+%\footnote{
+%  See \url{https://tex.stackexchange.com/questions/119248/}
+%  for the historical reason.}
+
+\subsection{NIST Guide Yet To Be Followed}
+\label{sec:app:styleguide:NIST Guides Yet To Be Followed}
+
+There are a few cases where NIST style guide is not followed.
+Other English conventions are followed in such cases.

 \subsubsection{Digit Grouping}
 \label{sec:app:styleguide:Digit Grouping}
@@ -670,7 +658,14 @@ Example with an en dash:
 \label{sec:app:styleguide:Numerical Minus Sign}

 Numerical minus signs should be coded as math mode minus signs,
-namely \qco{$-$}. For example,
+namely \qco{$-$}.\footnote{This rule assumes that math mode uses the
+  same upright glyph as text mode. Our default font choice meets
+  the assumption.
+\IfSansSerif{
+  One of the experimental targets ``1csf'' \emph{does} use a different font
+  for math mode figures as of October 2017.}{}
+}
+For example,

 \begin{quote}
   $-30$, rather than -30.
@@ -1399,28 +1394,11 @@ for examples of tables with complex headings.
 Other improvement candidates are listed in the source of this
 section as comments.

-% Capitalize initialism:
-%    Gnu Compiler Collection = GCC
-%    gcc should be used as a command name in \co{gcc}
-%    When mentioning GCC's C language, use `GNU C'
-%
 % Trademarks:
 %    As the Legal page covers trademarks, there is no need to
 %    use trademark symbol in the text. They seems to have been
 %    imported from original publications.
 %
-% Power or POWER?
-%    IBM's trademark page at https://www.ibm.com/legal/us/en/copytrade.shtml#section-P
-%    lists the following.
-%        PowerPC
-%        Power Architecture
-%        Power
-%        POWER
-%        POWER5
-%        POWER6
-%
-%    not Power5, POWER 5, nor Power-5
-%
 % Ugly line break by \co{}
 %                                 __
 %        atomic_store()
-- 
2.7.4
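
The conventions this patch promotes to "followed" status can be seen side by side in a small standalone file. This is a sketch under the assumption of a plain article class, not an excerpt from styleguide.tex. The three examples themselves are the ones quoted in the hunks above.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}

\begin{document}

% Percent symbol: narrow space in front, as for SI unit symbols
50\,\% possibility, rather than 50\% possibility.

% Math constant: upright "e" for Euler's number
$\mathrm{e}^x$, rather than $e^x$.

% Numerical minus sign: math-mode minus, not a text hyphen
$-30$, rather than -30.

\end{document}
```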




* Re: [PATCH 00/10] Tweaks to follow guidelines in style guide
  2017-10-05 15:47 [PATCH 00/10] Tweaks to follow guidelines in style guide Akira Yokosawa
                   ` (9 preceding siblings ...)
  2017-10-05 16:00 ` [PATCH 10/10] styleguide: Reflect recent style improvements Akira Yokosawa
@ 2017-10-05 20:48 ` Paul E. McKenney
  10 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2017-10-05 20:48 UTC (permalink / raw)
  To: Akira Yokosawa; +Cc: perfbook

On Fri, Oct 06, 2017 at 12:47:21AM +0900, Akira Yokosawa wrote:
> From 2890e0069882321553c16aac213d4bb8d0a06fb7 Mon Sep 17 00:00:00 2001
> From: Akira Yokosawa <akiyks@gmail.com>
> Date: Wed, 5 Oct 2017 22:57:46 +0900
> Subject: [PATCH 00/10] Tweaks to follow guidelines in style guide
> 
> Hi Paul,
> 
> This patch set consists of minor tweaks in regard to the suggestions
> having been presented in style guide for a while.
> 
> Patches #1 -- #5 are trivial changes.
> 
> Patch #6 attempts to improve consistency in denoting POWER series CPU
> by defining a macro "\Power{}".
> 
> Patch #7 substitutes "GCC" for "gcc". There are a few exceptions as
> mentioned in commit log.
> 
> Patch #8 substitutes "IRQ" for "irq" in the same way. You might like
> to skip this one, as I see "irq" more often than "IRQ" in Linux
> documentations.
> 
> Patch #9 is somewhat invasive. It switches Times font to that of
> "newtxtext" and "newtxmath" packages. The reason of the change
> is to have access to both upright and slanted glyphs of Greek letters.
> Recent versions of these font packages give better looking result,
> especially in math mode. As noted in the commit log, newtxmath in
> TeX Live 2013/Debian has a few issues which have been fixed in later
> versions. It also switches font choice for the experimental target "1csf".
> 
> Patch #10 updates style guide to reflect the changes made in this
> patch set.

They look fine, so I applied them, thank you!  We might want to go with
"irq", but let's see how people react.  Easy to change, aside from
beginnings of sentences!

							Thanx, Paul

>         Thanks, Akira
> --
> Akira Yokosawa (10):
>   debugging: Insert narrow space in front of percent symbol
>   debugging: Use upright font for Euler's number
>   future/QC: Insert narrow space in front of percent symbol
>   future/QC: Use non-breakable hyphen for axis names
>   treewide: Insert narrow space in front of percent symbol
>   treewide: Use \Power{} macro for POWER CPU family
>   treewide: Call GNU C compiler as "GCC"
>   treewide: Use "IRQ" instead of "irq" used as abbreviation
>   future/QC: Use upright glyph for math constant and descriptive suffix
>   styleguide: Reflect recent style improvements
> 
>  FAQ-BUILD.txt                      |   4 +-
>  Makefile                           |   2 +-
>  SMPdesign/SMPdesign.tex            |   2 +-
>  SMPdesign/beyond.tex               |  14 ++---
>  advsync/advsync.tex                |   2 +-
>  appendix/styleguide/styleguide.tex |  64 ++++++++---------------
>  appendix/toyrcu/toyrcu.tex         |  26 +++++-----
>  count/count.tex                    |  28 +++++-----
>  cpu/hwfreelunch.tex                |   4 +-
>  datastruct/datastruct.tex          |   2 +-
>  debugging/debugging.tex            | 104 ++++++++++++++++++-------------------
>  defer/rcuapi.tex                   |   6 +--
>  defer/rcuusage.tex                 |   4 +-
>  formal/dyntickrcu.tex              |  52 +++++++++----------
>  formal/formal.tex                  |   2 +-
>  formal/spinhint.tex                |   2 +-
>  future/QC.tex                      |  92 ++++++++++++++++----------------
>  future/htm.tex                     |   2 +-
>  future/tm.tex                      |   4 +-
>  intro/intro.tex                    |   8 +--
>  memorder/memorder.tex              |  32 ++++++------
>  perfbook.tex                       |  20 +++++--
>  rt/rt.tex                          |  22 ++++----
>  toolsoftrade/toolsoftrade.tex      |  24 ++++-----
>  24 files changed, 257 insertions(+), 265 deletions(-)
> 
> -- 
> 2.7.4
> 
> 

