* [PATCH] Convert code snippets to 'listing' env (howto, toolsoftrade, count)
@ 2017-10-10 22:40 Akira Yokosawa
2017-10-10 23:38 ` Paul E. McKenney
0 siblings, 1 reply; 2+ messages in thread
From: Akira Yokosawa @ 2017-10-10 22:40 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From d3f43d97fdbcbe68b6e5936b8d94c50b78bfeaaf Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Mon, 10 Oct 2017 22:18:17 +0900
Subject: [PATCH] Convert code snippets to 'listing' env (howto, toolsoftrade, count)
Also fix a typo referring table by "Figure" in count.tex.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
Hi Paul,
Looks like we have some time to do the conversion to "listing" environments
before the upcoming release.
This is the 1st round of the conversion.
Thanks, Akira
--
count/count.tex | 276 +++++++++++++++++++++---------------------
defer/rcuexercises.tex | 2 +-
howto/howto.tex | 20 +--
owned/owned.tex | 28 ++---
together/applyrcu.tex | 16 +--
toolsoftrade/toolsoftrade.tex | 140 ++++++++++-----------
6 files changed, 241 insertions(+), 241 deletions(-)
diff --git a/count/count.tex b/count/count.tex
index a213558..9638971 100644
--- a/count/count.tex
+++ b/count/count.tex
@@ -165,7 +165,7 @@ are more appropriate for advanced students.
\section{Why Isn't Concurrent Counting Trivial?}
\label{sec:count:Why Isn't Concurrent Counting Trivial?}
-\begin{figure}[bp]
+\begin{listing}[bp]
{ \scriptsize
\begin{verbbox}
1 long counter = 0;
@@ -184,12 +184,12 @@ are more appropriate for advanced students.
\centering
\theverbbox
\caption{Just Count!}
-\label{fig:count:Just Count!}
-\end{figure}
+\label{lst:count:Just Count!}
+\end{listing}
Let's start with something simple, for example, the straightforward
use of arithmetic shown in
-Figure~\ref{fig:count:Just Count!} (\path{count_nonatomic.c}).
+Listing~\ref{lst:count:Just Count!} (\path{count_nonatomic.c}).
Here, we have a counter on line~1, we increment it on line~5, and we
read out its value on line~10.
What could be simpler?
@@ -242,7 +242,7 @@ accuracies far greater than 50\,\% are almost always necessary.
Furthermore, proofs can have bugs just as easily as programs can!
} \QuickQuizEnd
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 atomic_t counter = ATOMIC_INIT(0);
@@ -261,8 +261,8 @@ accuracies far greater than 50\,\% are almost always necessary.
\centering
\theverbbox
\caption{Just Count Atomically!}
-\label{fig:count:Just Count Atomically!}
-\end{figure}
+\label{lst:count:Just Count Atomically!}
+\end{listing}
\begin{figure}[tb]
\centering
@@ -273,7 +273,7 @@ accuracies far greater than 50\,\% are almost always necessary.
The straightforward way to count accurately is to use atomic operations,
as shown in
-Figure~\ref{fig:count:Just Count Atomically!} (\path{count_atomic.c}).
+Listing~\ref{lst:count:Just Count Atomically!} (\path{count_atomic.c}).
Line~1 defines an atomic variable, line~5 atomically increments it, and
line~10 reads it out.
Because this is atomic, it keeps perfect count.
@@ -491,7 +491,7 @@ thread (presumably cache aligned and padded to avoid false sharing).
Section~\ref{sec:count:Per-Thread-Variable-Based Implementation}.
} \QuickQuizEnd
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 DEFINE_PER_THREAD(long, counter);
@@ -515,11 +515,11 @@ thread (presumably cache aligned and padded to avoid false sharing).
\centering
\theverbbox
\caption{Array-Based Per-Thread Statistical Counters}
-\label{fig:count:Array-Based Per-Thread Statistical Counters}
-\end{figure}
+\label{lst:count:Array-Based Per-Thread Statistical Counters}
+\end{listing}
Such an array can be wrapped into per-thread primitives, as shown in
-Figure~\ref{fig:count:Array-Based Per-Thread Statistical Counters}
+Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters}
(\path{count_stat.c}).
Line~1 defines an array containing a set of per-thread counters of
type \co{long} named, creatively enough, \co{counter}.
@@ -559,7 +559,7 @@ normal loads suffice, and no special atomic instructions are required.
\QuickQuiz{}
How does the per-thread \co{counter} variable in
- Figure~\ref{fig:count:Array-Based Per-Thread Statistical Counters}
+ Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters}
get initialized?
\QuickQuizAnswer{
The C standard specifies that the initial value of
@@ -573,7 +573,7 @@ normal loads suffice, and no special atomic instructions are required.
\QuickQuiz{}
How is the code in
- Figure~\ref{fig:count:Array-Based Per-Thread Statistical Counters}
+ Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters}
supposed to permit more than one counter?
\QuickQuizAnswer{
Indeed, this toy example does not support more than one counter.
@@ -602,7 +602,7 @@ at the beginning of this chapter.
and during that time, the counter could well be changing.
This means that the value returned by
\co{read_count()} in
- Figure~\ref{fig:count:Array-Based Per-Thread Statistical Counters}
+ Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters}
will not necessarily be exact.
Assume that the counter is being incremented at rate
$r$ counts per unit time, and that \co{read_count()}'s
@@ -743,7 +743,7 @@ date, however, once updates cease, the global counter will eventually
converge on the true value---hence this approach qualifies as
eventually consistent.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 DEFINE_PER_THREAD(unsigned long, counter);
@@ -802,11 +802,11 @@ eventually consistent.
\centering
\theverbbox
\caption{Array-Based Per-Thread Eventually Consistent Counters}
-\label{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
-\end{figure}
+\label{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
+\end{listing}
The implementation is shown in
-Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
+Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
(\path{count_stat_eventual.c}).
Lines~1-2 show the per-thread variable and the global variable that
track the counter's value, and line three shows \co{stopflag}
@@ -814,7 +814,7 @@ which is used to coordinate termination (for the case where we want
to terminate the program with an accurate counter value).
The \co{inc_count()} function shown on lines~5-9 is similar to its
counterpart in
-Figure~\ref{fig:count:Array-Based Per-Thread Statistical Counters}.
+Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters}.
The \co{read_count()} function shown on lines~11-14 simply returns the
value of the \co{global_count} variable.
@@ -834,7 +834,7 @@ comes at the cost of the additional thread running \co{eventual()}.
\QuickQuiz{}
Why doesn't \co{inc_count()} in
- Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
+ Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
need to use atomic instructions?
After all, we now have multiple threads accessing the per-thread
counters!
@@ -872,7 +872,7 @@ comes at the cost of the additional thread running \co{eventual()}.
\QuickQuiz{}
Won't the single global thread in the function \co{eventual()} of
- Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
+ Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
be just as severe a bottleneck as a global lock would be?
\QuickQuizAnswer{
In this case, no.
@@ -883,7 +883,7 @@ comes at the cost of the additional thread running \co{eventual()}.
\QuickQuiz{}
Won't the estimate returned by \co{read_count()} in
- Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
+ Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
become increasingly
inaccurate as the number of threads rises?
\QuickQuizAnswer{
@@ -897,7 +897,7 @@ comes at the cost of the additional thread running \co{eventual()}.
\QuickQuiz{}
Given that in the eventually\-/consistent algorithm shown in
- Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
+ Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
both reads and updates have extremely low overhead
and are extremely scalable, why would anyone bother with the
implementation described in
@@ -934,7 +934,7 @@ comes at the cost of the additional thread running \co{eventual()}.
\subsection{Per-Thread-Variable-Based Implementation}
\label{sec:count:Per-Thread-Variable-Based Implementation}
-\begin{figure}[tb]
+\begin{listing}[tb]
{ \scriptsize
\begin{verbbox}
1 long __thread counter = 0;
@@ -984,13 +984,13 @@ comes at the cost of the additional thread running \co{eventual()}.
\centering
\theverbbox
\caption{Per-Thread Statistical Counters}
-\label{fig:count:Per-Thread Statistical Counters}
-\end{figure}
+\label{lst:count:Per-Thread Statistical Counters}
+\end{listing}
Fortunately, \GCC\ provides an \co{__thread} storage class that provides
per-thread storage.
This can be used as shown in
-Figure~\ref{fig:count:Per-Thread Statistical Counters} (\path{count_end.c})
+Listing~\ref{lst:count:Per-Thread Statistical Counters} (\path{count_end.c})
to implement
a statistical counter that not only scales, but that also incurs little
or no performance penalty to incrementers compared to simple non-atomic
@@ -1058,7 +1058,7 @@ Finally, line~22 returns the sum.
\QuickQuiz{}
Doesn't the check for \co{NULL} on line~19 of
- Figure~\ref{fig:count:Per-Thread Statistical Counters}
+ Listing~\ref{lst:count:Per-Thread Statistical Counters}
add extra branch mispredictions?
Why not have a variable set permanently to zero, and point
unused counter-pointers to that variable rather than setting
@@ -1074,7 +1074,7 @@ Finally, line~22 returns the sum.
\QuickQuiz{}
Why on earth do we need something as heavyweight as a \emph{lock}
guarding the summation in the function \co{read_count()} in
- Figure~\ref{fig:count:Per-Thread Statistical Counters}?
+ Listing~\ref{lst:count:Per-Thread Statistical Counters}?
\QuickQuizAnswer{
Remember, when a thread exits, its per-thread variables disappear.
Therefore, if we attempt to access a given thread's per-thread
@@ -1105,7 +1105,7 @@ array to point to its per-thread \co{counter} variable.
\QuickQuiz{}
Why on earth do we need to acquire the lock in
\co{count_register_thread()} in
- Figure~\ref{fig:count:Per-Thread Statistical Counters}?
+ Listing~\ref{lst:count:Per-Thread Statistical Counters}?
It is a single properly aligned machine-word store to a location
that no other thread is modifying, so it should be atomic anyway,
right?
@@ -1150,7 +1150,7 @@ variables vanish when that thread exits.
accessible, even if the corresponding CPU is offline---even
if the corresponding CPU never existed and never will exist.
-\begin{figure}[tb]
+\begin{listing}[tb]
{ \scriptsize
\begin{verbbox}
1 long __thread counter = 0;
@@ -1196,12 +1196,12 @@ variables vanish when that thread exits.
\centering
\theverbbox
\caption{Per-Thread Statistical Counters With Lockless Summation}
-\label{fig:count:Per-Thread Statistical Counters With Lockless Summation}
-\end{figure}
+\label{lst:count:Per-Thread Statistical Counters With Lockless Summation}
+\end{listing}
One workaround is to ensure that each thread continues to exist
until all threads are finished, as shown in
- Figure~\ref{fig:count:Per-Thread Statistical Counters With Lockless Summation}
+ Listing~\ref{lst:count:Per-Thread Statistical Counters With Lockless Summation}
(\path{count_tstat.c}).
Analysis of this code is left as an exercise to the reader,
however, please note that it does not fit well into the
@@ -1388,7 +1388,7 @@ Section~\ref{sec:SMPdesign:Parallel Fastpath}.
\subsection{Simple Limit Counter Implementation}
\label{sec:count:Simple Limit Counter Implementation}
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 unsigned long __thread counter = 0;
@@ -1403,8 +1403,8 @@ Section~\ref{sec:SMPdesign:Parallel Fastpath}.
\centering
\theverbbox
\caption{Simple Limit Counter Variables}
-\label{fig:count:Simple Limit Counter Variables}
-\end{figure}
+\label{lst:count:Simple Limit Counter Variables}
+\end{listing}
\begin{figure}[tb]
\centering
@@ -1413,7 +1413,7 @@ Section~\ref{sec:SMPdesign:Parallel Fastpath}.
\label{fig:count:Simple Limit Counter Variable Relationships}
\end{figure}
-Figure~\ref{fig:count:Simple Limit Counter Variables}
+Listing~\ref{lst:count:Simple Limit Counter Variables}
shows both the per-thread and global variables used by this
implementation.
The per-thread \co{counter} and \co{countermax} variables are the
@@ -1443,7 +1443,7 @@ spinlock guards all of the global variables, in other words, no thread
is permitted to access or modify any of the global variables unless it
has acquired \co{gblcnt_mutex}.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 int add_count(unsigned long delta)
@@ -1501,17 +1501,17 @@ has acquired \co{gblcnt_mutex}.
\centering
\theverbbox
\caption{Simple Limit Counter Add, Subtract, and Read}
-\label{fig:count:Simple Limit Counter Add, Subtract, and Read}
-\end{figure}
+\label{lst:count:Simple Limit Counter Add, Subtract, and Read}
+\end{listing}
-Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}
+Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}
shows the \co{add_count()}, \co{sub_count()}, and \co{read_count()}
functions (\path{count_lim.c}).
\QuickQuiz{}
Why does
- Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}
+ Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}
provide \co{add_count()} and \co{sub_count()} instead of the
\co{inc_count()} and \co{dec_count()} interfaces show in
Section~\ref{sec:count:Statistical Counters}?
@@ -1531,7 +1531,7 @@ references only per-thread variables, and should not incur any cache misses.
\QuickQuiz{}
What is with the strange form of the condition on line~3 of
- Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}?
+ Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}?
Why not the following more intuitive form of the fastpath?
\vspace{5pt}
@@ -1552,7 +1552,7 @@ references only per-thread variables, and should not incur any cache misses.
Try the above formulation with \co{counter} equal to 10 and
\co{delta} equal to \co{ULONG_MAX}.
Then try it again with the code shown in
- Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}.
+ Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}.
A good understanding of integer overflow will be required for
the rest of this example, so if you have never dealt with
@@ -1566,7 +1566,7 @@ If the test on line~3 fails, we must access global variables, and thus
must acquire \co{gblcnt_mutex} on line~7, which we release on line~11
in the failure case or on line~16 in the success case.
Line~8 invokes \co{globalize_count()}, shown in
-Figure~\ref{fig:count:Simple Limit Counter Utility Functions},
+Listing~\ref{lst:count:Simple Limit Counter Utility Functions},
which clears the thread-local variables, adjusting the global variables
as needed, thus simplifying global processing.
(But don't take \emph{my} word for it, try coding it yourself!)
@@ -1581,7 +1581,7 @@ returns indicating failure.
Otherwise, we take the slowpath.
Line~14 adds \co{delta} to \co{globalcount}, and then
line~15 invokes \co{balance_count()} (shown in
-Figure~\ref{fig:count:Simple Limit Counter Utility Functions})
+Listing~\ref{lst:count:Simple Limit Counter Utility Functions})
in order to update both the global and the per-thread variables.
This call to \co{balance_count()}
will usually set this thread's \co{countermax} to re-enable the fastpath.
@@ -1592,7 +1592,7 @@ line~17 returns indicating success.
\QuickQuiz{}
Why does \co{globalize_count()} zero the per-thread variables,
only to later call \co{balance_count()} to refill them in
- Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}?
+ Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}?
Why not just leave the per-thread variables non-zero?
\QuickQuizAnswer{
That is in fact what an earlier version of this code did.
@@ -1616,7 +1616,7 @@ Because the slowpath must access global state, line~26
acquires \co{gblcnt_mutex}, which is released either by line~29
(in case of failure) or by line~34 (in case of success).
Line~27 invokes \co{globalize_count()}, shown in
-Figure~\ref{fig:count:Simple Limit Counter Utility Functions},
+Listing~\ref{lst:count:Simple Limit Counter Utility Functions},
which again clears the thread-local variables, adjusting the global variables
as needed.
Line~28 checks to see if the counter can accommodate subtracting
@@ -1626,7 +1626,7 @@ Line~28 checks to see if the counter can accommodate subtracting
\QuickQuiz{}
Given that \co{globalreserve} counted against us in \co{add_count()},
why doesn't it count for us in \co{sub_count()} in
- Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}?
+ Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}?
\QuickQuizAnswer{
The \co{globalreserve} variable tracks the sum of all threads'
\co{countermax} variables.
@@ -1641,7 +1641,7 @@ Line~28 checks to see if the counter can accommodate subtracting
\QuickQuiz{}
Suppose that one thread invokes \co{add_count()} shown in
- Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read},
+ Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read},
and then another thread invokes \co{sub_count()}.
Won't \co{sub_count()} return failure even though the value of
the counter is non-zero?
@@ -1658,14 +1658,14 @@ If, on the other hand, line~28 finds that the counter \emph{can}
accommodate subtracting \co{delta}, we complete the slowpath.
Line~32 does the subtraction and then
line~33 invokes \co{balance_count()} (shown in
-Figure~\ref{fig:count:Simple Limit Counter Utility Functions})
+Listing~\ref{lst:count:Simple Limit Counter Utility Functions})
in order to update both global and per-thread variables
(hopefully re-enabling the fastpath).
Then line~34 releases \co{gblcnt_mutex}, and line~35 returns success.
\QuickQuiz{}
Why have both \co{add_count()} and \co{sub_count()} in
- Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}?
+ Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}?
Why not simply pass a negative number to \co{add_count()}?
\QuickQuizAnswer{
Given that \co{add_count()} takes an \co{unsigned} \co{long}
@@ -1686,7 +1686,7 @@ Line~44 initializes local variable \co{sum} to the value of
per-thread \co{counter} variables.
Line~49 then returns the sum.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 static void globalize_count(void)
@@ -1732,13 +1732,13 @@ Line~49 then returns the sum.
\centering
\theverbbox
\caption{Simple Limit Counter Utility Functions}
-\label{fig:count:Simple Limit Counter Utility Functions}
-\end{figure}
+\label{lst:count:Simple Limit Counter Utility Functions}
+\end{listing}
-Figure~\ref{fig:count:Simple Limit Counter Utility Functions}
+Listing~\ref{lst:count:Simple Limit Counter Utility Functions}
shows a number of utility functions used by the \co{add_count()},
\co{sub_count()}, and \co{read_count()} primitives shown in
-Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}.
+Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}.
Lines~1-7 show \co{globalize_count()}, which zeros the current thread's
per-thread counters, adjusting the global variables appropriately.
@@ -1782,7 +1782,7 @@ Finally, in either case, line~18 makes the corresponding adjustment to
\QuickQuiz{}
Why set \co{counter} to \co{countermax / 2} in line~15 of
- Figure~\ref{fig:count:Simple Limit Counter Utility Functions}?
+ Listing~\ref{lst:count:Simple Limit Counter Utility Functions}?
Wouldn't it be simpler to just take \co{countermax} counts?
\QuickQuizAnswer{
First, it really is reserving \co{countermax} counts
@@ -1902,7 +1902,7 @@ This task is undertaken in the next section.
\subsection{Approximate Limit Counter Implementation}
\label{sec:count:Approximate Limit Counter Implementation}
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 unsigned long __thread counter = 0;
@@ -1918,10 +1918,10 @@ This task is undertaken in the next section.
\centering
\theverbbox
\caption{Approximate Limit Counter Variables}
-\label{fig:count:Approximate Limit Counter Variables}
-\end{figure}
+\label{lst:count:Approximate Limit Counter Variables}
+\end{listing}
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 static void balance_count(void)
@@ -1942,25 +1942,25 @@ This task is undertaken in the next section.
\centering
\theverbbox
\caption{Approximate Limit Counter Balancing}
-\label{fig:count:Approximate Limit Counter Balancing}
-\end{figure}
+\label{lst:count:Approximate Limit Counter Balancing}
+\end{listing}
Because this implementation (\path{count_lim_app.c}) is quite similar to
that in the previous section
-(Figures~\ref{fig:count:Simple Limit Counter Variables},
-\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}, and
-\ref{fig:count:Simple Limit Counter Utility Functions}),
+(Listings~\ref{lst:count:Simple Limit Counter Variables},
+\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}, and
+\ref{lst:count:Simple Limit Counter Utility Functions}),
only the changes are shown here.
-Figure~\ref{fig:count:Approximate Limit Counter Variables}
+Listing~\ref{lst:count:Approximate Limit Counter Variables}
is identical to
-Figure~\ref{fig:count:Simple Limit Counter Variables},
+Listing~\ref{lst:count:Simple Limit Counter Variables},
with the addition of \co{MAX_COUNTERMAX}, which sets the maximum
permissible value of the per-thread \co{countermax} variable.
Similarly,
-Figure~\ref{fig:count:Approximate Limit Counter Balancing}
+Listing~\ref{lst:count:Approximate Limit Counter Balancing}
is identical to the \co{balance_count()} function in
-Figure~\ref{fig:count:Simple Limit Counter Utility Functions},
+Listing~\ref{lst:count:Simple Limit Counter Utility Functions},
with the addition of lines~6 and 7, which enforce the
\co{MAX_COUNTERMAX} limit on the per-thread \co{countermax} variable.
@@ -2038,7 +2038,7 @@ represent \co{counter} and the low-order 16 bits to represent
the reader.
} \QuickQuizEnd
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 atomic_t __thread ctrandmax = ATOMIC_INIT(0);
@@ -2079,12 +2079,12 @@ represent \co{counter} and the low-order 16 bits to represent
\centering
\theverbbox
\caption{Atomic Limit Counter Variables and Access Functions}
-\label{fig:count:Atomic Limit Counter Variables and Access Functions}
-\end{figure}
+\label{lst:count:Atomic Limit Counter Variables and Access Functions}
+\end{listing}
The variables and access functions for a simple atomic limit counter
are shown in
-Figure~\ref{fig:count:Atomic Limit Counter Variables and Access Functions}
+Listing~\ref{lst:count:Atomic Limit Counter Variables and Access Functions}
(\path{count_lim_atomic.c}).
The \co{counter} and \co{countermax} variables in earlier algorithms
are combined into the single variable \co{ctrandmax} shown on
@@ -2096,7 +2096,7 @@ representation of \co{int}.
Lines~2-6 show the definitions for \co{globalcountmax}, \co{globalcount},
\co{globalreserve}, \co{counterp}, and \co{gblcnt_mutex}, all of which
take on roles similar to their counterparts in
-Figure~\ref{fig:count:Approximate Limit Counter Variables}.
+Listing~\ref{lst:count:Approximate Limit Counter Variables}.
Line~7 defines \co{CM_BITS}, which gives the number of bits in each half
of \co{ctrandmax}, and line~8 defines \co{MAX_COUNTERMAX}, which
gives the maximum value that may be held in either half of
@@ -2104,7 +2104,7 @@ gives the maximum value that may be held in either half of
\QuickQuiz{}
In what way does line~7 of
- Figure~\ref{fig:count:Atomic Limit Counter Variables and Access Functions}
+ Listing~\ref{lst:count:Atomic Limit Counter Variables and Access Functions}
violate the C standard?
\QuickQuizAnswer{
It assumes eight bits per byte.
@@ -2134,7 +2134,7 @@ it on line~24.
\QuickQuiz{}
Given that there is only one \co{ctrandmax} variable,
why bother passing in a pointer to it on line~18 of
- Figure~\ref{fig:count:Atomic Limit Counter Variables and Access Functions}?
+ Listing~\ref{lst:count:Atomic Limit Counter Variables and Access Functions}?
\QuickQuizAnswer{
There is only one \co{ctrandmax} variable \emph{per thread}.
Later, we will see code that needs to pass other threads'
@@ -2149,7 +2149,7 @@ the result.
\QuickQuiz{}
Why does \co{merge_ctrandmax()} in
- Figure~\ref{fig:count:Atomic Limit Counter Variables and Access Functions}
+ Listing~\ref{lst:count:Atomic Limit Counter Variables and Access Functions}
return an \co{int} rather than storing directly into an
\co{atomic_t}?
\QuickQuizAnswer{
@@ -2157,7 +2157,7 @@ the result.
to the \co{atomic_cmpxchg()} primitive.
} \QuickQuizEnd
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 int add_count(unsigned long delta)
@@ -2228,10 +2228,10 @@ the result.
\centering
\theverbbox
\caption{Atomic Limit Counter Add and Subtract}
-\label{fig:count:Atomic Limit Counter Add and Subtract}
-\end{figure}
+\label{lst:count:Atomic Limit Counter Add and Subtract}
+\end{listing}
-Figure~\ref{fig:count:Atomic Limit Counter Add and Subtract}
+Listing~\ref{lst:count:Atomic Limit Counter Add and Subtract}
shows the \co{add_count()} and \co{sub_count()} functions.
Lines~1-32 show \co{add_count()}, whose fastpath spans lines~8-15,
@@ -2256,7 +2256,7 @@ execution continues in the loop at line~9.
\QuickQuiz{}
Yecch!
Why the ugly \co{goto} on line~11 of
- Figure~\ref{fig:count:Atomic Limit Counter Add and Subtract}?
+ Listing~\ref{lst:count:Atomic Limit Counter Add and Subtract}?
Haven't you heard of the \co{break} statement???
\QuickQuizAnswer{
Replacing the \co{goto} with a \co{break} would require keeping
@@ -2271,20 +2271,20 @@ execution continues in the loop at line~9.
\QuickQuiz{}
Why would the \co{atomic_cmpxchg()} primitive at lines~13-14 of
- Figure~\ref{fig:count:Atomic Limit Counter Add and Subtract}
+ Listing~\ref{lst:count:Atomic Limit Counter Add and Subtract}
ever fail?
After all, we picked up its old value on line~9 and have not
changed it!
\QuickQuizAnswer{
Later, we will see how the \co{flush_local_count()} function in
- Figure~\ref{fig:count:Atomic Limit Counter Utility Functions 1}
+ Listing~\ref{lst:count:Atomic Limit Counter Utility Functions 1}
might update this thread's \co{ctrandmax} variable concurrently
with the execution of the fastpath on lines~8-14 of
- Figure~\ref{fig:count:Atomic Limit Counter Add and Subtract}.
+ Listing~\ref{lst:count:Atomic Limit Counter Add and Subtract}.
} \QuickQuizEnd
Lines~16-31 of
-Figure~\ref{fig:count:Atomic Limit Counter Add and Subtract}
+Listing~\ref{lst:count:Atomic Limit Counter Add and Subtract}
show \co{add_count()}'s slowpath, which is protected by \co{gblcnt_mutex},
which is acquired on line~17 and released on lines~24 and 30.
Line~18 invokes \co{globalize_count()}, which moves this thread's
@@ -2304,14 +2304,14 @@ spreads counts to the local state if appropriate, line~30 releases
returns success.
Lines~34-63 of
-Figure~\ref{fig:count:Atomic Limit Counter Add and Subtract}
+Listing~\ref{lst:count:Atomic Limit Counter Add and Subtract}
show \co{sub_count()}, which is structured similarly to
\co{add_count()}, having a fastpath on lines~41-48 and a slowpath on
lines~49-62.
A line-by-line analysis of this function is left as an exercise to
the reader.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 unsigned long read_count(void)
@@ -2337,10 +2337,10 @@ the reader.
\centering
\theverbbox
\caption{Atomic Limit Counter Read}
-\label{fig:count:Atomic Limit Counter Read}
-\end{figure}
+\label{lst:count:Atomic Limit Counter Read}
+\end{listing}
-Figure~\ref{fig:count:Atomic Limit Counter Read} shows \co{read_count()}.
+Listing~\ref{lst:count:Atomic Limit Counter Read} shows \co{read_count()}.
Line~9 acquires \co{gblcnt_mutex} and line~16 releases it.
Line~10 initializes local variable \co{sum} to the value of
\co{globalcount}, and the loop spanning lines~11-15 adds the
@@ -2348,7 +2348,7 @@ per-thread counters to this sum, isolating each per-thread counter
using \co{split_ctrandmax} on line~13.
Finally, line~17 returns the sum.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 static void globalize_count(void)
@@ -2388,10 +2388,10 @@ Finally, line~17 returns the sum.
\centering
\theverbbox
\caption{Atomic Limit Counter Utility Functions 1}
-\label{fig:count:Atomic Limit Counter Utility Functions 1}
-\end{figure}
+\label{lst:count:Atomic Limit Counter Utility Functions 1}
+\end{listing}
-\begin{figure}[tb]
+\begin{listing}[tb]
{ \scriptsize
\begin{verbbox}
1 static void balance_count(void)
@@ -2440,11 +2440,11 @@ Finally, line~17 returns the sum.
\centering
\theverbbox
\caption{Atomic Limit Counter Utility Functions 2}
-\label{fig:count:Atomic Limit Counter Utility Functions 2}
-\end{figure}
+\label{lst:count:Atomic Limit Counter Utility Functions 2}
+\end{listing}
-Figures~\ref{fig:count:Atomic Limit Counter Utility Functions 1}
-and~\ref{fig:count:Atomic Limit Counter Utility Functions 2}
+Listings~\ref{lst:count:Atomic Limit Counter Utility Functions 1}
+and~\ref{lst:count:Atomic Limit Counter Utility Functions 2}
shows the utility functions
\co{globalize_count()},
\co{flush_local_count()},
@@ -2452,7 +2452,7 @@ shows the utility functions
\co{count_register_thread()}, and
\co{count_unregister_thread()}.
The code for \co{globalize_count()} is shown on lines~1-12,
-of Figure~\ref{fig:count:Atomic Limit Counter Utility Functions 1} and
+of Listing~\ref{lst:count:Atomic Limit Counter Utility Functions 1} and
is similar to that of previous algorithms, with the addition of
line~7, which is now required to split out \co{counter} and
\co{countermax} from \co{ctrandmax}.
@@ -2477,7 +2477,7 @@ line~30 subtracts this thread's \co{countermax} from \co{globalreserve}.
What stops a thread from simply refilling its
\co{ctrandmax} variable immediately after
\co{flush_local_count()} on line~14 of
- Figure~\ref{fig:count:Atomic Limit Counter Utility Functions 1}
+ Listing~\ref{lst:count:Atomic Limit Counter Utility Functions 1}
empties it?
\QuickQuizAnswer{
This other thread cannot refill its \co{ctrandmax}
@@ -2494,7 +2494,7 @@ line~30 subtracts this thread's \co{countermax} from \co{globalreserve}.
\co{add_count()} or \co{sub_count()} from interfering with
the \co{ctrandmax} variable while
\co{flush_local_count()} is accessing it on line~27 of
- Figure~\ref{fig:count:Atomic Limit Counter Utility Functions 1}
+ Listing~\ref{lst:count:Atomic Limit Counter Utility Functions 1}
empties it?
\QuickQuizAnswer{
Nothing.
@@ -2520,7 +2520,7 @@ line~30 subtracts this thread's \co{countermax} from \co{globalreserve}.
} \QuickQuizEnd
Lines~1-22 on
-Figure~\ref{fig:count:Atomic Limit Counter Utility Functions 2}
+Listing~\ref{lst:count:Atomic Limit Counter Utility Functions 2}
show the code for \co{balance_count()}, which refills
the calling thread's local \co{ctrandmax} variable.
This function is quite similar to that of the preceding algorithms,
@@ -2534,7 +2534,7 @@ line~33.
Given that the \co{atomic_set()} primitive does a simple
store to the specified \co{atomic_t}, how can line~21 of
\co{balance_count()} in
- Figure~\ref{fig:count:Atomic Limit Counter Utility Functions 2}
+ Listing~\ref{lst:count:Atomic Limit Counter Utility Functions 2}
work correctly in face of concurrent \co{flush_local_count()}
updates to this variable?
\QuickQuizAnswer{
@@ -2668,7 +2668,7 @@ The slowpath then sets that thread's \co{theft} state to IDLE.
\subsection{Signal-Theft Limit Counter Implementation}
\label{sec:count:Signal-Theft Limit Counter Implementation}
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 #define THEFT_IDLE 0
@@ -2693,10 +2693,10 @@ The slowpath then sets that thread's \co{theft} state to IDLE.
\centering
\theverbbox
\caption{Signal-Theft Limit Counter Data}
-\label{fig:count:Signal-Theft Limit Counter Data}
-\end{figure}
+\label{lst:count:Signal-Theft Limit Counter Data}
+\end{listing}
-Figure~\ref{fig:count:Signal-Theft Limit Counter Data}
+Listing~\ref{lst:count:Signal-Theft Limit Counter Data}
(\path{count_lim_sig.c})
shows the data structures used by the signal-theft based counter
implementation.
@@ -2706,7 +2706,7 @@ Lines~8-17 are similar to earlier implementations, with the addition of
lines~14 and 15 to allow remote access to a thread's \co{countermax}
and \co{theft} variables, respectively.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 static void globalize_count(void)
@@ -2777,10 +2777,10 @@ and \co{theft} variables, respectively.
\centering
\theverbbox
\caption{Signal-Theft Limit Counter Value-Migration Functions}
-\label{fig:count:Signal-Theft Limit Counter Value-Migration Functions}
-\end{figure}
+\label{lst:count:Signal-Theft Limit Counter Value-Migration Functions}
+\end{listing}
-Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions}
+Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions}
shows the functions responsible for migrating counts between per-thread
variables and the global variables.
Lines~1-7 shows \co{globalize_count()}, which is identical to earlier
@@ -2796,7 +2796,7 @@ this thread's fastpaths are not running, line~16 sets the \co{theft}
state to READY.
\QuickQuiz{}
- In Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions}
+ In Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions}
function \co{flush_local_count_sig()}, why are there
\co{ACCESS_ONCE()} wrappers around the uses of the
\co{theft} per-thread variable?
@@ -2835,7 +2835,7 @@ Otherwise, line~32 sets the thread's \co{theft} state to REQ and
line~33 sends the thread a signal.
\QuickQuiz{}
- In Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions},
+ In Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions},
why is it safe for line~28 to directly access the other thread's
\co{countermax} variable?
\QuickQuizAnswer{
@@ -2849,7 +2849,7 @@ line~33 sends the thread a signal.
} \QuickQuizEnd
\QuickQuiz{}
- In Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions},
+ In Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions},
why doesn't line~33 check for the current thread sending itself
a signal?
\QuickQuizAnswer{
@@ -2861,7 +2861,7 @@ line~33 sends the thread a signal.
\QuickQuiz{}
The code in
- Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions},
+ Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions},
works with \GCC\ and POSIX.
What would be required to make it also conform to the ISO C standard?
\QuickQuizAnswer{
@@ -2882,7 +2882,7 @@ READY, so lines~43-46 do the thieving.
Line~47 then sets the thread's \co{theft} state back to IDLE.
\QuickQuiz{}
- In Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions}, why does line~41 resend the signal?
+ In Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions}, why does line~41 resend the signal?
\QuickQuizAnswer{
Because many operating systems over several decades have
had the property of losing the occasional signal.
@@ -2897,7 +2897,7 @@ Line~47 then sets the thread's \co{theft} state back to IDLE.
Lines~51-63 show \co{balance_count()}, which is similar to that of
earlier examples.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 int add_count(unsigned long delta)
@@ -2941,10 +2941,10 @@ earlier examples.
\centering
\theverbbox
\caption{Signal-Theft Limit Counter Add Function}
-\label{fig:count:Signal-Theft Limit Counter Add Function}
-\end{figure}
+\label{lst:count:Signal-Theft Limit Counter Add Function}
+\end{listing}
-\begin{figure}[tb]
+\begin{listing}[tb]
{ \scriptsize
\begin{verbbox}
38 int sub_count(unsigned long delta)
@@ -2986,10 +2986,10 @@ earlier examples.
\centering
\theverbbox
\caption{Signal-Theft Limit Counter Subtract Function}
-\label{fig:count:Signal-Theft Limit Counter Subtract Function}
-\end{figure}
+\label{lst:count:Signal-Theft Limit Counter Subtract Function}
+\end{listing}
-Figure~\ref{fig:count:Signal-Theft Limit Counter Add Function}
+Listing~\ref{lst:count:Signal-Theft Limit Counter Add Function}
shows the \co{add_count()} function.
The fastpath spans lines~5-20, and the slowpath lines~21-35.
Line~5 sets the per-thread \co{counting} variable to 1 so that
@@ -3014,7 +3014,7 @@ READY also sees the effects of line~9.
If the fastpath addition at line~9 was executed, then line~20 returns
success.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 unsigned long read_count(void)
@@ -3035,21 +3035,21 @@ success.
\centering
\theverbbox
\caption{Signal-Theft Limit Counter Read Function}
-\label{fig:count:Signal-Theft Limit Counter Read Function}
-\end{figure}
+\label{lst:count:Signal-Theft Limit Counter Read Function}
+\end{listing}
Otherwise, we fall through to the slowpath starting at line~21.
The structure of the slowpath is similar to those of earlier examples,
so its analysis is left as an exercise to the reader.
Similarly, the structure of \co{sub_count()} on
-Figure~\ref{fig:count:Signal-Theft Limit Counter Subtract Function}
+Listing~\ref{lst:count:Signal-Theft Limit Counter Subtract Function}
is the same
as that of \co{add_count()}, so the analysis of \co{sub_count()} is also
left as an exercise for the reader, as is the analysis of
\co{read_count()} in
-Figure~\ref{fig:count:Signal-Theft Limit Counter Read Function}.
+Listing~\ref{lst:count:Signal-Theft Limit Counter Read Function}.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 void count_init(void)
@@ -3092,11 +3092,11 @@ Figure~\ref{fig:count:Signal-Theft Limit Counter Read Function}.
\centering
\theverbbox
\caption{Signal-Theft Limit Counter Initialization Functions}
-\label{fig:count:Signal-Theft Limit Counter Initialization Functions}
-\end{figure}
+\label{lst:count:Signal-Theft Limit Counter Initialization Functions}
+\end{listing}
Lines~1-12 of
-Figure~\ref{fig:count:Signal-Theft Limit Counter Initialization Functions}
+Listing~\ref{lst:count:Signal-Theft Limit Counter Initialization Functions}
show \co{count_init()}, which set up \co{flush_local_count_sig()}
as the signal handler for \co{SIGUSR1},
enabling the \co{pthread_kill()} calls in \co{flush_local_count()}
@@ -3414,7 +3414,7 @@ courtesy of eventual consistency.
\label{tab:count:Limit Counter Performance on Power-6}
\end{table*}
-Figure~\ref{tab:count:Limit Counter Performance on Power-6}
+Table~\ref{tab:count:Limit Counter Performance on Power-6}
shows the performance of the parallel limit-counting algorithms.
Exact enforcement of the limits incurs a substantial performance
penalty, although on this 4.7\,GHz \Power{6} system that penalty can be reduced
diff --git a/defer/rcuexercises.tex b/defer/rcuexercises.tex
index 1d9e138..21d9683 100644
--- a/defer/rcuexercises.tex
+++ b/defer/rcuexercises.tex
@@ -14,7 +14,7 @@ suffice for most of these exercises.
\QuickQuiz{}
The statistical-counter implementation shown in
- Figure~\ref{fig:count:Per-Thread Statistical Counters}
+ Listing~\ref{lst:count:Per-Thread Statistical Counters}
(\path{count_end.c})
used a global lock to guard the summation in \co{read_count()},
which resulted in poor performance and negative scalability.
diff --git a/howto/howto.tex b/howto/howto.tex
index 90abb14..1a0b75c 100644
--- a/howto/howto.tex
+++ b/howto/howto.tex
@@ -361,7 +361,7 @@ Other types of systems have well-known ways of locating files by filename.
\section{Whose Book Is This?}
\label{sec:howto:Whose Book Is This?}
-\begin{figure*}[tbp]
+\begin{listing*}[tbp]
{
\scriptsize
\begin{verbbox}
@@ -376,10 +376,10 @@ Other types of systems have well-known ways of locating files by filename.
}
\hspace*{1in}\OneColumnHSpace{-0.5in}\theverbbox
\caption{Creating an Up-To-Date PDF}
-\label{fig:howto:Creating a Up-To-Date PDF}
-\end{figure*}
+\label{lst:howto:Creating a Up-To-Date PDF}
+\end{listing*}
-\begin{figure*}[tbp]
+\begin{listing*}[tbp]
{
\scriptsize
\begin{verbbox}
@@ -393,8 +393,8 @@ Other types of systems have well-known ways of locating files by filename.
}
\hspace*{1in}\OneColumnHSpace{-0.5in}\theverbbox
\caption{Generating an Updated PDF}
-\label{fig:howto:Generating an Updated PDF}
-\end{figure*}
+\label{lst:howto:Generating an Updated PDF}
+\end{listing*}
As the cover says, the editor is one Paul E.~McKenney.
However, the editor does accept contributions via the
@@ -416,18 +416,18 @@ in the file \path{FAQ-BUILD.txt} in the \LaTeX{} source to the book.
To create and display a current \LaTeX{} source tree of this book,
use the list of Linux commands shown in
-Figure~\ref{fig:howto:Creating a Up-To-Date PDF}.
+Listing~\ref{lst:howto:Creating a Up-To-Date PDF}.
In some environments, the \co{evince} command that displays \path{perfbook.pdf}
may need to be replaced, for example, with \co{acroread}.
The \co{git clone} command need only be used the first time you
create a PDF, subsequently, you can run the commands shown in
-Figure~\ref{fig:howto:Generating an Updated PDF} to pull in any updates
+Listing~\ref{lst:howto:Generating an Updated PDF} to pull in any updates
and generate an updated PDF.
The commands in
-Figure~\ref{fig:howto:Generating an Updated PDF}
+Listing~\ref{lst:howto:Generating an Updated PDF}
must be run within the \path{perfbook} directory created by the commands
shown in
-Figure~\ref{fig:howto:Creating a Up-To-Date PDF}.
+Listing~\ref{lst:howto:Creating a Up-To-Date PDF}.
PDFs of this book are sporadically posted at
\url{http://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html}
diff --git a/owned/owned.tex b/owned/owned.tex
index f74b5d6..d584e82 100644
--- a/owned/owned.tex
+++ b/owned/owned.tex
@@ -132,8 +132,8 @@ is obviously quite attractive.
} \QuickQuizEnd
This same pattern can be written in C as well as in \co{sh}, as illustrated by
-Figures~\ref{fig:toolsoftrade:Using the fork() Primitive}
-and~\ref{fig:toolsoftrade:Using the wait() Primitive}.
+Listings~\ref{lst:toolsoftrade:Using the fork() Primitive}
+and~\ref{lst:toolsoftrade:Using the wait() Primitive}.
The next section discusses use of data ownership in shared-memory
parallel programs.
@@ -150,8 +150,8 @@ of ownership and access rights.
For example, consider the per-thread statistical counter implementation
shown in
-Figure~\ref{fig:count:Per-Thread Statistical Counters} on
-page~\pageref{fig:count:Per-Thread Statistical Counters}.
+Listing~\ref{lst:count:Per-Thread Statistical Counters} on
+page~\pageref{lst:count:Per-Thread Statistical Counters}.
Here, \co{inc_count()} updates only the corresponding thread's
instance of \co{counter},
while \co{read_count()} accesses, but does not modify, all
@@ -195,9 +195,9 @@ beginning on
page~\pageref{sec:count:Signal-Theft Limit Counter Design},
in particular the \co{flush_local_count_sig()} and
\co{flush_local_count()} functions in
-Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions}
+Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions}
on
-page~\pageref{fig:count:Signal-Theft Limit Counter Value-Migration Functions}.
+page~\pageref{lst:count:Signal-Theft Limit Counter Value-Migration Functions}.
The \co{flush_local_count_sig()} function is a signal handler that
acts as the shipped function.
@@ -207,12 +207,12 @@ the shipped function executes.
This shipped function has the not-unusual added complication of
needing to interact with any concurrently executing \co{add_count()}
or \co{sub_count()} functions (see
-Figure~\ref{fig:count:Signal-Theft Limit Counter Add Function}
+Listing~\ref{lst:count:Signal-Theft Limit Counter Add Function}
on
-page~\pageref{fig:count:Signal-Theft Limit Counter Add Function} and
-Figure~\ref{fig:count:Signal-Theft Limit Counter Subtract Function}
+page~\pageref{lst:count:Signal-Theft Limit Counter Add Function} and
+Listing~\ref{lst:count:Signal-Theft Limit Counter Subtract Function}
on
-page~\pageref{fig:count:Signal-Theft Limit Counter Subtract Function}).
+page~\pageref{lst:count:Signal-Theft Limit Counter Subtract Function}).
\QuickQuiz{}
What mechanisms other than POSIX signals may be used for function
@@ -246,7 +246,7 @@ The eventually consistent counter implementation described in
Section~\ref{sec:count:Eventually Consistent Implementation} provides an example.
This implementation has a designated thread that runs the
\co{eventual()} function shown on lines~15-32 of
-Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}.
+Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}.
This \co{eventual()} thread periodically pulls the per-thread counts
into the global counter, so that accesses to the global counter will,
as the name says, eventually converge on the actual value.
@@ -254,7 +254,7 @@ as the name says, eventually converge on the actual value.
\QuickQuiz{}
But none of the data in the \co{eventual()} function shown on
lines~15-32 of
- Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
+ Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
is actually owned by the \co{eventual()} thread!
In just what way is this data ownership???
\QuickQuizAnswer{
@@ -298,8 +298,8 @@ a considerable reduction in the spread of certain types of disease.
In other cases, privatization imposes costs.
For example, consider the simple limit counter shown in
-Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read} on
-page~\pageref{fig:count:Simple Limit Counter Add, Subtract, and Read}.
+Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read} on
+page~\pageref{lst:count:Simple Limit Counter Add, Subtract, and Read}.
This is an example of an algorithm where threads can read each others'
data, but are only permitted to update their own data.
A quick review of the algorithm shows that the only cross-thread
diff --git a/together/applyrcu.tex b/together/applyrcu.tex
index e3d6f0d..d477ab5 100644
--- a/together/applyrcu.tex
+++ b/together/applyrcu.tex
@@ -21,8 +21,8 @@ Unfortunately, threads needing to read out the value via \co{read_count()}
were required to acquire a global
lock, and thus incurred high overhead and suffered poor scalability.
The code for the lock-based implementation is shown in
-Figure~\ref{fig:count:Per-Thread Statistical Counters} on
-Page~\pageref{fig:count:Per-Thread Statistical Counters}.
+Listing~\ref{lst:count:Per-Thread Statistical Counters} on
+Page~\pageref{lst:count:Per-Thread Statistical Counters}.
\QuickQuiz{}
Why on earth did we need that global lock in the first place?
@@ -56,8 +56,8 @@ execution.
Just what is the accuracy of \co{read_count()}, anyway?
\QuickQuizAnswer{
Refer to
- Figure~\ref{fig:count:Per-Thread Statistical Counters} on
- Page~\pageref{fig:count:Per-Thread Statistical Counters}.
+ Listing~\ref{lst:count:Per-Thread Statistical Counters} on
+ Page~\pageref{lst:count:Per-Thread Statistical Counters}.
Clearly, if there are no concurrent invocations of \co{inc_count()},
\co{read_count()} will return an exact result.
However, if there \emph{are} concurrent invocations of
@@ -205,7 +205,7 @@ the current \co{countarray} structure, and
the \co{final_mutex} spinlock.
Lines~10-13 show \co{inc_count()}, which is unchanged from
-Figure~\ref{fig:count:Per-Thread Statistical Counters}.
+Listing~\ref{lst:count:Per-Thread Statistical Counters}.
Lines~15-29 show \co{read_count()}, which has changed significantly.
Lines~21 and~27 substitute \co{rcu_read_lock()} and
@@ -280,13 +280,13 @@ Line~69 can then safely free the old \co{countarray} structure.
Wow!
Figure~\ref{fig:together:RCU and Per-Thread Statistical Counters}
contains 69 lines of code, compared to only 42 in
- Figure~\ref{fig:count:Per-Thread Statistical Counters}.
+ Listing~\ref{lst:count:Per-Thread Statistical Counters}.
Is this extra complexity really worth it?
\QuickQuizAnswer{
This of course needs to be decided on a case-by-case basis.
If you need an implementation of \co{read_count()} that
scales linearly, then the lock-based implementation shown in
- Figure~\ref{fig:count:Per-Thread Statistical Counters}
+ Listing~\ref{lst:count:Per-Thread Statistical Counters}
simply will not work for you.
On the other hand, if calls to \co{count_read()} are sufficiently
rare, then the lock-based version is simpler and might thus be
@@ -300,7 +300,7 @@ Line~69 can then safely free the old \co{countarray} structure.
and the use of RCU unnecessary.
This would in turn enable an implementation that
was even simpler than the one shown in
- Figure~\ref{fig:count:Per-Thread Statistical Counters}, but
+ Listing~\ref{lst:count:Per-Thread Statistical Counters}, but
with all the scalability and performance benefits of the
implementation shown in
Figure~\ref{fig:together:RCU and Per-Thread Statistical Counters}!
diff --git a/toolsoftrade/toolsoftrade.tex b/toolsoftrade/toolsoftrade.tex
index bd43879..b18d3d0 100644
--- a/toolsoftrade/toolsoftrade.tex
+++ b/toolsoftrade/toolsoftrade.tex
@@ -189,7 +189,7 @@ These concerns can of course add substantial complexity to the code.
For more information, see any of a number of textbooks on the
subject~\cite{WRichardStevens1992,StewartWeiss2013UNIX}.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 pid = fork();
@@ -207,14 +207,14 @@ subject~\cite{WRichardStevens1992,StewartWeiss2013UNIX}.
\centering
\theverbbox
\caption{Using the \tco{fork()} Primitive}
-\label{fig:toolsoftrade:Using the fork() Primitive}
-\end{figure}
+\label{lst:toolsoftrade:Using the fork() Primitive}
+\end{listing}
If \co{fork()} succeeds, it returns twice, once for the parent
and again for the child.
The value returned from \co{fork()} allows the caller to tell
the difference, as shown in
-Figure~\ref{fig:toolsoftrade:Using the fork() Primitive}
+Listing~\ref{lst:toolsoftrade:Using the fork() Primitive}
(\path{forkjoin.c}).
Line~1 executes the \co{fork()} primitive, and saves its return value
in local variable \co{pid}.
@@ -228,7 +228,7 @@ Otherwise, the \co{fork()} has executed successfully, and the parent
therefore executes line~9 with the variable \co{pid} containing the
process ID of the child.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 void waitall(void)
@@ -251,8 +251,8 @@ process ID of the child.
\centering
\theverbbox
\caption{Using the \tco{wait()} Primitive}
-\label{fig:toolsoftrade:Using the wait() Primitive}
-\end{figure}
+\label{lst:toolsoftrade:Using the wait() Primitive}
+\end{listing}
The parent process may use the \co{wait()} primitive to wait for its children
to complete.
@@ -261,7 +261,7 @@ counterpart, as each invocation of \co{wait()} waits for but one child
process.
It is therefore customary to wrap \co{wait()} into a function similar
to the \co{waitall()} function shown in
-Figure~\ref{fig:toolsoftrade:Using the wait() Primitive}
+Listing~\ref{lst:toolsoftrade:Using the wait() Primitive}
(\path{api-pthread.h}),
with this \co{waitall()} function having semantics similar to the
shell-script \co{wait} command.
@@ -283,14 +283,14 @@ Otherwise, lines~11 and 12 print an error and exit.
child individually.
In addition, some parallel applications need to detect the
reason that the child died.
- As we saw in Figure~\ref{fig:toolsoftrade:Using the wait() Primitive},
+ As we saw in Listing~\ref{lst:toolsoftrade:Using the wait() Primitive},
it is not hard to build a \co{waitall()} function out of
the \co{wait()} function, but it would be impossible to
do the reverse.
Once the information about a specific child is lost, it is lost.
} \QuickQuizEnd
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 int x = 0;
@@ -313,13 +313,13 @@ Otherwise, lines~11 and 12 print an error and exit.
\centering
\theverbbox
\caption{Processes Created Via \tco{fork()} Do Not Share Memory}
-\label{fig:toolsoftrade:Processes Created Via fork() Do Not Share Memory}
-\end{figure}
+\label{lst:toolsoftrade:Processes Created Via fork() Do Not Share Memory}
+\end{listing}
It is critically important to note that the parent and child do \emph{not}
share memory.
This is illustrated by the program shown in
-Figure~\ref{fig:toolsoftrade:Processes Created Via fork() Do Not Share Memory}
+Listing~\ref{lst:toolsoftrade:Processes Created Via fork() Do Not Share Memory}
(\path{forkjoinvar.c}),
in which the child sets a global variable \co{x} to 1 on line~6,
prints a message on line~7, and exits on line~8.
@@ -353,7 +353,7 @@ Parent process sees x=0
source code of the Linux-kernel implementations themselves.
It is important to note that the parent process in
- Figure~\ref{fig:toolsoftrade:Processes Created Via fork() Do Not Share Memory}
+ Listing~\ref{lst:toolsoftrade:Processes Created Via fork() Do Not Share Memory}
waits until after the child terminates to do its \co{printf()}.
Using \co{printf()}'s buffered I/O concurrently to the same file
from multiple processes is non-trivial, and is best avoided.
@@ -376,7 +376,7 @@ than fork-join parallelism.
To create a thread within an existing process, invoke the
\co{pthread_create()} primitive, for example, as shown on lines~15
and~16 of
-Figure~\ref{fig:toolsoftrade:Threads Created Via pthread-create() Share Memory}
+Listing~\ref{lst:toolsoftrade:Threads Created Via pthread-create() Share Memory}
(\path{pcreate.c}).
The first argument is a pointer to a \co{pthread_t} in which to store the
ID of the thread to be created, the second \co{NULL} argument is a pointer
@@ -385,7 +385,7 @@ to an optional \co{pthread_attr_t}, the third argument is the function
that is to be invoked by the new thread, and the last \co{NULL} argument
is the argument that will be passed to \co{mythread}.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 int x = 0;
@@ -419,15 +419,15 @@ is the argument that will be passed to \co{mythread}.
\centering
\theverbbox
\caption{Threads Created Via \tco{pthread_create()} Share Memory}
-\label{fig:toolsoftrade:Threads Created Via pthread-create() Share Memory}
-\end{figure}
+\label{lst:toolsoftrade:Threads Created Via pthread-create() Share Memory}
+\end{listing}
In this example, \co{mythread()} simply returns, but it could instead
call \co{pthread_exit()}.
\QuickQuiz{}
If the \co{mythread()} function in
- Figure~\ref{fig:toolsoftrade:Threads Created Via pthread-create() Share Memory}
+ Listing~\ref{lst:toolsoftrade:Threads Created Via pthread-create() Share Memory}
can simply return, why bother with \co{pthread_exit()}?
\QuickQuizAnswer{
In this simple example, there is no reason whatsoever.
@@ -450,7 +450,7 @@ or the value returned by the thread's top-level function, depending on
how the thread in question exits.
The program shown in
-Figure~\ref{fig:toolsoftrade:Threads Created Via pthread-create() Share Memory}
+Listing~\ref{lst:toolsoftrade:Threads Created Via pthread-create() Share Memory}
produces output as follows, demonstrating that memory is in fact
shared between the two threads:
@@ -541,7 +541,7 @@ lock~\cite{Hoare74}.
to the reader.
} \QuickQuizEnd
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
@@ -598,11 +598,11 @@ lock~\cite{Hoare74}.
\centering
\theverbbox
\caption{Demonstration of Exclusive Locks}
-\label{fig:toolsoftrade:Demonstration of Exclusive Locks}
-\end{figure}
+\label{lst:toolsoftrade:Demonstration of Exclusive Locks}
+\end{listing}
This exclusive-locking property is demonstrated using the code shown in
-Figure~\ref{fig:toolsoftrade:Demonstration of Exclusive Locks}
+Listing~\ref{lst:toolsoftrade:Demonstration of Exclusive Locks}
(\path{lock.c}).
Line~1 defines and initializes a POSIX lock named \co{lock_a}, while
line~2 similarly defines and initializes a lock named \co{lock_b}.
@@ -618,7 +618,7 @@ primitives.
\QuickQuiz{}
Why not simply make the argument to \co{lock_reader()}
on line~5 of
- Figure~\ref{fig:toolsoftrade:Demonstration of Exclusive Locks}
+ Listing~\ref{lst:toolsoftrade:Demonstration of Exclusive Locks}
be a pointer to a \co{pthread_mutex_t}?
\QuickQuizAnswer{
Because we will need to pass \co{lock_reader()} to
@@ -653,7 +653,7 @@ required by \co{pthread_create()}.
} \QuickQuizEnd
Lines~31-49 of
-Figure~\ref{fig:toolsoftrade:Demonstration of Exclusive Locks}
+Listing~\ref{lst:toolsoftrade:Demonstration of Exclusive Locks}
shows \co{lock_writer()}, which
periodically update the shared variable \co{x} while holding the
specified \co{pthread_mutex_t}.
@@ -664,7 +664,7 @@ While holding the lock, lines~40-43 increment the shared variable \co{x},
sleeping for five milliseconds between each increment.
Finally, lines~44-47 release the lock.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 printf("Creating two threads using same lock:\n");
@@ -691,10 +691,10 @@ Finally, lines~44-47 release the lock.
\centering
\theverbbox
\caption{Demonstration of Same Exclusive Lock}
-\label{fig:toolsoftrade:Demonstration of Same Exclusive Lock}
-\end{figure}
+\label{lst:toolsoftrade:Demonstration of Same Exclusive Lock}
+\end{listing}
-Figure~\ref{fig:toolsoftrade:Demonstration of Same Exclusive Lock}
+Listing~\ref{lst:toolsoftrade:Demonstration of Same Exclusive Lock}
shows a code fragment that runs \co{lock_reader()} and
\co{lock_writer()} as threads using the same lock, namely, \co{lock_a}.
Lines~2-6 create a thread running \co{lock_reader()}, and then
@@ -719,7 +719,7 @@ by \co{lock_writer()} while holding the lock.
\QuickQuiz{}
Is ``x = 0'' the only possible output from the code fragment
shown in
- Figure~\ref{fig:toolsoftrade:Demonstration of Same Exclusive Lock}?
+ Listing~\ref{lst:toolsoftrade:Demonstration of Same Exclusive Lock}?
If so, why?
If not, what other output could appear, and why?
\QuickQuizAnswer{
@@ -735,7 +735,7 @@ by \co{lock_writer()} while holding the lock.
However, there are no guarantees, especially on a busy system.
} \QuickQuizEnd
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 printf("Creating two threads w/different locks:\n");
@@ -763,10 +763,10 @@ by \co{lock_writer()} while holding the lock.
\centering
\theverbbox
\caption{Demonstration of Different Exclusive Locks}
-\label{fig:toolsoftrade:Demonstration of Different Exclusive Locks}
-\end{figure}
+\label{lst:toolsoftrade:Demonstration of Different Exclusive Locks}
+\end{listing}
-Figure~\ref{fig:toolsoftrade:Demonstration of Different Exclusive Locks}
+Listing~\ref{lst:toolsoftrade:Demonstration of Different Exclusive Locks}
shows a similar code fragment, but this time using different locks:
\co{lock_a} for \co{lock_reader()} and \co{lock_b} for
\co{lock_writer()}.
@@ -811,7 +811,7 @@ values of \co{x} stored by \co{lock_writer()}.
\QuickQuiz{}
In the code shown in
- Figure~\ref{fig:toolsoftrade:Demonstration of Different Exclusive Locks},
+ Listing~\ref{lst:toolsoftrade:Demonstration of Different Exclusive Locks},
is \co{lock_reader()} guaranteed to see all the values produced
by \co{lock_writer()}?
Why or why not?
@@ -825,19 +825,19 @@ values of \co{x} stored by \co{lock_writer()}.
\QuickQuiz{}
Wait a minute here!!!
- Figure~\ref{fig:toolsoftrade:Demonstration of Same Exclusive Lock}
+ Listing~\ref{lst:toolsoftrade:Demonstration of Same Exclusive Lock}
didn't initialize shared variable \co{x},
so why does it need to be initialized in
- Figure~\ref{fig:toolsoftrade:Demonstration of Different Exclusive Locks}?
+ Listing~\ref{lst:toolsoftrade:Demonstration of Different Exclusive Locks}?
\QuickQuizAnswer{
See line~3 of
- Figure~\ref{fig:toolsoftrade:Demonstration of Exclusive Locks}.
+ Listing~\ref{lst:toolsoftrade:Demonstration of Exclusive Locks}.
Because the code in
- Figure~\ref{fig:toolsoftrade:Demonstration of Same Exclusive Lock}
+ Listing~\ref{lst:toolsoftrade:Demonstration of Same Exclusive Lock}
ran first, it could rely on the compile-time initialization of
\co{x}.
The code in
- Figure~\ref{fig:toolsoftrade:Demonstration of Different Exclusive Locks}
+ Listing~\ref{lst:toolsoftrade:Demonstration of Different Exclusive Locks}
ran next, so it had to re-initialize \co{x}.
} \QuickQuizEnd
@@ -873,7 +873,7 @@ an arbitrarily large number of readers to concurrently hold the lock.
However, in practice, we need to know how much additional scalability is
provided by reader-writer locks.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 pthread_rwlock_t rwl = PTHREAD_RWLOCK_INITIALIZER;
@@ -922,10 +922,10 @@ provided by reader-writer locks.
\centering
\theverbbox
\caption{Measuring Reader-Writer Lock Scalability}
-\label{fig:toolsoftrade:Measuring Reader-Writer Lock Scalability}
-\end{figure}
+\label{lst:toolsoftrade:Measuring Reader-Writer Lock Scalability}
+\end{listing}
-Figure~\ref{fig:toolsoftrade:Measuring Reader-Writer Lock Scalability}
+Listing~\ref{lst:toolsoftrade:Measuring Reader-Writer Lock Scalability}
(\path{rwlockscale.c})
shows one way of measuring reader-writer lock scalability.
Line~1 shows the definition and initialization of the reader-writer
@@ -955,7 +955,7 @@ rights to assume that the value of \co{goflag} would never change.
\QuickQuiz{}
Instead of using \co{READ_ONCE()} everywhere, why not just
declare \co{goflag} as \co{volatile} on line~10 of
- Figure~\ref{fig:toolsoftrade:Measuring Reader-Writer Lock Scalability}?
+ Listing~\ref{lst:toolsoftrade:Measuring Reader-Writer Lock Scalability}?
\QuickQuizAnswer{
A \co{volatile} declaration is in fact a reasonable alternative in
this particular case.
@@ -977,7 +977,7 @@ rights to assume that the value of \co{goflag} would never change.
Don't we also need memory barriers to make sure
that the change in \co{goflag}'s value propagates to the
CPU in a timely fashion in
- Figure~\ref{fig:toolsoftrade:Measuring Reader-Writer Lock Scalability}?
+ Listing~\ref{lst:toolsoftrade:Measuring Reader-Writer Lock Scalability}?
\QuickQuizAnswer{
No, memory barriers are not needed and won't help here.
Memory barriers only enforce ordering among multiple memory
@@ -1167,7 +1167,7 @@ to protect the tiniest of critical sections.
One such way are atomic operations.
We have seen one atomic operations already, in the form of the
\co{__sync_fetch_and_add()} primitive on line~18 of
-Figure~\ref{fig:toolsoftrade:Measuring Reader-Writer Lock Scalability}.
+Listing~\ref{lst:toolsoftrade:Measuring Reader-Writer Lock Scalability}.
This primitive atomically adds the value of its second argument to
the value referenced by its first argument, returning the old value
(which was ignored in this case).
@@ -1243,11 +1243,11 @@ In some cases, it is sufficient to constrain the compiler's ability
to reorder operations, while allowing the CPU free rein, in which
case the \co{barrier()} primitive may be used, as it in fact was
on line~28 of
-Figure~\ref{fig:toolsoftrade:Measuring Reader-Writer Lock Scalability}.
+Listing~\ref{lst:toolsoftrade:Measuring Reader-Writer Lock Scalability}.
In some cases, it is only necessary to ensure that the compiler
avoids optimizing away a given memory read, in which case the
\co{READ_ONCE()} primitive may be used, as it was on line~17 of
-Figure~\ref{fig:toolsoftrade:Demonstration of Exclusive Locks}.
+Listing~\ref{lst:toolsoftrade:Demonstration of Exclusive Locks}.
Similarly, the \co{WRITE_ONCE()} primitive may be used to prevent the
compiler from optimizing away a given memory write.
These last three primitives are not provided directly by \GCC,
@@ -1416,10 +1416,10 @@ Threads share everything except for per-thread local state,\footnote{
which includes program counter and stack.
The thread API is shown in
-Figure~\ref{fig:toolsoftrade:Thread API}, and members are described in the
+Listing~\ref{lst:toolsoftrade:Thread API}, and members are described in the
following sections.
-\begin{figure*}[tbp]
+\begin{listing*}[tbp]
{ \scriptsize
\begin{verbbox}
int smp_thread_id(void)
@@ -1433,8 +1433,8 @@ void wait_all_threads(void)
\centering
\theverbbox
\caption{Thread API}
-\label{fig:toolsoftrade:Thread API}
-\end{figure*}
+\label{lst:toolsoftrade:Thread API}
+\end{listing*}
\subsubsection{\tco{create_thread()}}
@@ -1499,7 +1499,7 @@ a run, so such synchronization is normally not needed.
\subsubsection{Example Usage}
-Figure~\ref{fig:toolsoftrade:Example Child Thread}
+Listing~\ref{lst:toolsoftrade:Example Child Thread}
shows an example hello-world-like child thread.
As noted earlier, each thread is allocated its own stack, so
each thread has its own private \co{arg} argument and \co{myarg} variable.
@@ -1509,7 +1509,7 @@ Note that the \co{return} statement on line~7 terminates the thread,
returning a \co{NULL} to whoever invokes \co{wait_thread()} on this
thread.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 void *thread_test(void *arg)
@@ -1525,11 +1525,11 @@ thread.
\centering
\theverbbox
\caption{Example Child Thread}
-\label{fig:toolsoftrade:Example Child Thread}
-\end{figure}
+\label{lst:toolsoftrade:Example Child Thread}
+\end{listing}
The parent program is shown in
-Figure~\ref{fig:toolsoftrade:Example Parent Thread}.
+Listing~\ref{lst:toolsoftrade:Example Parent Thread}.
It invokes \co{smp_init()} to initialize the threading system on
line~6,
parses arguments on lines~7-14, and announces its presence on line~15.
@@ -1538,7 +1538,7 @@ and waits for them to complete on line~18.
Note that \co{wait_all_threads()} discards the threads return values,
as in this case they are all \co{NULL}, which is not very interesting.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
1 int main(int argc, char *argv[])
@@ -1567,8 +1567,8 @@ as in this case they are all \co{NULL}, which is not very interesting.
\centering
\theverbbox
\caption{Example Parent Thread}
-\label{fig:toolsoftrade:Example Parent Thread}
-\end{figure}
+\label{lst:toolsoftrade:Example Parent Thread}
+\end{listing}
\QuickQuiz{}
What happened to the Linux-kernel equivalents to \co{fork()}
@@ -1584,11 +1584,11 @@ as in this case they are all \co{NULL}, which is not very interesting.
\label{sec:toolsoftrade:Locking}
A good starting subset of the Linux kernel's locking API is shown in
-Figure~\ref{fig:toolsoftrade:Locking API},
+Listing~\ref{lst:toolsoftrade:Locking API},
each API element being described in the following sections.
This book's CodeSamples locking API closely follows that of the Linux kernel.
-\begin{figure}[tbp]
+\begin{listing}[tbp]
{ \scriptsize
\begin{verbbox}
void spin_lock_init(spinlock_t *sp);
@@ -1600,8 +1600,8 @@ void spin_unlock(spinlock_t *sp);
\centering
\theverbbox
\caption{Locking API}
-\label{fig:toolsoftrade:Locking API}
-\end{figure}
+\label{lst:toolsoftrade:Locking API}
+\end{listing}
\subsubsection{\tco{spin_lock_init()}}
@@ -1721,7 +1721,7 @@ given per-CPU variable, \co{per_cpu()} to access a specified CPU's
instance of a given per-CPU variable, along with many other special-purpose
per-CPU operations.
-Figure~\ref{fig:toolsoftrade:Per-Thread-Variable API}
+Listing~\ref{lst:toolsoftrade:Per-Thread-Variable API}
shows this book's per-thread-variable API, which is patterned
after the Linux kernel's per-CPU-variable API.
This API provides the per-thread equivalent of global variables.
@@ -1729,7 +1729,7 @@ Although this API is, strictly speaking, not necessary\footnote{
You could instead use \co{__thread} or \co{_Thread_local}.},
it can provide a good userspace analogy to Linux kernel code.
-\begin{figure}[htbp]
+\begin{listing}[htbp]
{ \scriptsize
\begin{verbbox}
DEFINE_PER_THREAD(type, name)
@@ -1742,8 +1742,8 @@ init_per_thread(name, v)
\centering
\theverbbox
\caption{Per-Thread-Variable API}
-\label{fig:toolsoftrade:Per-Thread-Variable API}
-\end{figure}
+\label{lst:toolsoftrade:Per-Thread-Variable API}
+\end{listing}
\QuickQuiz{}
How could you work around the lack of a per-thread-variable
--
2.7.4
^ permalink raw reply related [flat|nested] 2+ messages in thread* Re: [PATCH] Convert code snippets to 'listing' env (howto, toolsoftrade, count)
2017-10-10 22:40 [PATCH] Convert code snippets to 'listing' env (howto, toolsoftrade, count) Akira Yokosawa
@ 2017-10-10 23:38 ` Paul E. McKenney
0 siblings, 0 replies; 2+ messages in thread
From: Paul E. McKenney @ 2017-10-10 23:38 UTC (permalink / raw)
To: Akira Yokosawa; +Cc: perfbook
On Wed, Oct 11, 2017 at 07:40:14AM +0900, Akira Yokosawa wrote:
> >From d3f43d97fdbcbe68b6e5936b8d94c50b78bfeaaf Mon Sep 17 00:00:00 2001
> From: Akira Yokosawa <akiyks@gmail.com>
> Date: Mon, 10 Oct 2017 22:18:17 +0900
> Subject: [PATCH] Convert code snippets to 'listing' env (howto, toolsoftrade, count)
>
> Also fix a typo referring table by "Figure" in count.tex.
>
> Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
> ---
> Hi Paul,
>
> Looks like we have some time to do the conversion to "listing" environments
> before the upcoming release.
>
> This is the 1st round of the conversion.
Looks like a good time, thank you! Queued and pushed.
Thanx, Paul
> Thanks, Akira
> --
> count/count.tex | 276 +++++++++++++++++++++---------------------
> defer/rcuexercises.tex | 2 +-
> howto/howto.tex | 20 +--
> owned/owned.tex | 28 ++---
> together/applyrcu.tex | 16 +--
> toolsoftrade/toolsoftrade.tex | 140 ++++++++++-----------
> 6 files changed, 241 insertions(+), 241 deletions(-)
>
> diff --git a/count/count.tex b/count/count.tex
> index a213558..9638971 100644
> --- a/count/count.tex
> +++ b/count/count.tex
> @@ -165,7 +165,7 @@ are more appropriate for advanced students.
> \section{Why Isn't Concurrent Counting Trivial?}
> \label{sec:count:Why Isn't Concurrent Counting Trivial?}
>
> -\begin{figure}[bp]
> +\begin{listing}[bp]
> { \scriptsize
> \begin{verbbox}
> 1 long counter = 0;
> @@ -184,12 +184,12 @@ are more appropriate for advanced students.
> \centering
> \theverbbox
> \caption{Just Count!}
> -\label{fig:count:Just Count!}
> -\end{figure}
> +\label{lst:count:Just Count!}
> +\end{listing}
>
> Let's start with something simple, for example, the straightforward
> use of arithmetic shown in
> -Figure~\ref{fig:count:Just Count!} (\path{count_nonatomic.c}).
> +Listing~\ref{lst:count:Just Count!} (\path{count_nonatomic.c}).
> Here, we have a counter on line~1, we increment it on line~5, and we
> read out its value on line~10.
> What could be simpler?
> @@ -242,7 +242,7 @@ accuracies far greater than 50\,\% are almost always necessary.
> Furthermore, proofs can have bugs just as easily as programs can!
> } \QuickQuizEnd
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 atomic_t counter = ATOMIC_INIT(0);
> @@ -261,8 +261,8 @@ accuracies far greater than 50\,\% are almost always necessary.
> \centering
> \theverbbox
> \caption{Just Count Atomically!}
> -\label{fig:count:Just Count Atomically!}
> -\end{figure}
> +\label{lst:count:Just Count Atomically!}
> +\end{listing}
>
> \begin{figure}[tb]
> \centering
> @@ -273,7 +273,7 @@ accuracies far greater than 50\,\% are almost always necessary.
>
> The straightforward way to count accurately is to use atomic operations,
> as shown in
> -Figure~\ref{fig:count:Just Count Atomically!} (\path{count_atomic.c}).
> +Listing~\ref{lst:count:Just Count Atomically!} (\path{count_atomic.c}).
> Line~1 defines an atomic variable, line~5 atomically increments it, and
> line~10 reads it out.
> Because this is atomic, it keeps perfect count.
> @@ -491,7 +491,7 @@ thread (presumably cache aligned and padded to avoid false sharing).
> Section~\ref{sec:count:Per-Thread-Variable-Based Implementation}.
> } \QuickQuizEnd
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 DEFINE_PER_THREAD(long, counter);
> @@ -515,11 +515,11 @@ thread (presumably cache aligned and padded to avoid false sharing).
> \centering
> \theverbbox
> \caption{Array-Based Per-Thread Statistical Counters}
> -\label{fig:count:Array-Based Per-Thread Statistical Counters}
> -\end{figure}
> +\label{lst:count:Array-Based Per-Thread Statistical Counters}
> +\end{listing}
>
> Such an array can be wrapped into per-thread primitives, as shown in
> -Figure~\ref{fig:count:Array-Based Per-Thread Statistical Counters}
> +Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters}
> (\path{count_stat.c}).
> Line~1 defines an array containing a set of per-thread counters of
> type \co{long} named, creatively enough, \co{counter}.
> @@ -559,7 +559,7 @@ normal loads suffice, and no special atomic instructions are required.
>
> \QuickQuiz{}
> How does the per-thread \co{counter} variable in
> - Figure~\ref{fig:count:Array-Based Per-Thread Statistical Counters}
> + Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters}
> get initialized?
> \QuickQuizAnswer{
> The C standard specifies that the initial value of
> @@ -573,7 +573,7 @@ normal loads suffice, and no special atomic instructions are required.
>
> \QuickQuiz{}
> How is the code in
> - Figure~\ref{fig:count:Array-Based Per-Thread Statistical Counters}
> + Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters}
> supposed to permit more than one counter?
> \QuickQuizAnswer{
> Indeed, this toy example does not support more than one counter.
> @@ -602,7 +602,7 @@ at the beginning of this chapter.
> and during that time, the counter could well be changing.
> This means that the value returned by
> \co{read_count()} in
> - Figure~\ref{fig:count:Array-Based Per-Thread Statistical Counters}
> + Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters}
> will not necessarily be exact.
> Assume that the counter is being incremented at rate
> $r$ counts per unit time, and that \co{read_count()}'s
> @@ -743,7 +743,7 @@ date, however, once updates cease, the global counter will eventually
> converge on the true value---hence this approach qualifies as
> eventually consistent.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 DEFINE_PER_THREAD(unsigned long, counter);
> @@ -802,11 +802,11 @@ eventually consistent.
> \centering
> \theverbbox
> \caption{Array-Based Per-Thread Eventually Consistent Counters}
> -\label{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
> -\end{figure}
> +\label{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
> +\end{listing}
>
> The implementation is shown in
> -Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
> +Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
> (\path{count_stat_eventual.c}).
> Lines~1-2 show the per-thread variable and the global variable that
> track the counter's value, and line three shows \co{stopflag}
> @@ -814,7 +814,7 @@ which is used to coordinate termination (for the case where we want
> to terminate the program with an accurate counter value).
> The \co{inc_count()} function shown on lines~5-9 is similar to its
> counterpart in
> -Figure~\ref{fig:count:Array-Based Per-Thread Statistical Counters}.
> +Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters}.
> The \co{read_count()} function shown on lines~11-14 simply returns the
> value of the \co{global_count} variable.
>
> @@ -834,7 +834,7 @@ comes at the cost of the additional thread running \co{eventual()}.
>
> \QuickQuiz{}
> Why doesn't \co{inc_count()} in
> - Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
> + Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
> need to use atomic instructions?
> After all, we now have multiple threads accessing the per-thread
> counters!
> @@ -872,7 +872,7 @@ comes at the cost of the additional thread running \co{eventual()}.
>
> \QuickQuiz{}
> Won't the single global thread in the function \co{eventual()} of
> - Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
> + Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
> be just as severe a bottleneck as a global lock would be?
> \QuickQuizAnswer{
> In this case, no.
> @@ -883,7 +883,7 @@ comes at the cost of the additional thread running \co{eventual()}.
>
> \QuickQuiz{}
> Won't the estimate returned by \co{read_count()} in
> - Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
> + Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
> become increasingly
> inaccurate as the number of threads rises?
> \QuickQuizAnswer{
> @@ -897,7 +897,7 @@ comes at the cost of the additional thread running \co{eventual()}.
>
> \QuickQuiz{}
> Given that in the eventually\-/consistent algorithm shown in
> - Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
> + Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
> both reads and updates have extremely low overhead
> and are extremely scalable, why would anyone bother with the
> implementation described in
> @@ -934,7 +934,7 @@ comes at the cost of the additional thread running \co{eventual()}.
> \subsection{Per-Thread-Variable-Based Implementation}
> \label{sec:count:Per-Thread-Variable-Based Implementation}
>
> -\begin{figure}[tb]
> +\begin{listing}[tb]
> { \scriptsize
> \begin{verbbox}
> 1 long __thread counter = 0;
> @@ -984,13 +984,13 @@ comes at the cost of the additional thread running \co{eventual()}.
> \centering
> \theverbbox
> \caption{Per-Thread Statistical Counters}
> -\label{fig:count:Per-Thread Statistical Counters}
> -\end{figure}
> +\label{lst:count:Per-Thread Statistical Counters}
> +\end{listing}
>
> Fortunately, \GCC\ provides an \co{__thread} storage class that provides
> per-thread storage.
> This can be used as shown in
> -Figure~\ref{fig:count:Per-Thread Statistical Counters} (\path{count_end.c})
> +Listing~\ref{lst:count:Per-Thread Statistical Counters} (\path{count_end.c})
> to implement
> a statistical counter that not only scales, but that also incurs little
> or no performance penalty to incrementers compared to simple non-atomic
> @@ -1058,7 +1058,7 @@ Finally, line~22 returns the sum.
>
> \QuickQuiz{}
> Doesn't the check for \co{NULL} on line~19 of
> - Figure~\ref{fig:count:Per-Thread Statistical Counters}
> + Listing~\ref{lst:count:Per-Thread Statistical Counters}
> add extra branch mispredictions?
> Why not have a variable set permanently to zero, and point
> unused counter-pointers to that variable rather than setting
> @@ -1074,7 +1074,7 @@ Finally, line~22 returns the sum.
> \QuickQuiz{}
> Why on earth do we need something as heavyweight as a \emph{lock}
> guarding the summation in the function \co{read_count()} in
> - Figure~\ref{fig:count:Per-Thread Statistical Counters}?
> + Listing~\ref{lst:count:Per-Thread Statistical Counters}?
> \QuickQuizAnswer{
> Remember, when a thread exits, its per-thread variables disappear.
> Therefore, if we attempt to access a given thread's per-thread
> @@ -1105,7 +1105,7 @@ array to point to its per-thread \co{counter} variable.
> \QuickQuiz{}
> Why on earth do we need to acquire the lock in
> \co{count_register_thread()} in
> - Figure~\ref{fig:count:Per-Thread Statistical Counters}?
> + Listing~\ref{lst:count:Per-Thread Statistical Counters}?
> It is a single properly aligned machine-word store to a location
> that no other thread is modifying, so it should be atomic anyway,
> right?
> @@ -1150,7 +1150,7 @@ variables vanish when that thread exits.
> accessible, even if the corresponding CPU is offline---even
> if the corresponding CPU never existed and never will exist.
>
> -\begin{figure}[tb]
> +\begin{listing}[tb]
> { \scriptsize
> \begin{verbbox}
> 1 long __thread counter = 0;
> @@ -1196,12 +1196,12 @@ variables vanish when that thread exits.
> \centering
> \theverbbox
> \caption{Per-Thread Statistical Counters With Lockless Summation}
> -\label{fig:count:Per-Thread Statistical Counters With Lockless Summation}
> -\end{figure}
> +\label{lst:count:Per-Thread Statistical Counters With Lockless Summation}
> +\end{listing}
>
> One workaround is to ensure that each thread continues to exist
> until all threads are finished, as shown in
> - Figure~\ref{fig:count:Per-Thread Statistical Counters With Lockless Summation}
> + Listing~\ref{lst:count:Per-Thread Statistical Counters With Lockless Summation}
> (\path{count_tstat.c}).
> Analysis of this code is left as an exercise to the reader,
> however, please note that it does not fit well into the
> @@ -1388,7 +1388,7 @@ Section~\ref{sec:SMPdesign:Parallel Fastpath}.
> \subsection{Simple Limit Counter Implementation}
> \label{sec:count:Simple Limit Counter Implementation}
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 unsigned long __thread counter = 0;
> @@ -1403,8 +1403,8 @@ Section~\ref{sec:SMPdesign:Parallel Fastpath}.
> \centering
> \theverbbox
> \caption{Simple Limit Counter Variables}
> -\label{fig:count:Simple Limit Counter Variables}
> -\end{figure}
> +\label{lst:count:Simple Limit Counter Variables}
> +\end{listing}
>
> \begin{figure}[tb]
> \centering
> @@ -1413,7 +1413,7 @@ Section~\ref{sec:SMPdesign:Parallel Fastpath}.
> \label{fig:count:Simple Limit Counter Variable Relationships}
> \end{figure}
>
> -Figure~\ref{fig:count:Simple Limit Counter Variables}
> +Listing~\ref{lst:count:Simple Limit Counter Variables}
> shows both the per-thread and global variables used by this
> implementation.
> The per-thread \co{counter} and \co{countermax} variables are the
> @@ -1443,7 +1443,7 @@ spinlock guards all of the global variables, in other words, no thread
> is permitted to access or modify any of the global variables unless it
> has acquired \co{gblcnt_mutex}.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 int add_count(unsigned long delta)
> @@ -1501,17 +1501,17 @@ has acquired \co{gblcnt_mutex}.
> \centering
> \theverbbox
> \caption{Simple Limit Counter Add, Subtract, and Read}
> -\label{fig:count:Simple Limit Counter Add, Subtract, and Read}
> -\end{figure}
> +\label{lst:count:Simple Limit Counter Add, Subtract, and Read}
> +\end{listing}
>
> -Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}
> +Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}
> shows the \co{add_count()}, \co{sub_count()}, and \co{read_count()}
> functions (\path{count_lim.c}).
>
>
> \QuickQuiz{}
> Why does
> - Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}
> + Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}
> provide \co{add_count()} and \co{sub_count()} instead of the
> \co{inc_count()} and \co{dec_count()} interfaces show in
> Section~\ref{sec:count:Statistical Counters}?
> @@ -1531,7 +1531,7 @@ references only per-thread variables, and should not incur any cache misses.
>
> \QuickQuiz{}
> What is with the strange form of the condition on line~3 of
> - Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}?
> + Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}?
> Why not the following more intuitive form of the fastpath?
>
> \vspace{5pt}
> @@ -1552,7 +1552,7 @@ references only per-thread variables, and should not incur any cache misses.
> Try the above formulation with \co{counter} equal to 10 and
> \co{delta} equal to \co{ULONG_MAX}.
> Then try it again with the code shown in
> - Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}.
> + Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}.
>
> A good understanding of integer overflow will be required for
> the rest of this example, so if you have never dealt with
> @@ -1566,7 +1566,7 @@ If the test on line~3 fails, we must access global variables, and thus
> must acquire \co{gblcnt_mutex} on line~7, which we release on line~11
> in the failure case or on line~16 in the success case.
> Line~8 invokes \co{globalize_count()}, shown in
> -Figure~\ref{fig:count:Simple Limit Counter Utility Functions},
> +Listing~\ref{lst:count:Simple Limit Counter Utility Functions},
> which clears the thread-local variables, adjusting the global variables
> as needed, thus simplifying global processing.
> (But don't take \emph{my} word for it, try coding it yourself!)
> @@ -1581,7 +1581,7 @@ returns indicating failure.
> Otherwise, we take the slowpath.
> Line~14 adds \co{delta} to \co{globalcount}, and then
> line~15 invokes \co{balance_count()} (shown in
> -Figure~\ref{fig:count:Simple Limit Counter Utility Functions})
> +Listing~\ref{lst:count:Simple Limit Counter Utility Functions})
> in order to update both the global and the per-thread variables.
> This call to \co{balance_count()}
> will usually set this thread's \co{countermax} to re-enable the fastpath.
> @@ -1592,7 +1592,7 @@ line~17 returns indicating success.
> \QuickQuiz{}
> Why does \co{globalize_count()} zero the per-thread variables,
> only to later call \co{balance_count()} to refill them in
> - Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}?
> + Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}?
> Why not just leave the per-thread variables non-zero?
> \QuickQuizAnswer{
> That is in fact what an earlier version of this code did.
> @@ -1616,7 +1616,7 @@ Because the slowpath must access global state, line~26
> acquires \co{gblcnt_mutex}, which is released either by line~29
> (in case of failure) or by line~34 (in case of success).
> Line~27 invokes \co{globalize_count()}, shown in
> -Figure~\ref{fig:count:Simple Limit Counter Utility Functions},
> +Listing~\ref{lst:count:Simple Limit Counter Utility Functions},
> which again clears the thread-local variables, adjusting the global variables
> as needed.
> Line~28 checks to see if the counter can accommodate subtracting
> @@ -1626,7 +1626,7 @@ Line~28 checks to see if the counter can accommodate subtracting
> \QuickQuiz{}
> Given that \co{globalreserve} counted against us in \co{add_count()},
> why doesn't it count for us in \co{sub_count()} in
> - Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}?
> + Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}?
> \QuickQuizAnswer{
> The \co{globalreserve} variable tracks the sum of all threads'
> \co{countermax} variables.
> @@ -1641,7 +1641,7 @@ Line~28 checks to see if the counter can accommodate subtracting
>
> \QuickQuiz{}
> Suppose that one thread invokes \co{add_count()} shown in
> - Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read},
> + Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read},
> and then another thread invokes \co{sub_count()}.
> Won't \co{sub_count()} return failure even though the value of
> the counter is non-zero?
> @@ -1658,14 +1658,14 @@ If, on the other hand, line~28 finds that the counter \emph{can}
> accommodate subtracting \co{delta}, we complete the slowpath.
> Line~32 does the subtraction and then
> line~33 invokes \co{balance_count()} (shown in
> -Figure~\ref{fig:count:Simple Limit Counter Utility Functions})
> +Listing~\ref{lst:count:Simple Limit Counter Utility Functions})
> in order to update both global and per-thread variables
> (hopefully re-enabling the fastpath).
> Then line~34 releases \co{gblcnt_mutex}, and line~35 returns success.
>
> \QuickQuiz{}
> Why have both \co{add_count()} and \co{sub_count()} in
> - Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}?
> + Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}?
> Why not simply pass a negative number to \co{add_count()}?
> \QuickQuizAnswer{
> Given that \co{add_count()} takes an \co{unsigned} \co{long}
> @@ -1686,7 +1686,7 @@ Line~44 initializes local variable \co{sum} to the value of
> per-thread \co{counter} variables.
> Line~49 then returns the sum.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 static void globalize_count(void)
> @@ -1732,13 +1732,13 @@ Line~49 then returns the sum.
> \centering
> \theverbbox
> \caption{Simple Limit Counter Utility Functions}
> -\label{fig:count:Simple Limit Counter Utility Functions}
> -\end{figure}
> +\label{lst:count:Simple Limit Counter Utility Functions}
> +\end{listing}
>
> -Figure~\ref{fig:count:Simple Limit Counter Utility Functions}
> +Listing~\ref{lst:count:Simple Limit Counter Utility Functions}
> shows a number of utility functions used by the \co{add_count()},
> \co{sub_count()}, and \co{read_count()} primitives shown in
> -Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}.
> +Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}.
>
> Lines~1-7 show \co{globalize_count()}, which zeros the current thread's
> per-thread counters, adjusting the global variables appropriately.
> @@ -1782,7 +1782,7 @@ Finally, in either case, line~18 makes the corresponding adjustment to
>
> \QuickQuiz{}
> Why set \co{counter} to \co{countermax / 2} in line~15 of
> - Figure~\ref{fig:count:Simple Limit Counter Utility Functions}?
> + Listing~\ref{lst:count:Simple Limit Counter Utility Functions}?
> Wouldn't it be simpler to just take \co{countermax} counts?
> \QuickQuizAnswer{
> First, it really is reserving \co{countermax} counts
> @@ -1902,7 +1902,7 @@ This task is undertaken in the next section.
> \subsection{Approximate Limit Counter Implementation}
> \label{sec:count:Approximate Limit Counter Implementation}
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 unsigned long __thread counter = 0;
> @@ -1918,10 +1918,10 @@ This task is undertaken in the next section.
> \centering
> \theverbbox
> \caption{Approximate Limit Counter Variables}
> -\label{fig:count:Approximate Limit Counter Variables}
> -\end{figure}
> +\label{lst:count:Approximate Limit Counter Variables}
> +\end{listing}
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 static void balance_count(void)
> @@ -1942,25 +1942,25 @@ This task is undertaken in the next section.
> \centering
> \theverbbox
> \caption{Approximate Limit Counter Balancing}
> -\label{fig:count:Approximate Limit Counter Balancing}
> -\end{figure}
> +\label{lst:count:Approximate Limit Counter Balancing}
> +\end{listing}
>
> Because this implementation (\path{count_lim_app.c}) is quite similar to
> that in the previous section
> -(Figures~\ref{fig:count:Simple Limit Counter Variables},
> -\ref{fig:count:Simple Limit Counter Add, Subtract, and Read}, and
> -\ref{fig:count:Simple Limit Counter Utility Functions}),
> +(Listings~\ref{lst:count:Simple Limit Counter Variables},
> +\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}, and
> +\ref{lst:count:Simple Limit Counter Utility Functions}),
> only the changes are shown here.
> -Figure~\ref{fig:count:Approximate Limit Counter Variables}
> +Listing~\ref{lst:count:Approximate Limit Counter Variables}
> is identical to
> -Figure~\ref{fig:count:Simple Limit Counter Variables},
> +Listing~\ref{lst:count:Simple Limit Counter Variables},
> with the addition of \co{MAX_COUNTERMAX}, which sets the maximum
> permissible value of the per-thread \co{countermax} variable.
>
> Similarly,
> -Figure~\ref{fig:count:Approximate Limit Counter Balancing}
> +Listing~\ref{lst:count:Approximate Limit Counter Balancing}
> is identical to the \co{balance_count()} function in
> -Figure~\ref{fig:count:Simple Limit Counter Utility Functions},
> +Listing~\ref{lst:count:Simple Limit Counter Utility Functions},
> with the addition of lines~6 and 7, which enforce the
> \co{MAX_COUNTERMAX} limit on the per-thread \co{countermax} variable.
>
> @@ -2038,7 +2038,7 @@ represent \co{counter} and the low-order 16 bits to represent
> the reader.
> } \QuickQuizEnd
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 atomic_t __thread ctrandmax = ATOMIC_INIT(0);
> @@ -2079,12 +2079,12 @@ represent \co{counter} and the low-order 16 bits to represent
> \centering
> \theverbbox
> \caption{Atomic Limit Counter Variables and Access Functions}
> -\label{fig:count:Atomic Limit Counter Variables and Access Functions}
> -\end{figure}
> +\label{lst:count:Atomic Limit Counter Variables and Access Functions}
> +\end{listing}
>
> The variables and access functions for a simple atomic limit counter
> are shown in
> -Figure~\ref{fig:count:Atomic Limit Counter Variables and Access Functions}
> +Listing~\ref{lst:count:Atomic Limit Counter Variables and Access Functions}
> (\path{count_lim_atomic.c}).
> The \co{counter} and \co{countermax} variables in earlier algorithms
> are combined into the single variable \co{ctrandmax} shown on
> @@ -2096,7 +2096,7 @@ representation of \co{int}.
> Lines~2-6 show the definitions for \co{globalcountmax}, \co{globalcount},
> \co{globalreserve}, \co{counterp}, and \co{gblcnt_mutex}, all of which
> take on roles similar to their counterparts in
> -Figure~\ref{fig:count:Approximate Limit Counter Variables}.
> +Listing~\ref{lst:count:Approximate Limit Counter Variables}.
> Line~7 defines \co{CM_BITS}, which gives the number of bits in each half
> of \co{ctrandmax}, and line~8 defines \co{MAX_COUNTERMAX}, which
> gives the maximum value that may be held in either half of
> @@ -2104,7 +2104,7 @@ gives the maximum value that may be held in either half of
>
> \QuickQuiz{}
> In what way does line~7 of
> - Figure~\ref{fig:count:Atomic Limit Counter Variables and Access Functions}
> + Listing~\ref{lst:count:Atomic Limit Counter Variables and Access Functions}
> violate the C standard?
> \QuickQuizAnswer{
> It assumes eight bits per byte.
> @@ -2134,7 +2134,7 @@ it on line~24.
> \QuickQuiz{}
> Given that there is only one \co{ctrandmax} variable,
> why bother passing in a pointer to it on line~18 of
> - Figure~\ref{fig:count:Atomic Limit Counter Variables and Access Functions}?
> + Listing~\ref{lst:count:Atomic Limit Counter Variables and Access Functions}?
> \QuickQuizAnswer{
> There is only one \co{ctrandmax} variable \emph{per thread}.
> Later, we will see code that needs to pass other threads'
> @@ -2149,7 +2149,7 @@ the result.
>
> \QuickQuiz{}
> Why does \co{merge_ctrandmax()} in
> - Figure~\ref{fig:count:Atomic Limit Counter Variables and Access Functions}
> + Listing~\ref{lst:count:Atomic Limit Counter Variables and Access Functions}
> return an \co{int} rather than storing directly into an
> \co{atomic_t}?
> \QuickQuizAnswer{
> @@ -2157,7 +2157,7 @@ the result.
> to the \co{atomic_cmpxchg()} primitive.
> } \QuickQuizEnd
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 int add_count(unsigned long delta)
> @@ -2228,10 +2228,10 @@ the result.
> \centering
> \theverbbox
> \caption{Atomic Limit Counter Add and Subtract}
> -\label{fig:count:Atomic Limit Counter Add and Subtract}
> -\end{figure}
> +\label{lst:count:Atomic Limit Counter Add and Subtract}
> +\end{listing}
>
> -Figure~\ref{fig:count:Atomic Limit Counter Add and Subtract}
> +Listing~\ref{lst:count:Atomic Limit Counter Add and Subtract}
> shows the \co{add_count()} and \co{sub_count()} functions.
>
> Lines~1-32 show \co{add_count()}, whose fastpath spans lines~8-15,
> @@ -2256,7 +2256,7 @@ execution continues in the loop at line~9.
> \QuickQuiz{}
> Yecch!
> Why the ugly \co{goto} on line~11 of
> - Figure~\ref{fig:count:Atomic Limit Counter Add and Subtract}?
> + Listing~\ref{lst:count:Atomic Limit Counter Add and Subtract}?
> Haven't you heard of the \co{break} statement???
> \QuickQuizAnswer{
> Replacing the \co{goto} with a \co{break} would require keeping
> @@ -2271,20 +2271,20 @@ execution continues in the loop at line~9.
>
> \QuickQuiz{}
> Why would the \co{atomic_cmpxchg()} primitive at lines~13-14 of
> - Figure~\ref{fig:count:Atomic Limit Counter Add and Subtract}
> + Listing~\ref{lst:count:Atomic Limit Counter Add and Subtract}
> ever fail?
> After all, we picked up its old value on line~9 and have not
> changed it!
> \QuickQuizAnswer{
> Later, we will see how the \co{flush_local_count()} function in
> - Figure~\ref{fig:count:Atomic Limit Counter Utility Functions 1}
> + Listing~\ref{lst:count:Atomic Limit Counter Utility Functions 1}
> might update this thread's \co{ctrandmax} variable concurrently
> with the execution of the fastpath on lines~8-14 of
> - Figure~\ref{fig:count:Atomic Limit Counter Add and Subtract}.
> + Listing~\ref{lst:count:Atomic Limit Counter Add and Subtract}.
> } \QuickQuizEnd
>
> Lines~16-31 of
> -Figure~\ref{fig:count:Atomic Limit Counter Add and Subtract}
> +Listing~\ref{lst:count:Atomic Limit Counter Add and Subtract}
> show \co{add_count()}'s slowpath, which is protected by \co{gblcnt_mutex},
> which is acquired on line~17 and released on lines~24 and 30.
> Line~18 invokes \co{globalize_count()}, which moves this thread's
> @@ -2304,14 +2304,14 @@ spreads counts to the local state if appropriate, line~30 releases
> returns success.
>
> Lines~34-63 of
> -Figure~\ref{fig:count:Atomic Limit Counter Add and Subtract}
> +Listing~\ref{lst:count:Atomic Limit Counter Add and Subtract}
> show \co{sub_count()}, which is structured similarly to
> \co{add_count()}, having a fastpath on lines~41-48 and a slowpath on
> lines~49-62.
> A line-by-line analysis of this function is left as an exercise to
> the reader.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 unsigned long read_count(void)
> @@ -2337,10 +2337,10 @@ the reader.
> \centering
> \theverbbox
> \caption{Atomic Limit Counter Read}
> -\label{fig:count:Atomic Limit Counter Read}
> -\end{figure}
> +\label{lst:count:Atomic Limit Counter Read}
> +\end{listing}
>
> -Figure~\ref{fig:count:Atomic Limit Counter Read} shows \co{read_count()}.
> +Listing~\ref{lst:count:Atomic Limit Counter Read} shows \co{read_count()}.
> Line~9 acquires \co{gblcnt_mutex} and line~16 releases it.
> Line~10 initializes local variable \co{sum} to the value of
> \co{globalcount}, and the loop spanning lines~11-15 adds the
> @@ -2348,7 +2348,7 @@ per-thread counters to this sum, isolating each per-thread counter
> using \co{split_ctrandmax} on line~13.
> Finally, line~17 returns the sum.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 static void globalize_count(void)
> @@ -2388,10 +2388,10 @@ Finally, line~17 returns the sum.
> \centering
> \theverbbox
> \caption{Atomic Limit Counter Utility Functions 1}
> -\label{fig:count:Atomic Limit Counter Utility Functions 1}
> -\end{figure}
> +\label{lst:count:Atomic Limit Counter Utility Functions 1}
> +\end{listing}
>
> -\begin{figure}[tb]
> +\begin{listing}[tb]
> { \scriptsize
> \begin{verbbox}
> 1 static void balance_count(void)
> @@ -2440,11 +2440,11 @@ Finally, line~17 returns the sum.
> \centering
> \theverbbox
> \caption{Atomic Limit Counter Utility Functions 2}
> -\label{fig:count:Atomic Limit Counter Utility Functions 2}
> -\end{figure}
> +\label{lst:count:Atomic Limit Counter Utility Functions 2}
> +\end{listing}
>
> -Figures~\ref{fig:count:Atomic Limit Counter Utility Functions 1}
> -and~\ref{fig:count:Atomic Limit Counter Utility Functions 2}
> +Listings~\ref{lst:count:Atomic Limit Counter Utility Functions 1}
> +and~\ref{lst:count:Atomic Limit Counter Utility Functions 2}
> shows the utility functions
> \co{globalize_count()},
> \co{flush_local_count()},
> @@ -2452,7 +2452,7 @@ shows the utility functions
> \co{count_register_thread()}, and
> \co{count_unregister_thread()}.
> The code for \co{globalize_count()} is shown on lines~1-12,
> -of Figure~\ref{fig:count:Atomic Limit Counter Utility Functions 1} and
> +of Listing~\ref{lst:count:Atomic Limit Counter Utility Functions 1} and
> is similar to that of previous algorithms, with the addition of
> line~7, which is now required to split out \co{counter} and
> \co{countermax} from \co{ctrandmax}.
> @@ -2477,7 +2477,7 @@ line~30 subtracts this thread's \co{countermax} from \co{globalreserve}.
> What stops a thread from simply refilling its
> \co{ctrandmax} variable immediately after
> \co{flush_local_count()} on line~14 of
> - Figure~\ref{fig:count:Atomic Limit Counter Utility Functions 1}
> + Listing~\ref{lst:count:Atomic Limit Counter Utility Functions 1}
> empties it?
> \QuickQuizAnswer{
> This other thread cannot refill its \co{ctrandmax}
> @@ -2494,7 +2494,7 @@ line~30 subtracts this thread's \co{countermax} from \co{globalreserve}.
> \co{add_count()} or \co{sub_count()} from interfering with
> the \co{ctrandmax} variable while
> \co{flush_local_count()} is accessing it on line~27 of
> - Figure~\ref{fig:count:Atomic Limit Counter Utility Functions 1}
> + Listing~\ref{lst:count:Atomic Limit Counter Utility Functions 1}
> empties it?
> \QuickQuizAnswer{
> Nothing.
> @@ -2520,7 +2520,7 @@ line~30 subtracts this thread's \co{countermax} from \co{globalreserve}.
> } \QuickQuizEnd
>
> Lines~1-22 on
> -Figure~\ref{fig:count:Atomic Limit Counter Utility Functions 2}
> +Listing~\ref{lst:count:Atomic Limit Counter Utility Functions 2}
> show the code for \co{balance_count()}, which refills
> the calling thread's local \co{ctrandmax} variable.
> This function is quite similar to that of the preceding algorithms,
> @@ -2534,7 +2534,7 @@ line~33.
> Given that the \co{atomic_set()} primitive does a simple
> store to the specified \co{atomic_t}, how can line~21 of
> \co{balance_count()} in
> - Figure~\ref{fig:count:Atomic Limit Counter Utility Functions 2}
> + Listing~\ref{lst:count:Atomic Limit Counter Utility Functions 2}
> work correctly in face of concurrent \co{flush_local_count()}
> updates to this variable?
> \QuickQuizAnswer{
> @@ -2668,7 +2668,7 @@ The slowpath then sets that thread's \co{theft} state to IDLE.
> \subsection{Signal-Theft Limit Counter Implementation}
> \label{sec:count:Signal-Theft Limit Counter Implementation}
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 #define THEFT_IDLE 0
> @@ -2693,10 +2693,10 @@ The slowpath then sets that thread's \co{theft} state to IDLE.
> \centering
> \theverbbox
> \caption{Signal-Theft Limit Counter Data}
> -\label{fig:count:Signal-Theft Limit Counter Data}
> -\end{figure}
> +\label{lst:count:Signal-Theft Limit Counter Data}
> +\end{listing}
>
> -Figure~\ref{fig:count:Signal-Theft Limit Counter Data}
> +Listing~\ref{lst:count:Signal-Theft Limit Counter Data}
> (\path{count_lim_sig.c})
> shows the data structures used by the signal-theft based counter
> implementation.
> @@ -2706,7 +2706,7 @@ Lines~8-17 are similar to earlier implementations, with the addition of
> lines~14 and 15 to allow remote access to a thread's \co{countermax}
> and \co{theft} variables, respectively.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 static void globalize_count(void)
> @@ -2777,10 +2777,10 @@ and \co{theft} variables, respectively.
> \centering
> \theverbbox
> \caption{Signal-Theft Limit Counter Value-Migration Functions}
> -\label{fig:count:Signal-Theft Limit Counter Value-Migration Functions}
> -\end{figure}
> +\label{lst:count:Signal-Theft Limit Counter Value-Migration Functions}
> +\end{listing}
>
> -Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions}
> +Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions}
> shows the functions responsible for migrating counts between per-thread
> variables and the global variables.
> Lines~1-7 shows \co{globalize_count()}, which is identical to earlier
> @@ -2796,7 +2796,7 @@ this thread's fastpaths are not running, line~16 sets the \co{theft}
> state to READY.
>
> \QuickQuiz{}
> - In Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions}
> + In Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions}
> function \co{flush_local_count_sig()}, why are there
> \co{ACCESS_ONCE()} wrappers around the uses of the
> \co{theft} per-thread variable?
> @@ -2835,7 +2835,7 @@ Otherwise, line~32 sets the thread's \co{theft} state to REQ and
> line~33 sends the thread a signal.
>
> \QuickQuiz{}
> - In Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions},
> + In Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions},
> why is it safe for line~28 to directly access the other thread's
> \co{countermax} variable?
> \QuickQuizAnswer{
> @@ -2849,7 +2849,7 @@ line~33 sends the thread a signal.
> } \QuickQuizEnd
>
> \QuickQuiz{}
> - In Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions},
> + In Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions},
> why doesn't line~33 check for the current thread sending itself
> a signal?
> \QuickQuizAnswer{
> @@ -2861,7 +2861,7 @@ line~33 sends the thread a signal.
>
> \QuickQuiz{}
> The code in
> - Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions},
> + Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions},
> works with \GCC\ and POSIX.
> What would be required to make it also conform to the ISO C standard?
> \QuickQuizAnswer{
> @@ -2882,7 +2882,7 @@ READY, so lines~43-46 do the thieving.
> Line~47 then sets the thread's \co{theft} state back to IDLE.
>
> \QuickQuiz{}
> - In Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions}, why does line~41 resend the signal?
> + In Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions}, why does line~41 resend the signal?
> \QuickQuizAnswer{
> Because many operating systems over several decades have
> had the property of losing the occasional signal.
> @@ -2897,7 +2897,7 @@ Line~47 then sets the thread's \co{theft} state back to IDLE.
> Lines~51-63 show \co{balance_count()}, which is similar to that of
> earlier examples.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 int add_count(unsigned long delta)
> @@ -2941,10 +2941,10 @@ earlier examples.
> \centering
> \theverbbox
> \caption{Signal-Theft Limit Counter Add Function}
> -\label{fig:count:Signal-Theft Limit Counter Add Function}
> -\end{figure}
> +\label{lst:count:Signal-Theft Limit Counter Add Function}
> +\end{listing}
>
> -\begin{figure}[tb]
> +\begin{listing}[tb]
> { \scriptsize
> \begin{verbbox}
> 38 int sub_count(unsigned long delta)
> @@ -2986,10 +2986,10 @@ earlier examples.
> \centering
> \theverbbox
> \caption{Signal-Theft Limit Counter Subtract Function}
> -\label{fig:count:Signal-Theft Limit Counter Subtract Function}
> -\end{figure}
> +\label{lst:count:Signal-Theft Limit Counter Subtract Function}
> +\end{listing}
>
> -Figure~\ref{fig:count:Signal-Theft Limit Counter Add Function}
> +Listing~\ref{lst:count:Signal-Theft Limit Counter Add Function}
> shows the \co{add_count()} function.
> The fastpath spans lines~5-20, and the slowpath lines~21-35.
> Line~5 sets the per-thread \co{counting} variable to 1 so that
> @@ -3014,7 +3014,7 @@ READY also sees the effects of line~9.
> If the fastpath addition at line~9 was executed, then line~20 returns
> success.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 unsigned long read_count(void)
> @@ -3035,21 +3035,21 @@ success.
> \centering
> \theverbbox
> \caption{Signal-Theft Limit Counter Read Function}
> -\label{fig:count:Signal-Theft Limit Counter Read Function}
> -\end{figure}
> +\label{lst:count:Signal-Theft Limit Counter Read Function}
> +\end{listing}
>
> Otherwise, we fall through to the slowpath starting at line~21.
> The structure of the slowpath is similar to those of earlier examples,
> so its analysis is left as an exercise to the reader.
> Similarly, the structure of \co{sub_count()} on
> -Figure~\ref{fig:count:Signal-Theft Limit Counter Subtract Function}
> +Listing~\ref{lst:count:Signal-Theft Limit Counter Subtract Function}
> is the same
> as that of \co{add_count()}, so the analysis of \co{sub_count()} is also
> left as an exercise for the reader, as is the analysis of
> \co{read_count()} in
> -Figure~\ref{fig:count:Signal-Theft Limit Counter Read Function}.
> +Listing~\ref{lst:count:Signal-Theft Limit Counter Read Function}.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 void count_init(void)
> @@ -3092,11 +3092,11 @@ Figure~\ref{fig:count:Signal-Theft Limit Counter Read Function}.
> \centering
> \theverbbox
> \caption{Signal-Theft Limit Counter Initialization Functions}
> -\label{fig:count:Signal-Theft Limit Counter Initialization Functions}
> -\end{figure}
> +\label{lst:count:Signal-Theft Limit Counter Initialization Functions}
> +\end{listing}
>
> Lines~1-12 of
> -Figure~\ref{fig:count:Signal-Theft Limit Counter Initialization Functions}
> +Listing~\ref{lst:count:Signal-Theft Limit Counter Initialization Functions}
> show \co{count_init()}, which set up \co{flush_local_count_sig()}
> as the signal handler for \co{SIGUSR1},
> enabling the \co{pthread_kill()} calls in \co{flush_local_count()}
> @@ -3414,7 +3414,7 @@ courtesy of eventual consistency.
> \label{tab:count:Limit Counter Performance on Power-6}
> \end{table*}
>
> -Figure~\ref{tab:count:Limit Counter Performance on Power-6}
> +Table~\ref{tab:count:Limit Counter Performance on Power-6}
> shows the performance of the parallel limit-counting algorithms.
> Exact enforcement of the limits incurs a substantial performance
> penalty, although on this 4.7\,GHz \Power{6} system that penalty can be reduced
> diff --git a/defer/rcuexercises.tex b/defer/rcuexercises.tex
> index 1d9e138..21d9683 100644
> --- a/defer/rcuexercises.tex
> +++ b/defer/rcuexercises.tex
> @@ -14,7 +14,7 @@ suffice for most of these exercises.
>
> \QuickQuiz{}
> The statistical-counter implementation shown in
> - Figure~\ref{fig:count:Per-Thread Statistical Counters}
> + Listing~\ref{lst:count:Per-Thread Statistical Counters}
> (\path{count_end.c})
> used a global lock to guard the summation in \co{read_count()},
> which resulted in poor performance and negative scalability.
> diff --git a/howto/howto.tex b/howto/howto.tex
> index 90abb14..1a0b75c 100644
> --- a/howto/howto.tex
> +++ b/howto/howto.tex
> @@ -361,7 +361,7 @@ Other types of systems have well-known ways of locating files by filename.
> \section{Whose Book Is This?}
> \label{sec:howto:Whose Book Is This?}
>
> -\begin{figure*}[tbp]
> +\begin{listing*}[tbp]
> {
> \scriptsize
> \begin{verbbox}
> @@ -376,10 +376,10 @@ Other types of systems have well-known ways of locating files by filename.
> }
> \hspace*{1in}\OneColumnHSpace{-0.5in}\theverbbox
> \caption{Creating an Up-To-Date PDF}
> -\label{fig:howto:Creating a Up-To-Date PDF}
> -\end{figure*}
> +\label{lst:howto:Creating a Up-To-Date PDF}
> +\end{listing*}
>
> -\begin{figure*}[tbp]
> +\begin{listing*}[tbp]
> {
> \scriptsize
> \begin{verbbox}
> @@ -393,8 +393,8 @@ Other types of systems have well-known ways of locating files by filename.
> }
> \hspace*{1in}\OneColumnHSpace{-0.5in}\theverbbox
> \caption{Generating an Updated PDF}
> -\label{fig:howto:Generating an Updated PDF}
> -\end{figure*}
> +\label{lst:howto:Generating an Updated PDF}
> +\end{listing*}
>
> As the cover says, the editor is one Paul E.~McKenney.
> However, the editor does accept contributions via the
> @@ -416,18 +416,18 @@ in the file \path{FAQ-BUILD.txt} in the \LaTeX{} source to the book.
>
> To create and display a current \LaTeX{} source tree of this book,
> use the list of Linux commands shown in
> -Figure~\ref{fig:howto:Creating a Up-To-Date PDF}.
> +Listing~\ref{lst:howto:Creating a Up-To-Date PDF}.
> In some environments, the \co{evince} command that displays \path{perfbook.pdf}
> may need to be replaced, for example, with \co{acroread}.
> The \co{git clone} command need only be used the first time you
> create a PDF, subsequently, you can run the commands shown in
> -Figure~\ref{fig:howto:Generating an Updated PDF} to pull in any updates
> +Listing~\ref{lst:howto:Generating an Updated PDF} to pull in any updates
> and generate an updated PDF.
> The commands in
> -Figure~\ref{fig:howto:Generating an Updated PDF}
> +Listing~\ref{lst:howto:Generating an Updated PDF}
> must be run within the \path{perfbook} directory created by the commands
> shown in
> -Figure~\ref{fig:howto:Creating a Up-To-Date PDF}.
> +Listing~\ref{lst:howto:Creating a Up-To-Date PDF}.
>
> PDFs of this book are sporadically posted at
> \url{http://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html}
> diff --git a/owned/owned.tex b/owned/owned.tex
> index f74b5d6..d584e82 100644
> --- a/owned/owned.tex
> +++ b/owned/owned.tex
> @@ -132,8 +132,8 @@ is obviously quite attractive.
> } \QuickQuizEnd
>
> This same pattern can be written in C as well as in \co{sh}, as illustrated by
> -Figures~\ref{fig:toolsoftrade:Using the fork() Primitive}
> -and~\ref{fig:toolsoftrade:Using the wait() Primitive}.
> +Listings~\ref{lst:toolsoftrade:Using the fork() Primitive}
> +and~\ref{lst:toolsoftrade:Using the wait() Primitive}.
>
> The next section discusses use of data ownership in shared-memory
> parallel programs.
> @@ -150,8 +150,8 @@ of ownership and access rights.
>
> For example, consider the per-thread statistical counter implementation
> shown in
> -Figure~\ref{fig:count:Per-Thread Statistical Counters} on
> -page~\pageref{fig:count:Per-Thread Statistical Counters}.
> +Listing~\ref{lst:count:Per-Thread Statistical Counters} on
> +page~\pageref{lst:count:Per-Thread Statistical Counters}.
> Here, \co{inc_count()} updates only the corresponding thread's
> instance of \co{counter},
> while \co{read_count()} accesses, but does not modify, all
> @@ -195,9 +195,9 @@ beginning on
> page~\pageref{sec:count:Signal-Theft Limit Counter Design},
> in particular the \co{flush_local_count_sig()} and
> \co{flush_local_count()} functions in
> -Figure~\ref{fig:count:Signal-Theft Limit Counter Value-Migration Functions}
> +Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions}
> on
> -page~\pageref{fig:count:Signal-Theft Limit Counter Value-Migration Functions}.
> +page~\pageref{lst:count:Signal-Theft Limit Counter Value-Migration Functions}.
>
> The \co{flush_local_count_sig()} function is a signal handler that
> acts as the shipped function.
> @@ -207,12 +207,12 @@ the shipped function executes.
> This shipped function has the not-unusual added complication of
> needing to interact with any concurrently executing \co{add_count()}
> or \co{sub_count()} functions (see
> -Figure~\ref{fig:count:Signal-Theft Limit Counter Add Function}
> +Listing~\ref{lst:count:Signal-Theft Limit Counter Add Function}
> on
> -page~\pageref{fig:count:Signal-Theft Limit Counter Add Function} and
> -Figure~\ref{fig:count:Signal-Theft Limit Counter Subtract Function}
> +page~\pageref{lst:count:Signal-Theft Limit Counter Add Function} and
> +Listing~\ref{lst:count:Signal-Theft Limit Counter Subtract Function}
> on
> -page~\pageref{fig:count:Signal-Theft Limit Counter Subtract Function}).
> +page~\pageref{lst:count:Signal-Theft Limit Counter Subtract Function}).
>
> \QuickQuiz{}
> What mechanisms other than POSIX signals may be used for function
> @@ -246,7 +246,7 @@ The eventually consistent counter implementation described in
> Section~\ref{sec:count:Eventually Consistent Implementation} provides an example.
> This implementation has a designated thread that runs the
> \co{eventual()} function shown on lines~15-32 of
> -Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}.
> +Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}.
> This \co{eventual()} thread periodically pulls the per-thread counts
> into the global counter, so that accesses to the global counter will,
> as the name says, eventually converge on the actual value.
> @@ -254,7 +254,7 @@ as the name says, eventually converge on the actual value.
> \QuickQuiz{}
> But none of the data in the \co{eventual()} function shown on
> lines~15-32 of
> - Figure~\ref{fig:count:Array-Based Per-Thread Eventually Consistent Counters}
> + Listing~\ref{lst:count:Array-Based Per-Thread Eventually Consistent Counters}
> is actually owned by the \co{eventual()} thread!
> In just what way is this data ownership???
> \QuickQuizAnswer{
> @@ -298,8 +298,8 @@ a considerable reduction in the spread of certain types of disease.
>
> In other cases, privatization imposes costs.
> For example, consider the simple limit counter shown in
> -Figure~\ref{fig:count:Simple Limit Counter Add, Subtract, and Read} on
> -page~\pageref{fig:count:Simple Limit Counter Add, Subtract, and Read}.
> +Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read} on
> +page~\pageref{lst:count:Simple Limit Counter Add, Subtract, and Read}.
> This is an example of an algorithm where threads can read each others'
> data, but are only permitted to update their own data.
> A quick review of the algorithm shows that the only cross-thread
> diff --git a/together/applyrcu.tex b/together/applyrcu.tex
> index e3d6f0d..d477ab5 100644
> --- a/together/applyrcu.tex
> +++ b/together/applyrcu.tex
> @@ -21,8 +21,8 @@ Unfortunately, threads needing to read out the value via \co{read_count()}
> were required to acquire a global
> lock, and thus incurred high overhead and suffered poor scalability.
> The code for the lock-based implementation is shown in
> -Figure~\ref{fig:count:Per-Thread Statistical Counters} on
> -Page~\pageref{fig:count:Per-Thread Statistical Counters}.
> +Listing~\ref{lst:count:Per-Thread Statistical Counters} on
> +Page~\pageref{lst:count:Per-Thread Statistical Counters}.
>
> \QuickQuiz{}
> Why on earth did we need that global lock in the first place?
> @@ -56,8 +56,8 @@ execution.
> Just what is the accuracy of \co{read_count()}, anyway?
> \QuickQuizAnswer{
> Refer to
> - Figure~\ref{fig:count:Per-Thread Statistical Counters} on
> - Page~\pageref{fig:count:Per-Thread Statistical Counters}.
> + Listing~\ref{lst:count:Per-Thread Statistical Counters} on
> + Page~\pageref{lst:count:Per-Thread Statistical Counters}.
> Clearly, if there are no concurrent invocations of \co{inc_count()},
> \co{read_count()} will return an exact result.
> However, if there \emph{are} concurrent invocations of
> @@ -205,7 +205,7 @@ the current \co{countarray} structure, and
> the \co{final_mutex} spinlock.
>
> Lines~10-13 show \co{inc_count()}, which is unchanged from
> -Figure~\ref{fig:count:Per-Thread Statistical Counters}.
> +Listing~\ref{lst:count:Per-Thread Statistical Counters}.
>
> Lines~15-29 show \co{read_count()}, which has changed significantly.
> Lines~21 and~27 substitute \co{rcu_read_lock()} and
> @@ -280,13 +280,13 @@ Line~69 can then safely free the old \co{countarray} structure.
> Wow!
> Figure~\ref{fig:together:RCU and Per-Thread Statistical Counters}
> contains 69 lines of code, compared to only 42 in
> - Figure~\ref{fig:count:Per-Thread Statistical Counters}.
> + Listing~\ref{lst:count:Per-Thread Statistical Counters}.
> Is this extra complexity really worth it?
> \QuickQuizAnswer{
> This of course needs to be decided on a case-by-case basis.
> If you need an implementation of \co{read_count()} that
> scales linearly, then the lock-based implementation shown in
> - Figure~\ref{fig:count:Per-Thread Statistical Counters}
> + Listing~\ref{lst:count:Per-Thread Statistical Counters}
> simply will not work for you.
> On the other hand, if calls to \co{count_read()} are sufficiently
> rare, then the lock-based version is simpler and might thus be
> @@ -300,7 +300,7 @@ Line~69 can then safely free the old \co{countarray} structure.
> and the use of RCU unnecessary.
> This would in turn enable an implementation that
> was even simpler than the one shown in
> - Figure~\ref{fig:count:Per-Thread Statistical Counters}, but
> + Listing~\ref{lst:count:Per-Thread Statistical Counters}, but
> with all the scalability and performance benefits of the
> implementation shown in
> Figure~\ref{fig:together:RCU and Per-Thread Statistical Counters}!
> diff --git a/toolsoftrade/toolsoftrade.tex b/toolsoftrade/toolsoftrade.tex
> index bd43879..b18d3d0 100644
> --- a/toolsoftrade/toolsoftrade.tex
> +++ b/toolsoftrade/toolsoftrade.tex
> @@ -189,7 +189,7 @@ These concerns can of course add substantial complexity to the code.
> For more information, see any of a number of textbooks on the
> subject~\cite{WRichardStevens1992,StewartWeiss2013UNIX}.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 pid = fork();
> @@ -207,14 +207,14 @@ subject~\cite{WRichardStevens1992,StewartWeiss2013UNIX}.
> \centering
> \theverbbox
> \caption{Using the \tco{fork()} Primitive}
> -\label{fig:toolsoftrade:Using the fork() Primitive}
> -\end{figure}
> +\label{lst:toolsoftrade:Using the fork() Primitive}
> +\end{listing}
>
> If \co{fork()} succeeds, it returns twice, once for the parent
> and again for the child.
> The value returned from \co{fork()} allows the caller to tell
> the difference, as shown in
> -Figure~\ref{fig:toolsoftrade:Using the fork() Primitive}
> +Listing~\ref{lst:toolsoftrade:Using the fork() Primitive}
> (\path{forkjoin.c}).
> Line~1 executes the \co{fork()} primitive, and saves its return value
> in local variable \co{pid}.
> @@ -228,7 +228,7 @@ Otherwise, the \co{fork()} has executed successfully, and the parent
> therefore executes line~9 with the variable \co{pid} containing the
> process ID of the child.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 void waitall(void)
> @@ -251,8 +251,8 @@ process ID of the child.
> \centering
> \theverbbox
> \caption{Using the \tco{wait()} Primitive}
> -\label{fig:toolsoftrade:Using the wait() Primitive}
> -\end{figure}
> +\label{lst:toolsoftrade:Using the wait() Primitive}
> +\end{listing}
>
> The parent process may use the \co{wait()} primitive to wait for its children
> to complete.
> @@ -261,7 +261,7 @@ counterpart, as each invocation of \co{wait()} waits for but one child
> process.
> It is therefore customary to wrap \co{wait()} into a function similar
> to the \co{waitall()} function shown in
> -Figure~\ref{fig:toolsoftrade:Using the wait() Primitive}
> +Listing~\ref{lst:toolsoftrade:Using the wait() Primitive}
> (\path{api-pthread.h}),
> with this \co{waitall()} function having semantics similar to the
> shell-script \co{wait} command.
> @@ -283,14 +283,14 @@ Otherwise, lines~11 and 12 print an error and exit.
> child individually.
> In addition, some parallel applications need to detect the
> reason that the child died.
> - As we saw in Figure~\ref{fig:toolsoftrade:Using the wait() Primitive},
> + As we saw in Listing~\ref{lst:toolsoftrade:Using the wait() Primitive},
> it is not hard to build a \co{waitall()} function out of
> the \co{wait()} function, but it would be impossible to
> do the reverse.
> Once the information about a specific child is lost, it is lost.
> } \QuickQuizEnd
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 int x = 0;
> @@ -313,13 +313,13 @@ Otherwise, lines~11 and 12 print an error and exit.
> \centering
> \theverbbox
> \caption{Processes Created Via \tco{fork()} Do Not Share Memory}
> -\label{fig:toolsoftrade:Processes Created Via fork() Do Not Share Memory}
> -\end{figure}
> +\label{lst:toolsoftrade:Processes Created Via fork() Do Not Share Memory}
> +\end{listing}
>
> It is critically important to note that the parent and child do \emph{not}
> share memory.
> This is illustrated by the program shown in
> -Figure~\ref{fig:toolsoftrade:Processes Created Via fork() Do Not Share Memory}
> +Listing~\ref{lst:toolsoftrade:Processes Created Via fork() Do Not Share Memory}
> (\path{forkjoinvar.c}),
> in which the child sets a global variable \co{x} to 1 on line~6,
> prints a message on line~7, and exits on line~8.
> @@ -353,7 +353,7 @@ Parent process sees x=0
> source code of the Linux-kernel implementations themselves.
>
> It is important to note that the parent process in
> - Figure~\ref{fig:toolsoftrade:Processes Created Via fork() Do Not Share Memory}
> + Listing~\ref{lst:toolsoftrade:Processes Created Via fork() Do Not Share Memory}
> waits until after the child terminates to do its \co{printf()}.
> Using \co{printf()}'s buffered I/O concurrently to the same file
> from multiple processes is non-trivial, and is best avoided.
> @@ -376,7 +376,7 @@ than fork-join parallelism.
> To create a thread within an existing process, invoke the
> \co{pthread_create()} primitive, for example, as shown on lines~15
> and~16 of
> -Figure~\ref{fig:toolsoftrade:Threads Created Via pthread-create() Share Memory}
> +Listing~\ref{lst:toolsoftrade:Threads Created Via pthread-create() Share Memory}
> (\path{pcreate.c}).
> The first argument is a pointer to a \co{pthread_t} in which to store the
> ID of the thread to be created, the second \co{NULL} argument is a pointer
> @@ -385,7 +385,7 @@ to an optional \co{pthread_attr_t}, the third argument is the function
> that is to be invoked by the new thread, and the last \co{NULL} argument
> is the argument that will be passed to \co{mythread}.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 int x = 0;
> @@ -419,15 +419,15 @@ is the argument that will be passed to \co{mythread}.
> \centering
> \theverbbox
> \caption{Threads Created Via \tco{pthread_create()} Share Memory}
> -\label{fig:toolsoftrade:Threads Created Via pthread-create() Share Memory}
> -\end{figure}
> +\label{lst:toolsoftrade:Threads Created Via pthread-create() Share Memory}
> +\end{listing}
>
> In this example, \co{mythread()} simply returns, but it could instead
> call \co{pthread_exit()}.
>
> \QuickQuiz{}
> If the \co{mythread()} function in
> - Figure~\ref{fig:toolsoftrade:Threads Created Via pthread-create() Share Memory}
> + Listing~\ref{lst:toolsoftrade:Threads Created Via pthread-create() Share Memory}
> can simply return, why bother with \co{pthread_exit()}?
> \QuickQuizAnswer{
> In this simple example, there is no reason whatsoever.
> @@ -450,7 +450,7 @@ or the value returned by the thread's top-level function, depending on
> how the thread in question exits.
>
> The program shown in
> -Figure~\ref{fig:toolsoftrade:Threads Created Via pthread-create() Share Memory}
> +Listing~\ref{lst:toolsoftrade:Threads Created Via pthread-create() Share Memory}
> produces output as follows, demonstrating that memory is in fact
> shared between the two threads:
>
> @@ -541,7 +541,7 @@ lock~\cite{Hoare74}.
> to the reader.
> } \QuickQuizEnd
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
> @@ -598,11 +598,11 @@ lock~\cite{Hoare74}.
> \centering
> \theverbbox
> \caption{Demonstration of Exclusive Locks}
> -\label{fig:toolsoftrade:Demonstration of Exclusive Locks}
> -\end{figure}
> +\label{lst:toolsoftrade:Demonstration of Exclusive Locks}
> +\end{listing}
>
> This exclusive-locking property is demonstrated using the code shown in
> -Figure~\ref{fig:toolsoftrade:Demonstration of Exclusive Locks}
> +Listing~\ref{lst:toolsoftrade:Demonstration of Exclusive Locks}
> (\path{lock.c}).
> Line~1 defines and initializes a POSIX lock named \co{lock_a}, while
> line~2 similarly defines and initializes a lock named \co{lock_b}.
> @@ -618,7 +618,7 @@ primitives.
> \QuickQuiz{}
> Why not simply make the argument to \co{lock_reader()}
> on line~5 of
> - Figure~\ref{fig:toolsoftrade:Demonstration of Exclusive Locks}
> + Listing~\ref{lst:toolsoftrade:Demonstration of Exclusive Locks}
> be a pointer to a \co{pthread_mutex_t}?
> \QuickQuizAnswer{
> Because we will need to pass \co{lock_reader()} to
> @@ -653,7 +653,7 @@ required by \co{pthread_create()}.
> } \QuickQuizEnd
>
> Lines~31-49 of
> -Figure~\ref{fig:toolsoftrade:Demonstration of Exclusive Locks}
> +Listing~\ref{lst:toolsoftrade:Demonstration of Exclusive Locks}
> shows \co{lock_writer()}, which
> periodically update the shared variable \co{x} while holding the
> specified \co{pthread_mutex_t}.
> @@ -664,7 +664,7 @@ While holding the lock, lines~40-43 increment the shared variable \co{x},
> sleeping for five milliseconds between each increment.
> Finally, lines~44-47 release the lock.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 printf("Creating two threads using same lock:\n");
> @@ -691,10 +691,10 @@ Finally, lines~44-47 release the lock.
> \centering
> \theverbbox
> \caption{Demonstration of Same Exclusive Lock}
> -\label{fig:toolsoftrade:Demonstration of Same Exclusive Lock}
> -\end{figure}
> +\label{lst:toolsoftrade:Demonstration of Same Exclusive Lock}
> +\end{listing}
>
> -Figure~\ref{fig:toolsoftrade:Demonstration of Same Exclusive Lock}
> +Listing~\ref{lst:toolsoftrade:Demonstration of Same Exclusive Lock}
> shows a code fragment that runs \co{lock_reader()} and
> \co{lock_writer()} as threads using the same lock, namely, \co{lock_a}.
> Lines~2-6 create a thread running \co{lock_reader()}, and then
> @@ -719,7 +719,7 @@ by \co{lock_writer()} while holding the lock.
> \QuickQuiz{}
> Is ``x = 0'' the only possible output from the code fragment
> shown in
> - Figure~\ref{fig:toolsoftrade:Demonstration of Same Exclusive Lock}?
> + Listing~\ref{lst:toolsoftrade:Demonstration of Same Exclusive Lock}?
> If so, why?
> If not, what other output could appear, and why?
> \QuickQuizAnswer{
> @@ -735,7 +735,7 @@ by \co{lock_writer()} while holding the lock.
> However, there are no guarantees, especially on a busy system.
> } \QuickQuizEnd
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 printf("Creating two threads w/different locks:\n");
> @@ -763,10 +763,10 @@ by \co{lock_writer()} while holding the lock.
> \centering
> \theverbbox
> \caption{Demonstration of Different Exclusive Locks}
> -\label{fig:toolsoftrade:Demonstration of Different Exclusive Locks}
> -\end{figure}
> +\label{lst:toolsoftrade:Demonstration of Different Exclusive Locks}
> +\end{listing}
>
> -Figure~\ref{fig:toolsoftrade:Demonstration of Different Exclusive Locks}
> +Listing~\ref{lst:toolsoftrade:Demonstration of Different Exclusive Locks}
> shows a similar code fragment, but this time using different locks:
> \co{lock_a} for \co{lock_reader()} and \co{lock_b} for
> \co{lock_writer()}.
> @@ -811,7 +811,7 @@ values of \co{x} stored by \co{lock_writer()}.
>
> \QuickQuiz{}
> In the code shown in
> - Figure~\ref{fig:toolsoftrade:Demonstration of Different Exclusive Locks},
> + Listing~\ref{lst:toolsoftrade:Demonstration of Different Exclusive Locks},
> is \co{lock_reader()} guaranteed to see all the values produced
> by \co{lock_writer()}?
> Why or why not?
> @@ -825,19 +825,19 @@ values of \co{x} stored by \co{lock_writer()}.
>
> \QuickQuiz{}
> Wait a minute here!!!
> - Figure~\ref{fig:toolsoftrade:Demonstration of Same Exclusive Lock}
> + Listing~\ref{lst:toolsoftrade:Demonstration of Same Exclusive Lock}
> didn't initialize shared variable \co{x},
> so why does it need to be initialized in
> - Figure~\ref{fig:toolsoftrade:Demonstration of Different Exclusive Locks}?
> + Listing~\ref{lst:toolsoftrade:Demonstration of Different Exclusive Locks}?
> \QuickQuizAnswer{
> See line~3 of
> - Figure~\ref{fig:toolsoftrade:Demonstration of Exclusive Locks}.
> + Listing~\ref{lst:toolsoftrade:Demonstration of Exclusive Locks}.
> Because the code in
> - Figure~\ref{fig:toolsoftrade:Demonstration of Same Exclusive Lock}
> + Listing~\ref{lst:toolsoftrade:Demonstration of Same Exclusive Lock}
> ran first, it could rely on the compile-time initialization of
> \co{x}.
> The code in
> - Figure~\ref{fig:toolsoftrade:Demonstration of Different Exclusive Locks}
> + Listing~\ref{lst:toolsoftrade:Demonstration of Different Exclusive Locks}
> ran next, so it had to re-initialize \co{x}.
> } \QuickQuizEnd
>
> @@ -873,7 +873,7 @@ an arbitrarily large number of readers to concurrently hold the lock.
> However, in practice, we need to know how much additional scalability is
> provided by reader-writer locks.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 pthread_rwlock_t rwl = PTHREAD_RWLOCK_INITIALIZER;
> @@ -922,10 +922,10 @@ provided by reader-writer locks.
> \centering
> \theverbbox
> \caption{Measuring Reader-Writer Lock Scalability}
> -\label{fig:toolsoftrade:Measuring Reader-Writer Lock Scalability}
> -\end{figure}
> +\label{lst:toolsoftrade:Measuring Reader-Writer Lock Scalability}
> +\end{listing}
>
> -Figure~\ref{fig:toolsoftrade:Measuring Reader-Writer Lock Scalability}
> +Listing~\ref{lst:toolsoftrade:Measuring Reader-Writer Lock Scalability}
> (\path{rwlockscale.c})
> shows one way of measuring reader-writer lock scalability.
> Line~1 shows the definition and initialization of the reader-writer
> @@ -955,7 +955,7 @@ rights to assume that the value of \co{goflag} would never change.
> \QuickQuiz{}
> Instead of using \co{READ_ONCE()} everywhere, why not just
> declare \co{goflag} as \co{volatile} on line~10 of
> - Figure~\ref{fig:toolsoftrade:Measuring Reader-Writer Lock Scalability}?
> + Listing~\ref{lst:toolsoftrade:Measuring Reader-Writer Lock Scalability}?
> \QuickQuizAnswer{
> A \co{volatile} declaration is in fact a reasonable alternative in
> this particular case.
> @@ -977,7 +977,7 @@ rights to assume that the value of \co{goflag} would never change.
> Don't we also need memory barriers to make sure
> that the change in \co{goflag}'s value propagates to the
> CPU in a timely fashion in
> - Figure~\ref{fig:toolsoftrade:Measuring Reader-Writer Lock Scalability}?
> + Listing~\ref{lst:toolsoftrade:Measuring Reader-Writer Lock Scalability}?
> \QuickQuizAnswer{
> No, memory barriers are not needed and won't help here.
> Memory barriers only enforce ordering among multiple memory
> @@ -1167,7 +1167,7 @@ to protect the tiniest of critical sections.
> One such way are atomic operations.
> We have seen one atomic operations already, in the form of the
> \co{__sync_fetch_and_add()} primitive on line~18 of
> -Figure~\ref{fig:toolsoftrade:Measuring Reader-Writer Lock Scalability}.
> +Listing~\ref{lst:toolsoftrade:Measuring Reader-Writer Lock Scalability}.
> This primitive atomically adds the value of its second argument to
> the value referenced by its first argument, returning the old value
> (which was ignored in this case).
> @@ -1243,11 +1243,11 @@ In some cases, it is sufficient to constrain the compiler's ability
> to reorder operations, while allowing the CPU free rein, in which
> case the \co{barrier()} primitive may be used, as it in fact was
> on line~28 of
> -Figure~\ref{fig:toolsoftrade:Measuring Reader-Writer Lock Scalability}.
> +Listing~\ref{lst:toolsoftrade:Measuring Reader-Writer Lock Scalability}.
> In some cases, it is only necessary to ensure that the compiler
> avoids optimizing away a given memory read, in which case the
> \co{READ_ONCE()} primitive may be used, as it was on line~17 of
> -Figure~\ref{fig:toolsoftrade:Demonstration of Exclusive Locks}.
> +Listing~\ref{lst:toolsoftrade:Demonstration of Exclusive Locks}.
> Similarly, the \co{WRITE_ONCE()} primitive may be used to prevent the
> compiler from optimizing away a given memory write.
> These last three primitives are not provided directly by \GCC,
> @@ -1416,10 +1416,10 @@ Threads share everything except for per-thread local state,\footnote{
> which includes program counter and stack.
>
> The thread API is shown in
> -Figure~\ref{fig:toolsoftrade:Thread API}, and members are described in the
> +Listing~\ref{lst:toolsoftrade:Thread API}, and members are described in the
> following sections.
>
> -\begin{figure*}[tbp]
> +\begin{listing*}[tbp]
> { \scriptsize
> \begin{verbbox}
> int smp_thread_id(void)
> @@ -1433,8 +1433,8 @@ void wait_all_threads(void)
> \centering
> \theverbbox
> \caption{Thread API}
> -\label{fig:toolsoftrade:Thread API}
> -\end{figure*}
> +\label{lst:toolsoftrade:Thread API}
> +\end{listing*}
>
> \subsubsection{\tco{create_thread()}}
>
> @@ -1499,7 +1499,7 @@ a run, so such synchronization is normally not needed.
>
> \subsubsection{Example Usage}
>
> -Figure~\ref{fig:toolsoftrade:Example Child Thread}
> +Listing~\ref{lst:toolsoftrade:Example Child Thread}
> shows an example hello-world-like child thread.
> As noted earlier, each thread is allocated its own stack, so
> each thread has its own private \co{arg} argument and \co{myarg} variable.
> @@ -1509,7 +1509,7 @@ Note that the \co{return} statement on line~7 terminates the thread,
> returning a \co{NULL} to whoever invokes \co{wait_thread()} on this
> thread.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 void *thread_test(void *arg)
> @@ -1525,11 +1525,11 @@ thread.
> \centering
> \theverbbox
> \caption{Example Child Thread}
> -\label{fig:toolsoftrade:Example Child Thread}
> -\end{figure}
> +\label{lst:toolsoftrade:Example Child Thread}
> +\end{listing}
>
> The parent program is shown in
> -Figure~\ref{fig:toolsoftrade:Example Parent Thread}.
> +Listing~\ref{lst:toolsoftrade:Example Parent Thread}.
> It invokes \co{smp_init()} to initialize the threading system on
> line~6,
> parses arguments on lines~7-14, and announces its presence on line~15.
> @@ -1538,7 +1538,7 @@ and waits for them to complete on line~18.
> Note that \co{wait_all_threads()} discards the threads return values,
> as in this case they are all \co{NULL}, which is not very interesting.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> 1 int main(int argc, char *argv[])
> @@ -1567,8 +1567,8 @@ as in this case they are all \co{NULL}, which is not very interesting.
> \centering
> \theverbbox
> \caption{Example Parent Thread}
> -\label{fig:toolsoftrade:Example Parent Thread}
> -\end{figure}
> +\label{lst:toolsoftrade:Example Parent Thread}
> +\end{listing}
>
> \QuickQuiz{}
> What happened to the Linux-kernel equivalents to \co{fork()}
> @@ -1584,11 +1584,11 @@ as in this case they are all \co{NULL}, which is not very interesting.
> \label{sec:toolsoftrade:Locking}
>
> A good starting subset of the Linux kernel's locking API is shown in
> -Figure~\ref{fig:toolsoftrade:Locking API},
> +Listing~\ref{lst:toolsoftrade:Locking API},
> each API element being described in the following sections.
> This book's CodeSamples locking API closely follows that of the Linux kernel.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
> { \scriptsize
> \begin{verbbox}
> void spin_lock_init(spinlock_t *sp);
> @@ -1600,8 +1600,8 @@ void spin_unlock(spinlock_t *sp);
> \centering
> \theverbbox
> \caption{Locking API}
> -\label{fig:toolsoftrade:Locking API}
> -\end{figure}
> +\label{lst:toolsoftrade:Locking API}
> +\end{listing}
>
> \subsubsection{\tco{spin_lock_init()}}
>
> @@ -1721,7 +1721,7 @@ given per-CPU variable, \co{per_cpu()} to access a specified CPU's
> instance of a given per-CPU variable, along with many other special-purpose
> per-CPU operations.
>
> -Figure~\ref{fig:toolsoftrade:Per-Thread-Variable API}
> +Listing~\ref{lst:toolsoftrade:Per-Thread-Variable API}
> shows this book's per-thread-variable API, which is patterned
> after the Linux kernel's per-CPU-variable API.
> This API provides the per-thread equivalent of global variables.
> @@ -1729,7 +1729,7 @@ Although this API is, strictly speaking, not necessary\footnote{
> You could instead use \co{__thread} or \co{_Thread_local}.},
> it can provide a good userspace analogy to Linux kernel code.
>
> -\begin{figure}[htbp]
> +\begin{listing}[htbp]
> { \scriptsize
> \begin{verbbox}
> DEFINE_PER_THREAD(type, name)
> @@ -1742,8 +1742,8 @@ init_per_thread(name, v)
> \centering
> \theverbbox
> \caption{Per-Thread-Variable API}
> -\label{fig:toolsoftrade:Per-Thread-Variable API}
> -\end{figure}
> +\label{lst:toolsoftrade:Per-Thread-Variable API}
> +\end{listing}
>
> \QuickQuiz{}
> How could you work around the lack of a per-thread-variable
> --
> 2.7.4
>
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2017-10-10 23:38 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-10 22:40 [PATCH] Convert code snippets to 'listing' env (howto, toolsoftrade, count) Akira Yokosawa
2017-10-10 23:38 ` Paul E. McKenney
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.