* [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign)
@ 2018-11-04 0:07 Akira Yokosawa
2018-11-04 0:08 ` [PATCH 1/7] howto, cpu: Employ new scheme for command/code snippets Akira Yokosawa
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: Akira Yokosawa @ 2018-11-04 0:07 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
Hi Paul,
This is another set of patches converting code snippets to new scheme.
Notes on the changes other than simple conversion follow.
Patch #2 contains somewhat ugly hack in lockhdeq.c to suppress
"____cacheline_internodealigned_in_smp" in the resulting snippet.
I'm afraid it could hurt your eyes.
Patch #5 renames "percpu*" to "perthread*" to respect the names
used in actual code.
Patch #6 includes substitution of ACCESS_ONCE -> READ_ONCE.
Patch #7 does the substitution under CodeSamples. It also adds
a couple of function prototypes in maze.h.
Thanks, Akira
--
Akira Yokosawa (7):
howto, cpu: Employ new scheme for command/code snippets
SMPdesign: Employ new scheme for snippet of lockhdeq.c
SMPdesign: Employ new scheme for snippet of lockhdeq.c and locktdeq.c
SMPdesign: Employ new scheme for inline snippets
SMPdesign: Employ new scheme for snippets from smpalloc.c
SMPdesign/beyond: Employ new scheme for inline pseudocode snippets
CodeSamples/SMPdesign/maze: Substitute {READ/WRITE}_ONCE() for
ACCESS_ONCE()
CodeSamples/SMPdesign/lockhdeq.c | 53 ++--
CodeSamples/SMPdesign/locktdeq.c | 70 ++---
CodeSamples/SMPdesign/maze/maze.h | 6 +
CodeSamples/SMPdesign/maze/maze_fg.c | 18 +-
CodeSamples/SMPdesign/maze/maze_part.c | 18 +-
CodeSamples/SMPdesign/smpalloc.c | 57 ++--
SMPdesign/SMPdesign.tex | 479 ++++++++++++++-------------------
SMPdesign/beyond.tex | 319 +++++++++++-----------
SMPdesign/partexercises.tex | 263 ++++++------------
cpu/overview.tex | 15 +-
howto/howto.tex | 60 ++---
11 files changed, 605 insertions(+), 753 deletions(-)
--
2.7.4
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/7] howto, cpu: Employ new scheme for command/code snippets
2018-11-04 0:07 [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign) Akira Yokosawa
@ 2018-11-04 0:08 ` Akira Yokosawa
2018-11-04 0:09 ` [PATCH 2/7] SMPdesign: Employ new scheme for snippet of lockhdeq.c Akira Yokosawa
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Akira Yokosawa @ 2018-11-04 0:08 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From 687613a6796f9c2b2539757e9041b6f89e94f426 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 28 Oct 2018 17:41:47 +0900
Subject: [PATCH 1/7] howto, cpu: Employ new scheme for command/code snippets
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
cpu/overview.tex | 15 +++++---------
howto/howto.tex | 60 ++++++++++++++++++++++----------------------------------
2 files changed, 28 insertions(+), 47 deletions(-)
diff --git a/cpu/overview.tex b/cpu/overview.tex
index 7c02c13..071cf7c 100644
--- a/cpu/overview.tex
+++ b/cpu/overview.tex
@@ -203,16 +203,11 @@ Appendix~\ref{chp:app:whymb:Why Memory Barriers?}.
In the meantime, consider the following simple lock-based critical
section:
-\vspace{5pt}
-\begin{minipage}[t]{\columnwidth}
-\small
-\begin{verbatim}
- 1 spin_lock(&mylock);
- 2 a = a + 1;
- 3 spin_unlock(&mylock);
-\end{verbatim}
-\end{minipage}
-\vspace{5pt}
+\begin{VerbatimN}
+spin_lock(&mylock);
+a = a + 1;
+spin_unlock(&mylock);
+\end{VerbatimN}
\begin{figure}[tb]
\centering
diff --git a/howto/howto.tex b/howto/howto.tex
index e88bd90..2fbf8a3 100644
--- a/howto/howto.tex
+++ b/howto/howto.tex
@@ -346,13 +346,9 @@ this source code may be found in the \path{CodeSamples} directory
of this book's git tree.
For example, on UNIX systems, you should be able to type the following:
-\begin{quote}
- {\scriptsize
- \begin{verbatim}
- find CodeSamples -name rcu_rcpls.c -print
- \end{verbatim}
- }
-\end{quote}
+\begin{VerbatimU}
+find CodeSamples -name rcu_rcpls.c -print
+\end{VerbatimU}
This command will locate the file \path{rcu_rcpls.c}, which is called out in
Appendix~\ref{chp:app:``Toy'' RCU Implementations}.
@@ -362,36 +358,28 @@ Other types of systems have well-known ways of locating files by filename.
\label{sec:howto:Whose Book Is This?}
\begin{listing*}[tbp]
-{
-\scriptsize
-\begin{verbbox}
- 1 git clone git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git
- 2 cd perfbook
- 3 # You may need to install a font here. See item 1 in FAQ.txt.
- 4 make
- 5 evince perfbook.pdf & # Two-column version
- 6 make perfbook-1c.pdf
- 7 evince perfbook-1c.pdf & # One-column version for e-readers
-\end{verbbox}
-}
-\hspace*{1in}\OneColumnHSpace{-0.5in}\theverbbox
+\begin{VerbatimL}
+git clone git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git
+cd perfbook
+# You may need to install a font here. See item 1 in FAQ.txt.
+make
+evince perfbook.pdf & # Two-column version
+make perfbook-1c.pdf
+evince perfbook-1c.pdf & # One-column version for e-readers
+\end{VerbatimL}
\caption{Creating an Up-To-Date PDF}
\label{lst:howto:Creating a Up-To-Date PDF}
\end{listing*}
\begin{listing*}[tbp]
-{
-\scriptsize
-\begin{verbbox}
- 1 git remote update
- 2 git checkout origin/master
- 3 make
- 4 evince perfbook.pdf & # Two-column version
- 5 make perfbook-1c.pdf
- 6 evince perfbook-1c.pdf & # One-column version for e-readers
-\end{verbbox}
-}
-\hspace*{1in}\OneColumnHSpace{-0.5in}\theverbbox
+\begin{VerbatimL}
+git remote update
+git checkout origin/master
+make
+evince perfbook.pdf & # Two-column version
+make perfbook-1c.pdf
+evince perfbook-1c.pdf & # One-column version for e-readers
+\end{VerbatimL}
\caption{Generating an Updated PDF}
\label{lst:howto:Generating an Updated PDF}
\end{listing*}
@@ -441,11 +429,9 @@ One important requirement is that each patch (or commit, in the case
of a \co{git pull} request) must contain a valid \co{Signed-off-by:} line,
which has the following format:
-\begin{quote}
- { \scriptsize
- \co{Signed-off-by: My Name <myname@example.org>}
- }
-\end{quote}
+\begin{VerbatimU}
+Signed-off-by: My Name <myname@example.org>
+\end{VerbatimU}
Please see \url{http://lkml.org/lkml/2007/1/15/219} for an example
patch containing a \co{Signed-off-by:} line.
--
2.7.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/7] SMPdesign: Employ new scheme for snippet of lockhdeq.c
2018-11-04 0:07 [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign) Akira Yokosawa
2018-11-04 0:08 ` [PATCH 1/7] howto, cpu: Employ new scheme for command/code snippets Akira Yokosawa
@ 2018-11-04 0:09 ` Akira Yokosawa
2018-11-04 0:11 ` [PATCH 3/7] SMPdesign: Employ new scheme for snippet of lockhdeq.c and locktdeq.c Akira Yokosawa
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Akira Yokosawa @ 2018-11-04 0:09 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From c516d1ab6d3930d0431aa30a4cf4e3004d70a865 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 28 Oct 2018 18:11:44 +0900
Subject: [PATCH 2/7] SMPdesign: Employ new scheme for snippet of lockhdeq.c
To remove " ____cacheline_internodealigned_in_smp" in the resulting
snippet, provide alternative statement in a comment block as follows:
spinlock_t rlock ____cacheline_internodealigned_in_smp; //\fcvexclude
/* -- begin alternative code for snippet \fcvexclude
spinlock_t rlock; //\lnlbl{rlock}
-- end alternative code for snippet \fcvexclude */
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
CodeSamples/SMPdesign/lockhdeq.c | 19 ++++++++++++-------
SMPdesign/partexercises.tex | 24 +++++++-----------------
2 files changed, 19 insertions(+), 24 deletions(-)
diff --git a/CodeSamples/SMPdesign/lockhdeq.c b/CodeSamples/SMPdesign/lockhdeq.c
index d6cc099..8ba635b 100644
--- a/CodeSamples/SMPdesign/lockhdeq.c
+++ b/CodeSamples/SMPdesign/lockhdeq.c
@@ -129,15 +129,20 @@ void deq_push_r(struct cds_list_head *e, struct deq *p)
#define PDEQ_N_BKTS 4
+//\begin{snippet}[labelbase=ln:SMPdesign:lockhdeq:struct_pdeq,commandchars=\\\@\$]
struct pdeq {
- spinlock_t llock;
- int lidx;
- /* char pad1[CACHE_LINE_SIZE - sizeof(spinlock_t) - sizeof(int)]; */
- spinlock_t rlock ____cacheline_internodealigned_in_smp;
- int ridx;
- /* char pad2[CACHE_LINE_SIZE - sizeof(spinlock_t) - sizeof(int)]; */
- struct deq bkt[PDEQ_N_BKTS];
+ spinlock_t llock; //\lnlbl{llock}
+ int lidx; //\lnlbl{lidx}
+ /* char pad1[CACHE_LINE_SIZE - sizeof(spinlock_t) - sizeof(int)]; */ //\fcvexclude
+ spinlock_t rlock ____cacheline_internodealigned_in_smp; //\fcvexclude
+/* -- begin alternative code for snippet \fcvexclude
+ spinlock_t rlock; //\lnlbl{rlock}
+ -- end alternative code for snippet \fcvexclude */
+ int ridx; //\lnlbl{ridx}
+ /* char pad2[CACHE_LINE_SIZE - sizeof(spinlock_t) - sizeof(int)]; */ //\fcvexclude
+ struct deq bkt[PDEQ_N_BKTS]; //\lnlbl{bkt}
};
+//\end{snippet}
static int moveleft(int idx)
{
diff --git a/SMPdesign/partexercises.tex b/SMPdesign/partexercises.tex
index db4b5d8..5b5fd5a 100644
--- a/SMPdesign/partexercises.tex
+++ b/SMPdesign/partexercises.tex
@@ -342,19 +342,7 @@ Each underlying single-lock double-ended queue holds a one-quarter
slice of the full parallel double-ended queue.
\begin{listing}[tbp]
-{ \scriptsize
-\begin{verbbox}
- 1 struct pdeq {
- 2 spinlock_t llock;
- 3 int lidx;
- 4 spinlock_t rlock;
- 5 int ridx;
- 6 struct deq bkt[DEQ_N_BKTS];
- 7 };
-\end{verbbox}
-}
-\centering
-\theverbbox
+\input{CodeSamples/SMPdesign/lockhdeq@struct_pdeq.fcv}
\caption{Lock-Based Parallel Double-Ended Queue Data Structure}
\label{lst:SMPdesign:Lock-Based Parallel Double-Ended Queue Data Structure}
\end{listing}
@@ -363,13 +351,15 @@ Listing~\ref{lst:SMPdesign:Lock-Based Parallel Double-Ended Queue Data Structure
shows the corresponding C-language data structure, assuming an
existing \co{struct deq} that provides a trivially locked
double-ended-queue implementation.
-This data structure contains the left-hand lock on line~2,
-the left-hand index on line~3, the right-hand lock on line~4
+\begin{lineref}[ln:SMPdesign:lockhdeq:struct_pdeq]
+This data structure contains the left-hand lock on line~\lnref{llock},
+the left-hand index on line~\lnref{lidx}, the right-hand lock on line~\lnref{rlock}
(which is cache-aligned in the actual implementation),
-the right-hand index on line~5, and, finally, the hashed array
-of simple lock-based double-ended queues on line~6.
+the right-hand index on line~\lnref{ridx}, and, finally, the hashed array
+of simple lock-based double-ended queues on line~\lnref{bkt}.
A high-performance implementation would of course use padding or special
alignment directives to avoid false sharing.
+\end{lineref}
\begin{listing*}[bp]
{ \scriptsize
--
2.7.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/7] SMPdesign: Employ new scheme for snippet of lockhdeq.c and locktdeq.c
2018-11-04 0:07 [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign) Akira Yokosawa
2018-11-04 0:08 ` [PATCH 1/7] howto, cpu: Employ new scheme for command/code snippets Akira Yokosawa
2018-11-04 0:09 ` [PATCH 2/7] SMPdesign: Employ new scheme for snippet of lockhdeq.c Akira Yokosawa
@ 2018-11-04 0:11 ` Akira Yokosawa
2018-11-04 0:11 ` [PATCH 4/7] SMPdesign: Employ new scheme for inline snippets Akira Yokosawa
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Akira Yokosawa @ 2018-11-04 0:11 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From 966f43a286dedd32a577409fa54221b42f50d8cd Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Mon, 29 Oct 2018 21:07:28 +0900
Subject: [PATCH 3/7] SMPdesign: Employ new scheme for snippet of lockhdeq.c and locktdeq.c
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
CodeSamples/SMPdesign/lockhdeq.c | 34 +++---
CodeSamples/SMPdesign/locktdeq.c | 70 ++++++------
SMPdesign/partexercises.tex | 237 +++++++++++++--------------------------
3 files changed, 132 insertions(+), 209 deletions(-)
diff --git a/CodeSamples/SMPdesign/lockhdeq.c b/CodeSamples/SMPdesign/lockhdeq.c
index 8ba635b..1b57423 100644
--- a/CodeSamples/SMPdesign/lockhdeq.c
+++ b/CodeSamples/SMPdesign/lockhdeq.c
@@ -166,19 +166,20 @@ void init_pdeq(struct pdeq *d)
init_deq(&d->bkt[i]);
}
-struct cds_list_head *pdeq_pop_l(struct pdeq *d)
+//\begin{snippet}[labelbase=ln:SMPdesign:lockhdeq:pop_push,commandchars=\\\@\$]
+struct cds_list_head *pdeq_pop_l(struct pdeq *d)//\lnlbl{popl:b}
{
struct cds_list_head *e;
int i;
- spin_lock(&d->llock);
- i = moveright(d->lidx);
- e = deq_pop_l(&d->bkt[i]);
- if (e != NULL)
- d->lidx = i;
- spin_unlock(&d->llock);
- return e;
-}
+ spin_lock(&d->llock); //\lnlbl{popl:acq}
+ i = moveright(d->lidx); //\lnlbl{popl:idx}
+ e = deq_pop_l(&d->bkt[i]); //\lnlbl{popl:deque}
+ if (e != NULL) //\lnlbl{popl:check}
+ d->lidx = i; //\lnlbl{popl:record}
+ spin_unlock(&d->llock); //\lnlbl{popl:rel}
+ return e; //\lnlbl{popl:return}
+} //\lnlbl{popl:e}
struct cds_list_head *pdeq_pop_r(struct pdeq *d)
{
@@ -194,16 +195,16 @@ struct cds_list_head *pdeq_pop_r(struct pdeq *d)
return e;
}
-void pdeq_push_l(struct cds_list_head *e, struct pdeq *d)
+void pdeq_push_l(struct cds_list_head *e, struct pdeq *d)//\lnlbl{pushl:b}
{
int i;
- spin_lock(&d->llock);
- i = d->lidx;
- deq_push_l(e, &d->bkt[i]);
- d->lidx = moveleft(d->lidx);
- spin_unlock(&d->llock);
-}
+ spin_lock(&d->llock); //\lnlbl{pushl:acq}
+ i = d->lidx; //\lnlbl{pushl:idx}
+ deq_push_l(e, &d->bkt[i]); //\lnlbl{pushl:enque}
+ d->lidx = moveleft(d->lidx); //\lnlbl{pushl:update}
+ spin_unlock(&d->llock); //\lnlbl{pushl:rel}
+} //\lnlbl{pushl:e}
void pdeq_push_r(struct cds_list_head *e, struct pdeq *d)
{
@@ -215,6 +216,7 @@ void pdeq_push_r(struct cds_list_head *e, struct pdeq *d)
d->ridx = moveright(d->ridx);
spin_unlock(&d->rlock);
}
+//\end{snippet}
#ifdef TEST
#define DEQ_AND_PDEQ
diff --git a/CodeSamples/SMPdesign/locktdeq.c b/CodeSamples/SMPdesign/locktdeq.c
index 335a68b..4a0ca51 100644
--- a/CodeSamples/SMPdesign/locktdeq.c
+++ b/CodeSamples/SMPdesign/locktdeq.c
@@ -99,58 +99,60 @@ void init_pdeq(struct pdeq *d)
init_deq(&d->rdeq);
}
-struct cds_list_head *pdeq_pop_l(struct pdeq *d)
+//\begin{snippet}[labelbase=ln:SMPdesign:locktdeq:pop_push,commandchars=\\\@\$]
+struct cds_list_head *pdeq_pop_l(struct pdeq *d) //\lnlbl{popl:b}
{
struct cds_list_head *e;
- spin_lock(&d->llock);
- e = deq_pop_l(&d->ldeq);
+ spin_lock(&d->llock); //\lnlbl{popl:acq:l}
+ e = deq_pop_l(&d->ldeq); //\lnlbl{popl:deq:ll}
if (e == NULL) {
- spin_lock(&d->rlock);
- e = deq_pop_l(&d->rdeq);
- cds_list_splice(&d->rdeq.chain, &d->ldeq.chain);
- CDS_INIT_LIST_HEAD(&d->rdeq.chain);
- spin_unlock(&d->rlock);
- }
- spin_unlock(&d->llock);
+ spin_lock(&d->rlock); //\lnlbl{popl:acq:r}
+ e = deq_pop_l(&d->rdeq); //\lnlbl{popl:deq:lr}
+ cds_list_splice(&d->rdeq.chain, &d->ldeq.chain);//\lnlbl{popl:move}
+ CDS_INIT_LIST_HEAD(&d->rdeq.chain); //\lnlbl{popl:init:r}
+ spin_unlock(&d->rlock); //\lnlbl{popl:rel:r}
+ } //\lnlbl{popl:skip}
+ spin_unlock(&d->llock); //\lnlbl{popl:rel:l}
return e;
-}
+} //\lnlbl{popl:e}
-struct cds_list_head *pdeq_pop_r(struct pdeq *d)
+struct cds_list_head *pdeq_pop_r(struct pdeq *d) //\lnlbl{popr:b}
{
struct cds_list_head *e;
- spin_lock(&d->rlock);
- e = deq_pop_r(&d->rdeq);
- if (e == NULL) {
- spin_unlock(&d->rlock);
- spin_lock(&d->llock);
- spin_lock(&d->rlock);
- e = deq_pop_r(&d->rdeq);
- if (e == NULL) {
- e = deq_pop_r(&d->ldeq);
- cds_list_splice(&d->ldeq.chain, &d->rdeq.chain);
- CDS_INIT_LIST_HEAD(&d->ldeq.chain);
+ spin_lock(&d->rlock); //\lnlbl{popr:acq:r1}
+ e = deq_pop_r(&d->rdeq); //\lnlbl{popr:deq:rr1}
+ if (e == NULL) { //\lnlbl{popr:check1}
+ spin_unlock(&d->rlock); //\lnlbl{popr:rel:r1}
+ spin_lock(&d->llock); //\lnlbl{popr:acq:l}
+ spin_lock(&d->rlock); //\lnlbl{popr:acq:r2}
+ e = deq_pop_r(&d->rdeq); //\lnlbl{popr:deq:rr2}
+ if (e == NULL) { //\lnlbl{popr:check2}
+ e = deq_pop_r(&d->ldeq); //\lnlbl{popr:deq:rl}
+ cds_list_splice(&d->ldeq.chain, &d->rdeq.chain);//\lnlbl{popr:move}
+ CDS_INIT_LIST_HEAD(&d->ldeq.chain); //\lnlbl{popr:init:l}
}
- spin_unlock(&d->llock);
- }
- spin_unlock(&d->rlock);
+ spin_unlock(&d->llock); //\lnlbl{popr:rel:l}
+ } //\lnlbl{popr:skip2}
+ spin_unlock(&d->rlock); //\lnlbl{popr:rel:r2}
return e;
-}
+} //\lnlbl{popr:e}
-void pdeq_push_l(struct cds_list_head *e, struct pdeq *d)
+void pdeq_push_l(struct cds_list_head *e, struct pdeq *d) //\lnlbl{pushl:b}
{
- spin_lock(&d->llock);
- deq_push_l(e, &d->ldeq);
- spin_unlock(&d->llock);
-}
+ spin_lock(&d->llock); //\lnlbl{pushl:acq:l}
+ deq_push_l(e, &d->ldeq); //\lnlbl{pushl:que:l}
+ spin_unlock(&d->llock); //\lnlbl{pushl:rel:l}
+} //\lnlbl{pushl:e}
-void pdeq_push_r(struct cds_list_head *e, struct pdeq *d)
+void pdeq_push_r(struct cds_list_head *e, struct pdeq *d) //\lnlbl{pushr:b}
{
spin_lock(&d->rlock);
deq_push_r(e, &d->rdeq);
spin_unlock(&d->rlock);
-}
+} //\lnlbl{pushr:e}
+//\end{snippet}
#ifdef TEST
#include "deqtorture.h"
diff --git a/SMPdesign/partexercises.tex b/SMPdesign/partexercises.tex
index 5b5fd5a..5ce7e86 100644
--- a/SMPdesign/partexercises.tex
+++ b/SMPdesign/partexercises.tex
@@ -361,65 +361,11 @@ A high-performance implementation would of course use padding or special
alignment directives to avoid false sharing.
\end{lineref}
-\begin{listing*}[bp]
-{ \scriptsize
-\begin{verbbox}
- 1 struct cds_list_head *pdeq_pop_l(struct pdeq *d)
- 2 {
- 3 struct cds_list_head *e;
- 4 int i;
- 5
- 6 spin_lock(&d->llock);
- 7 i = moveright(d->lidx);
- 8 e = deq_pop_l(&d->bkt[i]);
- 9 if (e != NULL)
- 10 d->lidx = i;
- 11 spin_unlock(&d->llock);
- 12 return e;
- 13 }
- 14
- 15 struct cds_list_head *pdeq_pop_r(struct pdeq *d)
- 16 {
- 17 struct cds_list_head *e;
- 18 int i;
- 19
- 20 spin_lock(&d->rlock);
- 21 i = moveleft(d->ridx);
- 22 e = deq_pop_r(&d->bkt[i]);
- 23 if (e != NULL)
- 24 d->ridx = i;
- 25 spin_unlock(&d->rlock);
- 26 return e;
- 27 }
- 28
- 29 void pdeq_push_l(struct cds_list_head *e, struct pdeq *d)
- 30 {
- 31 int i;
- 32
- 33 spin_lock(&d->llock);
- 34 i = d->lidx;
- 35 deq_push_l(e, &d->bkt[i]);
- 36 d->lidx = moveleft(d->lidx);
- 37 spin_unlock(&d->llock);
- 38 }
- 39
- 40 void pdeq_push_r(struct cds_list_head *e, struct pdeq *d)
- 41 {
- 42 int i;
- 43
- 44 spin_lock(&d->rlock);
- 45 i = d->ridx;
- 46 deq_push_r(e, &d->bkt[i]);
- 47 d->ridx = moveright(d->ridx);
- 48 spin_unlock(&d->rlock);
- 49 }
-\end{verbbox}
-}
-\centering
-\theverbbox
+\begin{listing}[tbp]
+\input{CodeSamples/SMPdesign/lockhdeq@pop_push.fcv}
\caption{Lock-Based Parallel Double-Ended Queue Implementation}
\label{lst:SMPdesign:Lock-Based Parallel Double-Ended Queue Implementation}
-\end{listing*}
+\end{listing}
Listing~\ref{lst:SMPdesign:Lock-Based Parallel Double-Ended Queue Implementation}
(\path{lockhdeq.c})
@@ -430,22 +376,34 @@ shows the implementation of the enqueue and dequeue functions.\footnote{
Discussion will focus on the left-hand operations, as the right-hand
operations are trivially derived from them.
-Lines~1-13 show \co{pdeq_pop_l()}, which left\-/dequeues and returns
+\begin{lineref}[ln:SMPdesign:lockhdeq:pop_push:popl]
+Lines~\lnref{b}-\lnref{e} show \co{pdeq_pop_l()},
+which left\-/dequeues and returns
an element if possible, returning \co{NULL} otherwise.
-Line~6 acquires the left-hand spinlock, and line~7 computes the
+Line~\lnref{acq} acquires the left-hand spinlock,
+and line~\lnref{idx} computes the
index to be dequeued from.
-Line~8 dequeues the element, and, if line~9 finds the result to be
-non-\co{NULL}, line~10 records the new left-hand index.
-Either way, line~11 releases the lock, and, finally, line~12 returns
+Line~\lnref{deque} dequeues the element, and,
+if line~\lnref{check} finds the result to be
+non-\co{NULL}, line~\lnref{record} records the new left-hand index.
+Either way, line~\lnref{rel} releases the lock, and,
+finally, line~\lnref{return} returns
the element if there was one, or \co{NULL} otherwise.
+\end{lineref}
-Lines~29-38 shows \co{pdeq_push_l()}, which left-enqueues the specified
+\begin{lineref}[ln:SMPdesign:lockhdeq:pop_push:pushl]
+Lines~\lnref{b}-\lnref{e} shows \co{pdeq_push_l()},
+which left-enqueues the specified
element.
-Line~33 acquires the left-hand lock, and line~34 picks up the left-hand
+Line~\lnref{acq} acquires the left-hand lock,
+and line~\lnref{idx} picks up the left-hand
index.
-Line~35 left-enqueues the specified element onto the double-ended queue
+Line~\lnref{enque} left-enqueues the specified element
+onto the double-ended queue
indexed by the left-hand index.
-Line~36 then updates the left-hand index and line~37 releases the lock.
+Line~\lnref{update} then updates the left-hand index
+and line~\lnref{rel} releases the lock.
+\end{lineref}
As noted earlier, the right-hand operations are completely analogous
to their left-handed counterparts, so their analysis is left as an
@@ -500,72 +458,11 @@ the previous section, the compound implementation will build on
a sequential implementation of a double-ended queue that uses
neither locks nor atomic operations.
-\begin{listing*}[bp]
-{ \scriptsize
-\begin{verbbox}
- 1 struct cds_list_head *pdeq_pop_l(struct pdeq *d)
- 2 {
- 3 struct cds_list_head *e;
- 4
- 5 spin_lock(&d->llock);
- 6 e = deq_pop_l(&d->ldeq);
- 7 if (e == NULL) {
- 8 spin_lock(&d->rlock);
- 9 e = deq_pop_l(&d->rdeq);
- 10 cds_list_splice(&d->rdeq.chain, &d->ldeq.chain);
- 11 CDS_INIT_LIST_HEAD(&d->rdeq.chain);
- 12 spin_unlock(&d->rlock);
- 13 }
- 14 spin_unlock(&d->llock);
- 15 return e;
- 16 }
- 17
- 18 struct cds_list_head *pdeq_pop_r(struct pdeq *d)
- 19 {
- 20 struct cds_list_head *e;
- 21
- 22 spin_lock(&d->rlock);
- 23 e = deq_pop_r(&d->rdeq);
- 24 if (e == NULL) {
- 25 spin_unlock(&d->rlock);
- 26 spin_lock(&d->llock);
- 27 spin_lock(&d->rlock);
- 28 e = deq_pop_r(&d->rdeq);
- 29 if (e == NULL) {
- 30 e = deq_pop_r(&d->ldeq);
- 31 cds_list_splice(&d->ldeq.chain, &d->rdeq.chain);
- 32 CDS_INIT_LIST_HEAD(&d->ldeq.chain);
- 33 }
- 34 spin_unlock(&d->llock);
- 35 }
- 36 spin_unlock(&d->rlock);
- 37 return e;
- 38 }
- 39
- 40 void pdeq_push_l(struct cds_list_head *e, struct pdeq *d)
- 41 {
- 42 int i;
- 43
- 44 spin_lock(&d->llock);
- 45 deq_push_l(e, &d->ldeq);
- 46 spin_unlock(&d->llock);
- 47 }
- 48
- 49 void pdeq_push_r(struct cds_list_head *e, struct pdeq *d)
- 50 {
- 51 int i;
- 52
- 53 spin_lock(&d->rlock);
- 54 deq_push_r(e, &d->rdeq);
- 55 spin_unlock(&d->rlock);
- 56 }
-\end{verbbox}
-}
-\centering
-\theverbbox
+\begin{listing}[tbp]
+\input{CodeSamples/SMPdesign/locktdeq@pop_push.fcv}
\caption{Compound Parallel Double-Ended Queue Implementation}
\label{lst:SMPdesign:Compound Parallel Double-Ended Queue Implementation}
-\end{listing*}
+\end{listing}
Listing~\ref{lst:SMPdesign:Compound Parallel Double-Ended Queue Implementation}
shows the implementation.
@@ -583,50 +480,66 @@ and \co{pdeq_pop_r()} implementations separately.
(see Section~\ref{sec:SMPdesign:Dining Philosophers Problem}).
} \QuickQuizEnd
-The \co{pdeq_pop_l()} implementation is shown on lines~1-16
+\begin{lineref}[ln:SMPdesign:locktdeq:pop_push:popl]
+The \co{pdeq_pop_l()} implementation is shown on
+lines~\lnref{b}-\lnref{e}
of the figure.
-Line~5 acquires the left-hand lock, which line~14 releases.
-Line~6 attempts to left-dequeue an element from the left-hand underlying
-double-ended queue, and, if successful, skips lines~8-13 to simply
+Line~\lnref{acq:l} acquires the left-hand lock,
+which line~\lnref{rel:l} releases.
+Line~\lnref{deq:ll} attempts to left-dequeue an element
+from the left-hand underlying
+double-ended queue, and, if successful,
+skips lines~\lnref{acq:r}-\lnref{skip} to simply
return this element.
-Otherwise, line~8 acquires the right-hand lock, line~9
+Otherwise, line~\lnref{acq:r} acquires the right-hand lock, line~\lnref{deq:lr}
left-dequeues an element from the right-hand queue,
-and line~10 moves any remaining elements on the right-hand
-queue to the left-hand queue, line~11 initializes the right-hand queue,
-and line~12 releases the right-hand lock.
-The element, if any, that was dequeued on line~10 will be returned.
+and line~\lnref{move} moves any remaining elements on the right-hand
+queue to the left-hand queue, line~\lnref{init:r} initializes
+the right-hand queue,
+and line~\lnref{rel:r} releases the right-hand lock.
+The element, if any, that was dequeued on line~\lnref{deq:lr} will be returned.
+\end{lineref}
-The \co{pdeq_pop_r()} implementation is shown on lines~18-38
+\begin{lineref}[ln:SMPdesign:locktdeq:pop_push:popr]
+The \co{pdeq_pop_r()} implementation is shown on lines~\lnref{b}-\lnref{e}
of the figure.
-As before, line~22 acquires the right-hand lock (and line~36
-releases it), and line~23 attempts to right-dequeue an element
-from the right-hand queue, and, if successful, skips lines~24-35
+As before, line~\lnref{acq:r1} acquires the right-hand lock
+(and line~\lnref{rel:r2}
+releases it), and line~\lnref{deq:rr1} attempts to right-dequeue an element
+from the right-hand queue, and, if successful,
+skips lines~\lnref{rel:r1}-\lnref{skip2}
to simply return this element.
-However, if line~24 determines that there was no element to dequeue,
-line~25 releases the right-hand lock and lines~26-27 acquire both
+However, if line~\lnref{check1} determines that there was no element to dequeue,
+line~\lnref{rel:r1} releases the right-hand lock and
+lines~\lnref{acq:l}-\lnref{acq:r2} acquire both
locks in the proper order.
-Line~28 then attempts to right-dequeue an element from the right-hand
-list again, and if line~29 determines that this second attempt has
-failed, line~30 right-dequeues an element from the left-hand queue
-(if there is one available), line~31 moves any remaining elements
-from the left-hand queue to the right-hand queue, and line~32
+Line~\lnref{deq:rr2} then attempts to right-dequeue an element
+from the right-hand
+list again, and if line~\lnref{check2} determines that this second attempt has
+failed, line~\lnref{deq:rl} right-dequeues an element from the left-hand queue
+(if there is one available), line~\lnref{move} moves any remaining elements
+from the left-hand queue to the right-hand queue, and line~\lnref{init:l}
initializes the left-hand queue.
-Either way, line~34 releases the left-hand lock.
+Either way, line~\lnref{rel:l} releases the left-hand lock.
+\end{lineref}
\QuickQuiz{}
Why is it necessary to retry the right-dequeue operation
- on line~28 of
+ on line~\ref{ln:SMPdesign:locktdeq:pop_push:popr:deq:rr2} of
Listing~\ref{lst:SMPdesign:Compound Parallel Double-Ended Queue Implementation}?
\QuickQuizAnswer{
+ \begin{lineref}[ln:SMPdesign:locktdeq:pop_push:popr]
This retry is necessary because some other thread might have
enqueued an element between the time that this thread dropped
- \co{d->rlock} on line~25 and the time that it reacquired this
- same lock on line~27.
+ \co{d->rlock} on line~\lnref{rel:r1} and the time that it reacquired this
+ same lock on line~\lnref{acq:r2}.
+ \end{lineref}
} \QuickQuizEnd
\QuickQuiz{}
Surely the left-hand lock must \emph{sometimes} be available!!!
- So why is it necessary that line~25 of
+ So why is it necessary that
+ line~\ref{ln:SMPdesign:locktdeq:pop_push:popr:rel:r1} of
Listing~\ref{lst:SMPdesign:Compound Parallel Double-Ended Queue Implementation}
unconditionally release the right-hand lock?
\QuickQuizAnswer{
@@ -638,13 +551,19 @@ Either way, line~34 releases the left-hand lock.
it is worthwhile) is left as an exercise for the reader.
} \QuickQuizEnd
-The \co{pdeq_push_l()} implementation is shown on lines~40-47 of
+\begin{lineref}[ln:SMPdesign:locktdeq:pop_push:pushl]
+The \co{pdeq_push_l()} implementation is shown on
+lines~\lnref{b}-\lnref{e} of
Listing~\ref{lst:SMPdesign:Compound Parallel Double-Ended Queue Implementation}.
-Line~44 acquires the left-hand spinlock, line~45 left-enqueues the
-element onto the left-hand queue, and finally line~46 releases
+Line~\lnref{acq:l} acquires the left-hand spinlock,
+line~\lnref{que:l} left-enqueues the
+element onto the left-hand queue, and finally line~\lnref{rel:l} releases
the lock.
-The \co{pdeq_enqueue_r()} implementation (shown on lines~49-56)
+\end{lineref}
+\begin{lineref}[ln:SMPdesign:locktdeq:pop_push:pushr]
+The \co{pdeq_push_r()} implementation (shown on lines~\lnref{b}-\lnref{e})
is quite similar.
+\end{lineref}
\QuickQuiz{}
But in the case where data is flowing in only one direction,
--
2.7.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4/7] SMPdesign: Employ new scheme for inline snippets
2018-11-04 0:07 [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign) Akira Yokosawa
` (2 preceding siblings ...)
2018-11-04 0:11 ` [PATCH 3/7] SMPdesign: Employ new scheme for snippet of lockhdeq.c and locktdeq.c Akira Yokosawa
@ 2018-11-04 0:11 ` Akira Yokosawa
2018-11-04 0:13 ` [PATCH 5/7] SMPdesign: Employ new scheme for snippets from smpalloc.c Akira Yokosawa
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Akira Yokosawa @ 2018-11-04 0:11 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From 798445931ab13a7fab28d9a9271f719adf14c89a Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Thu, 1 Nov 2018 07:45:05 +0900
Subject: [PATCH 4/7] SMPdesign: Employ new scheme for inline snippets
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
SMPdesign/SMPdesign.tex | 357 +++++++++++++++++++++++-------------------------
1 file changed, 170 insertions(+), 187 deletions(-)
diff --git a/SMPdesign/SMPdesign.tex b/SMPdesign/SMPdesign.tex
index 824de41..06a0c8c 100644
--- a/SMPdesign/SMPdesign.tex
+++ b/SMPdesign/SMPdesign.tex
@@ -120,36 +120,32 @@ In contrast, speedups due to sequential optimizations, for example,
careful choice of data structure, can be arbitrarily large.
\begin{listing}[tbhp]
-{ \scriptsize
-\begin{verbbox}
- 1 struct hash_table
- 2 {
- 3 long nbuckets;
- 4 struct node **buckets;
- 5 };
- 6
- 7 typedef struct node {
- 8 unsigned long key;
- 9 struct node *next;
- 10 } node_t;
- 11
- 12 int hash_search(struct hash_table *h, long key)
- 13 {
- 14 struct node *cur;
- 15
- 16 cur = h->buckets[key % h->nbuckets];
- 17 while (cur != NULL) {
- 18 if (cur->key >= key) {
- 19 return (cur->key == key);
- 20 }
- 21 cur = cur->next;
- 22 }
- 23 return 0;
- 24 }
-\end{verbbox}
+\begin{VerbatimL}[commandchars=\\\[\]]
+struct hash_table
+{
+ long nbuckets;
+ struct node **buckets;
+};
+
+typedef struct node {
+ unsigned long key;
+ struct node *next;
+} node_t;
+
+int hash_search(struct hash_table *h, long key)
+{
+ struct node *cur;
+
+ cur = h->buckets[key % h->nbuckets];
+ while (cur != NULL) {
+ if (cur->key >= key) {
+ return (cur->key == key);
+ }
+ cur = cur->next;
+ }
+ return 0;
}
-\centering
-\theverbbox
+\end{VerbatimL}
\caption{Sequential-Program Hash Table Search}
\label{lst:SMPdesign:Sequential-Program Hash Table Search}
\end{listing}
@@ -197,43 +193,39 @@ has now become three statements due to the need to release the
lock before returning.
\begin{listing}[tbhp]
-{ \scriptsize
-\begin{verbbox}
- 1 spinlock_t hash_lock;
- 2
- 3 struct hash_table
- 4 {
- 5 long nbuckets;
- 6 struct node **buckets;
- 7 };
- 8
- 9 typedef struct node {
- 10 unsigned long key;
- 11 struct node *next;
- 12 } node_t;
- 13
- 14 int hash_search(struct hash_table *h, long key)
- 15 {
- 16 struct node *cur;
- 17 int retval;
- 18
- 19 spin_lock(&hash_lock);
- 20 cur = h->buckets[key % h->nbuckets];
- 21 while (cur != NULL) {
- 22 if (cur->key >= key) {
- 23 retval = (cur->key == key);
- 24 spin_unlock(&hash_lock);
- 25 return retval;
- 26 }
- 27 cur = cur->next;
- 28 }
- 29 spin_unlock(&hash_lock);
- 30 return 0;
- 31 }
-\end{verbbox}
+\begin{VerbatimL}[commandchars=\\\[\]]
+spinlock_t hash_lock;
+
+struct hash_table
+{
+ long nbuckets;
+ struct node **buckets;
+};
+
+typedef struct node {
+ unsigned long key;
+ struct node *next;
+} node_t;
+
+int hash_search(struct hash_table *h, long key)
+{
+ struct node *cur;
+ int retval;
+
+ spin_lock(&hash_lock);
+ cur = h->buckets[key % h->nbuckets];
+ while (cur != NULL) {
+ if (cur->key >= key) {
+ retval = (cur->key == key);
+ spin_unlock(&hash_lock);
+ return retval;
+ }
+ cur = cur->next;
+ }
+ spin_unlock(&hash_lock);
+ return 0;
}
-\centering
-\theverbbox
+\end{VerbatimL}
\caption{Code-Locking Hash Table Search}
\label{lst:SMPdesign:Code-Locking Hash Table Search}
\end{listing}
@@ -292,48 +284,44 @@ The increased scalability again results in a slight increase in complexity
in the form of an additional data structure, the \co{struct bucket}.
\begin{listing}[tb]
-{ \scriptsize
-\begin{verbbox}
- 1 struct hash_table
- 2 {
- 3 long nbuckets;
- 4 struct bucket **buckets;
- 5 };
- 6
- 7 struct bucket {
- 8 spinlock_t bucket_lock;
- 9 node_t *list_head;
- 10 };
- 11
- 12 typedef struct node {
- 13 unsigned long key;
- 14 struct node *next;
- 15 } node_t;
- 16
- 17 int hash_search(struct hash_table *h, long key)
- 18 {
- 19 struct bucket *bp;
- 20 struct node *cur;
- 21 int retval;
- 22
- 23 bp = h->buckets[key % h->nbuckets];
- 24 spin_lock(&bp->bucket_lock);
- 25 cur = bp->list_head;
- 26 while (cur != NULL) {
- 27 if (cur->key >= key) {
- 28 retval = (cur->key == key);
- 29 spin_unlock(&bp->bucket_lock);
- 30 return retval;
- 31 }
- 32 cur = cur->next;
- 33 }
- 34 spin_unlock(&bp->bucket_lock);
- 35 return 0;
- 36 }
-\end{verbbox}
+\begin{VerbatimL}
+struct hash_table
+{
+ long nbuckets;
+ struct bucket **buckets;
+};
+
+struct bucket {
+ spinlock_t bucket_lock;
+ node_t *list_head;
+};
+
+typedef struct node {
+ unsigned long key;
+ struct node *next;
+} node_t;
+
+int hash_search(struct hash_table *h, long key)
+{
+ struct bucket *bp;
+ struct node *cur;
+ int retval;
+
+ bp = h->buckets[key % h->nbuckets];
+ spin_lock(&bp->bucket_lock);
+ cur = bp->list_head;
+ while (cur != NULL) {
+ if (cur->key >= key) {
+ retval = (cur->key == key);
+ spin_unlock(&bp->bucket_lock);
+ return retval;
+ }
+ cur = cur->next;
+ }
+ spin_unlock(&bp->bucket_lock);
+ return 0;
}
-\centering
-\theverbbox
+\end{VerbatimL}
\caption{Data-Locking Hash Table Search}
\label{lst:SMPdesign:Data-Locking Hash Table Search}
\end{listing}
@@ -805,43 +793,39 @@ Listing~\ref{lst:SMPdesign:Reader-Writer-Locking Hash Table Search}
shows how the hash search might be implemented using reader-writer locking.
\begin{listing}[tbp]
-{ \scriptsize
-\begin{verbbox}
- 1 rwlock_t hash_lock;
- 2
- 3 struct hash_table
- 4 {
- 5 long nbuckets;
- 6 struct node **buckets;
- 7 };
- 8
- 9 typedef struct node {
- 10 unsigned long key;
- 11 struct node *next;
- 12 } node_t;
- 13
- 14 int hash_search(struct hash_table *h, long key)
- 15 {
- 16 struct node *cur;
- 17 int retval;
- 18
- 19 read_lock(&hash_lock);
- 20 cur = h->buckets[key % h->nbuckets];
- 21 while (cur != NULL) {
- 22 if (cur->key >= key) {
- 23 retval = (cur->key == key);
- 24 read_unlock(&hash_lock);
- 25 return retval;
- 26 }
- 27 cur = cur->next;
- 28 }
- 29 read_unlock(&hash_lock);
- 30 return 0;
- 31 }
-\end{verbbox}
+\begin{VerbatimL}[commandchars=\\\[\]]
+rwlock_t hash_lock;
+
+struct hash_table
+{
+ long nbuckets;
+ struct node **buckets;
+};
+
+typedef struct node {
+ unsigned long key;
+ struct node *next;
+} node_t;
+
+int hash_search(struct hash_table *h, long key)
+{
+ struct node *cur;
+ int retval;
+
+ read_lock(&hash_lock);
+ cur = h->buckets[key % h->nbuckets];
+ while (cur != NULL) {
+ if (cur->key >= key) {
+ retval = (cur->key == key);
+ read_unlock(&hash_lock);
+ return retval;
+ }
+ cur = cur->next;
+ }
+ read_unlock(&hash_lock);
+ return 0;
}
-\centering
-\theverbbox
+\end{VerbatimL}
\caption{Reader-Writer-Locking Hash Table Search}
\label{lst:SMPdesign:Reader-Writer-Locking Hash Table Search}
\end{listing}
@@ -868,51 +852,49 @@ In this case, the simpler data-locking approach would be simpler
and likely perform better.
\begin{listing}[tb]
-{ \scriptsize
-\begin{verbbox}
- 1 struct hash_table
- 2 {
- 3 long nbuckets;
- 4 struct bucket **buckets;
- 5 };
- 6
- 7 struct bucket {
- 8 spinlock_t bucket_lock;
- 9 node_t *list_head;
- 10 };
- 11
- 12 typedef struct node {
- 13 spinlock_t node_lock;
- 14 unsigned long key;
- 15 struct node *next;
- 16 } node_t;
- 17
- 18 int hash_search(struct hash_table *h, long key)
- 19 {
- 20 struct bucket *bp;
- 21 struct node *cur;
- 22 int retval;
- 23
- 24 bp = h->buckets[key % h->nbuckets];
- 25 spin_lock(&bp->bucket_lock);
- 26 cur = bp->list_head;
- 27 while (cur != NULL) {
- 28 if (cur->key >= key) {
- 29 spin_lock(&cur->node_lock);
- 30 spin_unlock(&bp->bucket_lock);
- 31 retval = (cur->key == key);
- 32 spin_unlock(&cur->node_lock);
- 33 return retval;
- 34 }
- 35 cur = cur->next;
- 36 }
- 37 spin_unlock(&bp->bucket_lock);
- 38 return 0;
- 39 }
-\end{verbbox}
+\begin{linelabel}[ln:SMPdesign:Hierarchical-Locking Hash Table Search]
+\begin{VerbatimL}[commandchars=\\\[\]]
+struct hash_table
+{
+ long nbuckets;
+ struct bucket **buckets;
+};
+
+struct bucket {
+ spinlock_t bucket_lock;
+ node_t *list_head;
+};
+
+typedef struct node {
+ spinlock_t node_lock;
+ unsigned long key;
+ struct node *next;
+} node_t;
+
+int hash_search(struct hash_table *h, long key)
+{
+ struct bucket *bp;
+ struct node *cur;
+ int retval;
+
+ bp = h->buckets[key % h->nbuckets];
+ spin_lock(&bp->bucket_lock);
+ cur = bp->list_head;
+ while (cur != NULL) {
+ if (cur->key >= key) {
+ spin_lock(&cur->node_lock);
+ spin_unlock(&bp->bucket_lock);
+ retval = (cur->key == key);\lnlbl[retval]
+ spin_unlock(&cur->node_lock);
+ return retval;
+ }
+ cur = cur->next;
+ }
+ spin_unlock(&bp->bucket_lock);
+ return 0;
}
-\centering
-\theverbbox
+\end{VerbatimL}
+\end{linelabel}
\caption{Hierarchical-Locking Hash Table Search}
\label{lst:SMPdesign:Hierarchical-Locking Hash Table Search}
\end{listing}
@@ -920,7 +902,8 @@ and likely perform better.
\QuickQuiz{}
In what situation would hierarchical locking work well?
\QuickQuizAnswer{
- If the comparison on line~31 of
+ If the comparison on
+ line~\ref{ln:SMPdesign:Hierarchical-Locking Hash Table Search:retval} of
Listing~\ref{lst:SMPdesign:Hierarchical-Locking Hash Table Search}
were replaced by a much heavier-weight operation,
then releasing \co{bp->bucket_lock} \emph{might} reduce lock
--
2.7.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 5/7] SMPdesign: Employ new scheme for snippets from smpalloc.c
2018-11-04 0:07 [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign) Akira Yokosawa
` (3 preceding siblings ...)
2018-11-04 0:11 ` [PATCH 4/7] SMPdesign: Employ new scheme for inline snippets Akira Yokosawa
@ 2018-11-04 0:13 ` Akira Yokosawa
2018-11-04 0:13 ` [PATCH 6/7] SMPdesign/beyond: Employ new scheme for inline pseudocode snippets Akira Yokosawa
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Akira Yokosawa @ 2018-11-04 0:13 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From b5efb413687874a667443943069c021e6c827d7b Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Fri, 2 Nov 2018 00:22:35 +0900
Subject: [PATCH 5/7] SMPdesign: Employ new scheme for snippets from smpalloc.c
NOTE: Names "percpumem" and "percpumempool" are replaced with
"pertheardmem" and "perthreadmempool" to respect those used in
CodeSamples/SMPdesign/smpalloc.c. Wording of "per-CPU" around them
in the text is also modified to "per-thread".
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
CodeSamples/SMPdesign/smpalloc.c | 57 ++++++++++--------
SMPdesign/SMPdesign.tex | 122 ++++++++++-----------------------------
2 files changed, 63 insertions(+), 116 deletions(-)
diff --git a/CodeSamples/SMPdesign/smpalloc.c b/CodeSamples/SMPdesign/smpalloc.c
index 149e226..72262d6 100644
--- a/CodeSamples/SMPdesign/smpalloc.c
+++ b/CodeSamples/SMPdesign/smpalloc.c
@@ -21,13 +21,14 @@
#include "../api.h"
+//\begin{snippet}[labelbase=ln:SMPdesign:smpalloc:data_struct,commandchars=\\\@\$]
#define TARGET_POOL_SIZE 3
#define GLOBAL_POOL_SIZE 40
-struct memblock {
- char *bytes[CACHE_LINE_SIZE];
-} memblocks[GLOBAL_POOL_SIZE];
-
+struct memblock { //\fcvexclude
+ char *bytes[CACHE_LINE_SIZE]; //\fcvexclude
+} memblocks[GLOBAL_POOL_SIZE]; //\fcvexclude
+ //\fcvexclude
struct globalmempool {
spinlock_t mutex;
int cur;
@@ -40,46 +41,54 @@ struct perthreadmempool {
};
DEFINE_PER_THREAD(struct perthreadmempool, perthreadmem);
+//\end{snippet}
+//\begin{snippet}[labelbase=ln:SMPdesign:smpalloc:alloc,commandchars=\\\@\$]
struct memblock *memblock_alloc(void)
{
int i;
struct memblock *p;
- struct perthreadmempool *pcpp = &__get_thread_var(perthreadmem);
+ struct perthreadmempool *pcpp;
- if (pcpp->cur < 0) {
- spin_lock(&globalmem.mutex);
- for (i = 0; i < TARGET_POOL_SIZE && globalmem.cur >= 0; i++) {
+ pcpp = &__get_thread_var(perthreadmem); //\lnlbl{pick}
+ if (pcpp->cur < 0) { //\lnlbl{chk:empty}
+ spin_lock(&globalmem.mutex); //\lnlbl{ack}
+ for (i = 0; i < TARGET_POOL_SIZE && //\lnlbl{loop:b}
+ globalmem.cur >= 0; i++) {
pcpp->pool[i] = globalmem.pool[globalmem.cur];
globalmem.pool[globalmem.cur--] = NULL;
- }
- pcpp->cur = i - 1;
- spin_unlock(&globalmem.mutex);
+ } //\lnlbl{loop:e}
+ pcpp->cur = i - 1; //\lnlbl{set}
+ spin_unlock(&globalmem.mutex); //\lnlbl{rel}
}
- if (pcpp->cur >= 0) {
- p = pcpp->pool[pcpp->cur];
+ if (pcpp->cur >= 0) { //\lnlbl{chk:notempty}
+ p = pcpp->pool[pcpp->cur]; //\lnlbl{rem:b}
pcpp->pool[pcpp->cur--] = NULL;
- return p;
+ return p; //\lnlbl{rem:e}
}
- return NULL;
+ return NULL; //\lnlbl{ret:NULL}
}
+//\end{snippet}
+//\begin{snippet}[labelbase=ln:SMPdesign:smpalloc:free,commandchars=\\\@\$]
void memblock_free(struct memblock *p)
{
int i;
- struct perthreadmempool *pcpp = &__get_thread_var(perthreadmem);
+ struct perthreadmempool *pcpp;
- if (pcpp->cur >= 2 * TARGET_POOL_SIZE - 1) {
- spin_lock(&globalmem.mutex);
- for (i = pcpp->cur; i >= TARGET_POOL_SIZE; i--) {
+ pcpp = &__get_thread_var(perthreadmem); //\lnlbl{get}
+ if (pcpp->cur >= 2 * TARGET_POOL_SIZE - 1) { //\lnlbl{chk:full}
+ spin_lock(&globalmem.mutex); //\lnlbl{acq}
+ for (i = pcpp->cur; i >= TARGET_POOL_SIZE; i--) {//\lnlbl{loop:b}
globalmem.pool[++globalmem.cur] = pcpp->pool[i];
pcpp->pool[i] = NULL;
- }
- pcpp->cur = i;
- spin_unlock(&globalmem.mutex);
- }
- pcpp->pool[++pcpp->cur] = p;
+ } //\lnlbl{loop:e}
+ pcpp->cur = i; //\lnlbl{set}
+ spin_unlock(&globalmem.mutex); //\lnlbl{rel}
+ } //\lnlbl{empty:e}
+ pcpp->pool[++pcpp->cur] = p; //\lnlbl{place}
}
+//\end{snippet}
void initpools(void)
{
diff --git a/SMPdesign/SMPdesign.tex b/SMPdesign/SMPdesign.tex
index 06a0c8c..3019ef4 100644
--- a/SMPdesign/SMPdesign.tex
+++ b/SMPdesign/SMPdesign.tex
@@ -971,8 +971,8 @@ caches are shown in
Listing~\ref{lst:SMPdesign:Allocator-Cache Data Structures}.
The ``Global Pool'' of Figure~\ref{fig:SMPdesign:Allocator Cache Schematic}
is implemented by \co{globalmem} of type \co{struct globalmempool},
-and the two CPU pools by the per-CPU variable \co{percpumem} of
-type \co{struct percpumempool}.
+and the two CPU pools by the per-thread variable \co{perthreadmem} of
+type \co{struct perthreadmempool}.
Both of these data structures have arrays of pointers to blocks
in their \co{pool} fields, which are filled from index zero upwards.
Thus, if \co{globalmem.pool[3]} is \co{NULL}, then the remainder of
@@ -988,27 +988,7 @@ must be empty.\footnote{
a feel for its operation.}
\begin{listing}[tbp]
-{ \scriptsize
-\begin{verbbox}
- 1 #define TARGET_POOL_SIZE 3
- 2 #define GLOBAL_POOL_SIZE 40
- 3
- 4 struct globalmempool {
- 5 spinlock_t mutex;
- 6 int cur;
- 7 struct memblock *pool[GLOBAL_POOL_SIZE];
- 8 } globalmem;
- 9
- 10 struct percpumempool {
- 11 int cur;
- 12 struct memblock *pool[2 * TARGET_POOL_SIZE];
- 13 };
- 14
- 15 DEFINE_PER_THREAD(struct percpumempool, percpumem);
-\end{verbbox}
-}
-\centering
-\theverbbox
+\input{CodeSamples/SMPdesign/smpalloc@data_struct.fcv}
\caption{Allocator-Cache Data Structures}
\label{lst:SMPdesign:Allocator-Cache Data Structures}
\end{listing}
@@ -1033,97 +1013,55 @@ smaller than the number of non-\co{NULL} pointers.
\subsubsection{Allocation Function}
+\begin{lineref}[ln:SMPdesign:smpalloc:alloc]
The allocation function \co{memblock_alloc()} may be seen in
Listing~\ref{lst:SMPdesign:Allocator-Cache Allocator Function}.
-Line~7 picks up the current thread's per-thread pool,
+Line~\lnref{pick} picks up the current thread's per-thread pool,
and line~8 check to see if it is empty.
-If so, lines~9-16 attempt to refill it from the global pool
-under the spinlock acquired on line~9 and released on line~16.
-Lines~10-14 move blocks from the global to the per-thread pool until
+If so, lines~\lnref{ack}-\lnref{rel} attempt to refill it
+from the global pool
+under the spinlock acquired on line~\lnref{ack} and released on line~\lnref{rel}.
+Lines~\lnref{loop:b}-\lnref{loop:e} move blocks from the global
+to the per-thread pool until
either the local pool reaches its target size (half full) or
-the global pool is exhausted, and line~15 sets the per-thread pool's
+the global pool is exhausted, and line~\lnref{set} sets the per-thread pool's
count to the proper value.
-In either case, line~18 checks for the per-thread pool still being
-empty, and if not, lines~19-21 remove a block and return it.
-Otherwise, line~23 tells the sad tale of memory exhaustion.
+In either case, line~\lnref{chk:notempty} checks for the per-thread
+pool still being
+empty, and if not, lines~\lnref{rem:b}-\lnref{rem:e} remove a block and return it.
+Otherwise, line~\lnref{ret:NULL} tells the sad tale of memory exhaustion.
+\end{lineref}
\begin{listing}[tbp]
-{ \scriptsize
-\begin{verbbox}
- 1 struct memblock *memblock_alloc(void)
- 2 {
- 3 int i;
- 4 struct memblock *p;
- 5 struct percpumempool *pcpp;
- 6
- 7 pcpp = &__get_thread_var(percpumem);
- 8 if (pcpp->cur < 0) {
- 9 spin_lock(&globalmem.mutex);
- 10 for (i = 0; i < TARGET_POOL_SIZE &&
- 11 globalmem.cur >= 0; i++) {
- 12 pcpp->pool[i] = globalmem.pool[globalmem.cur];
- 13 globalmem.pool[globalmem.cur--] = NULL;
- 14 }
- 15 pcpp->cur = i - 1;
- 16 spin_unlock(&globalmem.mutex);
- 17 }
- 18 if (pcpp->cur >= 0) {
- 19 p = pcpp->pool[pcpp->cur];
- 20 pcpp->pool[pcpp->cur--] = NULL;
- 21 return p;
- 22 }
- 23 return NULL;
- 24 }
-\end{verbbox}
-}
-\centering
-\theverbbox
+\input{CodeSamples/SMPdesign/smpalloc@alloc.fcv}
\caption{Allocator-Cache Allocator Function}
\label{lst:SMPdesign:Allocator-Cache Allocator Function}
\end{listing}
\subsubsection{Free Function}
+\begin{lineref}[ln:SMPdesign:smpalloc:free]
Listing~\ref{lst:SMPdesign:Allocator-Cache Free Function} shows
the memory-block free function.
-Line~6 gets a pointer to this thread's pool, and
-line~7 checks to see if this per-thread pool is full.
-
-If so, lines~8-15 empty half of the per-thread pool into the global pool,
-with lines~8 and~14 acquiring and releasing the spinlock.
-Lines~9-12 implement the loop moving blocks from the local to the
-global pool, and line~13 sets the per-thread pool's count to the proper
+Line~\lnref{get} gets a pointer to this thread's pool, and
+line~\lnref{chk:full} checks to see if this per-thread pool is full.
+
+If so, lines~\lnref{acq}-\lnref{empty:e} empty half of the per-thread pool
+into the global pool,
+with lines~\lnref{acq} and~\lnref{rel} acquiring and releasing the spinlock.
+Lines~\lnref{loop:b}-\lnref{loop:e} implement the loop moving blocks
+from the local to the
+global pool, and line~\lnref{set} sets the per-thread pool's count to the proper
value.
-In either case, line~16 then places the newly freed block into the
+In either case, line~\lnref{place} then places the newly freed block into the
per-thread pool.
+\end{lineref}
\begin{listing}[tbp]
-{ \scriptsize
-\begin{verbbox}
- 1 void memblock_free(struct memblock *p)
- 2 {
- 3 int i;
- 4 struct percpumempool *pcpp;
- 5
- 6 pcpp = &__get_thread_var(percpumem);
- 7 if (pcpp->cur >= 2 * TARGET_POOL_SIZE - 1) {
- 8 spin_lock(&globalmem.mutex);
- 9 for (i = pcpp->cur; i >= TARGET_POOL_SIZE; i--) {
- 10 globalmem.pool[++globalmem.cur] = pcpp->pool[i];
- 11 pcpp->pool[i] = NULL;
- 12 }
- 13 pcpp->cur = i;
- 14 spin_unlock(&globalmem.mutex);
- 15 }
- 16 pcpp->pool[++pcpp->cur] = p;
- 17 }
-\end{verbbox}
-}
-\centering
-\theverbbox
+\input{CodeSamples/SMPdesign/smpalloc@free.fcv}
\caption{Allocator-Cache Free Function}
\label{lst:SMPdesign:Allocator-Cache Free Function}
\end{listing}
--
2.7.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 6/7] SMPdesign/beyond: Employ new scheme for inline pseudocode snippets
2018-11-04 0:07 [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign) Akira Yokosawa
` (4 preceding siblings ...)
2018-11-04 0:13 ` [PATCH 5/7] SMPdesign: Employ new scheme for snippets from smpalloc.c Akira Yokosawa
@ 2018-11-04 0:13 ` Akira Yokosawa
2018-11-04 0:15 ` [PATCH 7/7] CodeSamples/SMPdesign/maze: Substitute {READ/WRITE}_ONCE() for ACCESS_ONCE() Akira Yokosawa
2018-11-04 20:30 ` [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign) Paul E. McKenney
7 siblings, 0 replies; 9+ messages in thread
From: Akira Yokosawa @ 2018-11-04 0:13 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From b36ad7a4fc1e62b276bb607a6a654e7efa5fec38 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sat, 3 Nov 2018 00:13:28 +0900
Subject: [PATCH 6/7] SMPdesign/beyond: Employ new scheme for inline pseudocode snippets
Also mention Listing~\ref{lst:SMPdesign:SEQ Pseudocode} as pseudocode,
And replace remaining ACCESS_ONCE()s with READ_ONCE()s.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
SMPdesign/beyond.tex | 319 +++++++++++++++++++++++++++------------------------
1 file changed, 168 insertions(+), 151 deletions(-)
diff --git a/SMPdesign/beyond.tex b/SMPdesign/beyond.tex
index 88e2b31..5997f86 100644
--- a/SMPdesign/beyond.tex
+++ b/SMPdesign/beyond.tex
@@ -54,112 +54,125 @@ presents future directions and concluding remarks.
\label{sec:SMPdesign:Work-Queue Parallel Maze Solver}
\begin{listing}[tbp]
-{ \scriptsize
-\begin{verbbox}
- 1 int maze_solve(maze *mp, cell sc, cell ec)
- 2 {
- 3 cell c = sc;
- 4 cell n;
- 5 int vi = 0;
- 6
- 7 maze_try_visit_cell(mp, c, c, &n, 1);
- 8 for (;;) {
- 9 while (!maze_find_any_next_cell(mp, c, &n)) {
- 10 if (++vi >= mp->vi)
- 11 return 0;
- 12 c = mp->visited[vi].c;
- 13 }
- 14 do {
- 15 if (n == ec) {
- 16 return 1;
- 17 }
- 18 c = n;
- 19 } while (maze_find_any_next_cell(mp, c, &n));
- 20 c = mp->visited[vi].c;
- 21 }
- 22 }
-\end{verbbox}
+\begin{linelabel}[ln:SMPdesign:SEQ Pseudocode]
+\begin{VerbatimL}[commandchars=\\\@\$]
+int maze_solve(maze *mp, cell sc, cell ec)
+{
+ cell c = sc;
+ cell n;
+ int vi = 0;
+
+ maze_try_visit_cell(mp, c, c, &n, 1); \lnlbl@initcell$
+ for (;;) { \lnlbl@loop:b$
+ while (!maze_find_any_next_cell(mp, c, &n)) {\lnlbl@loop2:b$
+ if (++vi >= mp->vi)
+ return 0;
+ c = mp->visited[vi].c;
+ } \lnlbl@loop2:e$
+ do { \lnlbl@loop3:b$
+ if (n == ec) {
+ return 1;
+ }
+ c = n;
+ } while (maze_find_any_next_cell(mp, c, &n));\lnlbl@loop3:e$
+ c = mp->visited[vi].c; \lnlbl@finalize$
+ } \lnlbl@loop:e$
}
-\centering
-\theverbbox
+\end{VerbatimL}
+\end{linelabel}
\caption{SEQ Pseudocode}
\label{lst:SMPdesign:SEQ Pseudocode}
\end{listing}
PWQ is based on SEQ, which is shown in
Listing~\ref{lst:SMPdesign:SEQ Pseudocode}
-(\path{maze_seq.c}).
+(pseudocode for \path{maze_seq.c}).
The maze is represented by a 2D array of cells and
a linear-array-based work queue named \co{->visited}.
-Line~7 visits the initial cell, and each iteration of the loop spanning
-lines~8-21 traverses passages headed by one cell.
-The loop spanning lines~9-13 scans the \co{->visited[]} array for a
+\begin{lineref}[ln:SMPdesign:SEQ Pseudocode]
+Line~\lnref{initcell} visits the initial cell, and each iteration of the loop spanning
+lines~\lnref{loop:b}-\lnref{loop:e} traverses passages headed by one cell.
+The loop spanning
+lines~\lnref{loop2:b}-\lnref{loop2:e} scans the \co{->visited[]} array for a
visited cell with an unvisited neighbor, and the loop spanning
-lines~14-19 traverses one fork of the submaze headed by that neighbor.
-Line~20 initializes for the next pass through the outer loop.
+lines~\lnref{loop3:b}-\lnref{loop3:e} traverses one fork of the submaze
+headed by that neighbor.
+Line~\lnref{finalize} initializes for the next pass through the outer loop.
+\end{lineref}
\begin{listing}[tbp]
-{ \scriptsize
-\begin{verbbox}
- 1 int maze_try_visit_cell(struct maze *mp, cell c, cell t,
- 2 cell *n, int d)
- 3 {
- 4 if (!maze_cells_connected(mp, c, t) ||
- 5 (*celladdr(mp, t) & VISITED))
- 6 return 0;
- 7 *n = t;
- 8 mp->visited[mp->vi] = t;
- 9 mp->vi++;
- 10 *celladdr(mp, t) |= VISITED | d;
- 11 return 1;
- 12 }
- 13
- 14 int maze_find_any_next_cell(struct maze *mp, cell c,
- 15 cell *n)
- 16 {
- 17 int d = (*celladdr(mp, c) & DISTANCE) + 1;
- 18
- 19 if (maze_try_visit_cell(mp, c, prevcol(c), n, d))
- 20 return 1;
- 21 if (maze_try_visit_cell(mp, c, nextcol(c), n, d))
- 22 return 1;
- 23 if (maze_try_visit_cell(mp, c, prevrow(c), n, d))
- 24 return 1;
- 25 if (maze_try_visit_cell(mp, c, nextrow(c), n, d))
- 26 return 1;
- 27 return 0;
- 28 }
-\end{verbbox}
-}
-\centering
-\theverbbox
+\begin{linelabel}[ln:SMPdesign:SEQ Helper Pseudocode]
+\begin{VerbatimL}[commandchars=\\\@\$]
+int maze_try_visit_cell(struct maze *mp, cell c, cell t, \lnlbl@try:b$
+ cell *n, int d)
+{
+ if (!maze_cells_connected(mp, c, t) || \lnlbl@try:chk:adj$
+ (*celladdr(mp, t) & VISITED)) \lnlbl@try:chk:not:visited$
+ return 0; \lnlbl@try:ret:failure$
+ *n = t; \lnlbl@try:nextcell$
+ mp->visited[mp->vi] = t; \lnlbl@try:recordnext$
+ mp->vi++; \lnlbl@try:next:visited$
+ *celladdr(mp, t) |= VISITED | d; \lnlbl@try:mark:visited$
+ return 1; \lnlbl@try:ret:success$
+} \lnlbl@try:e$
+
+int maze_find_any_next_cell(struct maze *mp, cell c, \lnlbl@find:b$
+ cell *n)
+{
+ int d = (*celladdr(mp, c) & DISTANCE) + 1; \lnlbl@find:curplus1$
+
+ if (maze_try_visit_cell(mp, c, prevcol(c), n, d))\lnlbl@find:chk:prevcol$
+ return 1; \lnlbl@find:ret:prevcol$
+ if (maze_try_visit_cell(mp, c, nextcol(c), n, d))\lnlbl@find:chk:nextcol$
+ return 1; \lnlbl@find:ret:nextcol$
+ if (maze_try_visit_cell(mp, c, prevrow(c), n, d))\lnlbl@find:chk:prevrow$
+ return 1; \lnlbl@find:ret:prevrow$
+ if (maze_try_visit_cell(mp, c, nextrow(c), n, d))\lnlbl@find:chk:nextrow$
+ return 1; \lnlbl@find:ret:nextrow$
+ return 0; \lnlbl@find:ret:false$
+} \lnlbl@find:e$
+\end{VerbatimL}
+\end{linelabel}
\caption{SEQ Helper Pseudocode}
\label{lst:SMPdesign:SEQ Helper Pseudocode}
\end{listing}
-The pseudocode for \co{maze_try_visit_cell()} is shown on lines~1-12
+\begin{lineref}[ln:SMPdesign:SEQ Helper Pseudocode:try]
+The pseudocode for \co{maze_try_visit_cell()} is shown on
+lines~\lnref{b}-\lnref{e}
of Listing~\ref{lst:SMPdesign:SEQ Helper Pseudocode}
(\path{maze.c}).
-Line~4 checks to see if cells \co{c} and \co{t} are adjacent and connected,
-while line~5 checks to see if cell \co{t} has not yet been visited.
+Line~\lnref{chk:adj} checks to see if cells \co{c} and \co{t} are
+adjacent and connected,
+while line~\lnref{chk:not:visited} checks to see if cell \co{t} has
+not yet been visited.
The \co{celladdr()} function returns the address of the specified cell.
-If either check fails, line~6 returns failure.
-Line~7 indicates the next cell, line~8 records this cell in the next
-slot of the \co{->visited[]} array, line~9 indicates that this slot
-is now full, and line~10 marks this cell as visited and also records
-the distance from the maze start. Line~11 then returns success.
-
-The pseudocode for \co{maze_find_any_next_cell()} is shown on lines~14-28
+If either check fails, line~\lnref{ret:failure} returns failure.
+Line~\lnref{nextcell} indicates the next cell,
+line~\lnref{recordnext} records this cell in the next
+slot of the \co{->visited[]} array,
+line~\lnref{next:visited} indicates that this slot
+is now full, and line~\lnref{mark:visited} marks this cell as visited and also records
+the distance from the maze start. Line~\lnref{ret:success} then returns success.
+\end{lineref}
+
+\begin{lineref}[ln:SMPdesign:SEQ Helper Pseudocode:find]
+The pseudocode for \co{maze_find_any_next_cell()} is shown on
+lines~\lnref{b}-\lnref{e}
of Listing~\ref{lst:SMPdesign:SEQ Helper Pseudocode}
(\path{maze.c}).
-Line~17 picks up the current cell's distance plus 1,
-while lines~19, 21, 23, and~25
-check the cell in each direction, and lines~20, 22, 24, and~26
+Line~\lnref{curplus1} picks up the current cell's distance plus 1,
+while lines~\lnref{chk:prevcol}, \lnref{chk:nextcol}, \lnref{chk:prevrow},
+and~\lnref{chk:nextrow}
+check the cell in each direction, and
+lines~\lnref{ret:prevcol}, \lnref{ret:nextcol}, \lnref{ret:prevrow},
+and~\lnref{ret:nextrow}
return true if the corresponding cell is a candidate next cell.
The \co{prevcol()}, \co{nextcol()}, \co{prevrow()}, and \co{nextrow()}
each do the specified array-index-conversion operation.
-If none of the cells is a candidate, line~27 returns false.
+If none of the cells is a candidate, line~\lnref{ret:false} returns false.
+\end{lineref}
\begin{figure}[tb]
\centering
@@ -228,60 +241,59 @@ at opposite ends of the solution path, and takes a brief look at the
performance and scalability consequences.
\begin{listing}[tbp]
-{ \scriptsize
-\begin{verbbox}
- 1 int maze_solve_child(maze *mp, cell *visited, cell sc)
- 2 {
- 3 cell c;
- 4 cell n;
- 5 int vi = 0;
- 6
- 7 myvisited = visited; myvi = &vi;
- 8 c = visited[vi];
- 9 do {
- 10 while (!maze_find_any_next_cell(mp, c, &n)) {
- 11 if (visited[++vi].row < 0)
- 12 return 0;
- 13 if (ACCESS_ONCE(mp->done))
- 14 return 1;
- 15 c = visited[vi];
- 16 }
- 17 do {
- 18 if (ACCESS_ONCE(mp->done))
- 19 return 1;
- 20 c = n;
- 21 } while (maze_find_any_next_cell(mp, c, &n));
- 22 c = visited[vi];
- 23 } while (!ACCESS_ONCE(mp->done));
- 24 return 1;
- 25 }
-\end{verbbox}
+\begin{linelabel}[ln:SMPdesign:Partitioned Parallel Solver Pseudocode]
+\begin{VerbatimL}[commandchars=\\\@\$]
+int maze_solve_child(maze *mp, cell *visited, cell sc) \lnlbl@b$
+{
+ cell c;
+ cell n;
+ int vi = 0;
+
+ myvisited = visited; myvi = &vi; \lnlbl@store:ptr$
+ c = visited[vi]; \lnlbl@retrieve$
+ do {
+ while (!maze_find_any_next_cell(mp, c, &n)) {
+ if (visited[++vi].row < 0)
+ return 0;
+ if (READ_ONCE(mp->done)) \lnlbl@chk:done1$
+ return 1;
+ c = visited[vi];
+ }
+ do {
+ if (READ_ONCE(mp->done)) \lnlbl@chk:done2$
+ return 1;
+ c = n;
+ } while (maze_find_any_next_cell(mp, c, &n));
+ c = visited[vi];
+ } while (!READ_ONCE(mp->done)); \lnlbl@chk:done3$
+ return 1;
}
-\centering
-\theverbbox
+\end{VerbatimL}
+\end{linelabel}
\caption{Partitioned Parallel Solver Pseudocode}
\label{lst:SMPdesign:Partitioned Parallel Solver Pseudocode}
\end{listing}
+\begin{lineref}[ln:SMPdesign:Partitioned Parallel Solver Pseudocode]
The partitioned parallel algorithm (PART), shown in
Listing~\ref{lst:SMPdesign:Partitioned Parallel Solver Pseudocode}
(\path{maze_part.c}),
is similar to SEQ, but has a few important differences.
First, each child thread has its own \co{visited} array, passed in by
-the parent as shown on line~1,
+the parent as shown on line~\lnref{b},
which must be initialized to all [$-1$, $-1$].
-Line~7 stores a pointer to this array into the per-thread variable
+Line~\lnref{store:ptr} stores a pointer to this array into the per-thread variable
\co{myvisited} to allow access by helper functions, and similarly stores
a pointer to the local visit index.
Second, the parent visits the first cell on each child's behalf,
-which the child retrieves on line~8.
+which the child retrieves on line~\lnref{retrieve}.
Third, the maze is solved as soon as one child locates a cell that has
been visited by the other child.
When \co{maze_try_visit_cell()} detects this,
it sets a \co{->done} field in the maze structure.
Fourth, each child must therefore periodically check the \co{->done}
-field, as shown on lines~13, 18, and~23.
-The \co{ACCESS_ONCE()} primitive must disable any compiler
+field, as shown on lines~\lnref{chk:done1}, \lnref{chk:done2}, and~\lnref{chk:done3}.
+The \co{READ_ONCE()} primitive must disable any compiler
optimizations that might combine consecutive loads or that
might reload the value.
A C++1x volatile relaxed load suffices~\cite{PeteBecker2011N3242}.
@@ -289,55 +301,60 @@ Finally, the \co{maze_find_any_next_cell()} function must use
compare-and-swap to mark a cell as visited, however
no constraints on ordering are required beyond those provided by
thread creation and join.
+\end{lineref}
\begin{listing}[tbp]
-{ \scriptsize
-\begin{verbbox}
- 1 int maze_try_visit_cell(struct maze *mp, int c, int t,
- 2 int *n, int d)
- 3 {
- 4 cell_t t;
- 5 cell_t *tp;
- 6 int vi;
- 7
- 8 if (!maze_cells_connected(mp, c, t))
- 9 return 0;
- 10 tp = celladdr(mp, t);
- 11 do {
- 12 t = ACCESS_ONCE(*tp);
- 13 if (t & VISITED) {
- 14 if ((t & TID) != mytid)
- 15 mp->done = 1;
- 16 return 0;
- 17 }
- 18 } while (!CAS(tp, t, t | VISITED | myid | d));
- 19 *n = t;
- 20 vi = (*myvi)++;
- 21 myvisited[vi] = t;
- 22 return 1;
- 23 }
-\end{verbbox}
+\begin{linelabel}[ln:SMPdesign:Partitioned Parallel Helper Pseudocode]
+\begin{VerbatimL}[commandchars=\\\@\$]
+int maze_try_visit_cell(struct maze *mp, int c, int t,
+ int *n, int d)
+{
+ cell_t t;
+ cell_t *tp;
+ int vi;
+
+ if (!maze_cells_connected(mp, c, t)) \lnlbl@chk:conn:b$
+ return 0; \lnlbl@chk:conn:e$
+ tp = celladdr(mp, t);
+ do { \lnlbl@loop:b$
+ t = READ_ONCE(*tp);
+ if (t & VISITED) { \lnlbl@chk:visited$
+ if ((t & TID) != mytid) \lnlbl@chk:other$
+ mp->done = 1; \lnlbl@located$
+ return 0; \lnlbl@ret:fail$
+ }
+ } while (!CAS(tp, t, t | VISITED | myid | d)); \lnlbl@loop:e$
+ *n = t; \lnlbl@update:new$
+ vi = (*myvi)++; \lnlbl@update:visited:b$
+ myvisited[vi] = t; \lnlbl@update:visited:e$
+ return 1; \lnlbl@ret:success$
}
-\centering
-\theverbbox
+\end{VerbatimL}
+\end{linelabel}
\caption{Partitioned Parallel Helper Pseudocode}
\label{lst:SMPdesign:Partitioned Parallel Helper Pseudocode}
\end{listing}
+\begin{lineref}[ln:SMPdesign:Partitioned Parallel Helper Pseudocode]
The pseudocode for \co{maze_find_any_next_cell()} is identical to that shown in
Listing~\ref{lst:SMPdesign:SEQ Helper Pseudocode},
but the pseudocode for \co{maze_try_visit_cell()} differs, and
is shown in
Listing~\ref{lst:SMPdesign:Partitioned Parallel Helper Pseudocode}.
-Lines~8-9 check to see if the cells are connected, returning failure
+Lines~\lnref{chk:conn:b}-\lnref{chk:conn:e}
+check to see if the cells are connected, returning failure
if not.
-The loop spanning lines~11-18 attempts to mark the new cell visited.
-Line~13 checks to see if it has already been visited, in which case
-line~16 returns failure, but only after line~14 checks to see if
-we have encountered the other thread, in which case line~15 indicates
+The loop spanning lines~\lnref{loop:b}-\lnref{loop:e} attempts to mark
+the new cell visited.
+Line~\lnref{chk:visited} checks to see if it has already been visited, in which case
+line~\lnref{ret:fail} returns failure, but only after line~\lnref{chk:other}
+checks to see if
+we have encountered the other thread, in which case line~\lnref{located} indicates
that the solution has been located.
-Line~19 updates to the new cell, lines~20 and~21 update this thread's visited
-array, and line~22 returns success.
+Line~\lnref{update:new} updates to the new cell,
+lines~\lnref{update:visited:b} and~\lnref{update:visited:e} update this thread's visited
+array, and line~\lnref{ret:success} returns success.
+\end{lineref}
\begin{figure}[tb]
\centering
--
2.7.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 7/7] CodeSamples/SMPdesign/maze: Substitute {READ/WRITE}_ONCE() for ACCESS_ONCE()
2018-11-04 0:07 [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign) Akira Yokosawa
` (5 preceding siblings ...)
2018-11-04 0:13 ` [PATCH 6/7] SMPdesign/beyond: Employ new scheme for inline pseudocode snippets Akira Yokosawa
@ 2018-11-04 0:15 ` Akira Yokosawa
2018-11-04 20:30 ` [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign) Paul E. McKenney
7 siblings, 0 replies; 9+ messages in thread
From: Akira Yokosawa @ 2018-11-04 0:15 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From c79bceb489aa3742e9b9668c84fc1b09613b5fc8 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sat, 3 Nov 2018 08:25:16 +0900
Subject: [PATCH 7/7] CodeSamples/SMPdesign/maze: Substitute {READ/WRITE}_ONCE() for ACCESS_ONCE()
Also fill in a couple of func prototype in maze.h.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
CodeSamples/SMPdesign/maze/maze.h | 6 ++++++
CodeSamples/SMPdesign/maze/maze_fg.c | 18 +++++++++---------
CodeSamples/SMPdesign/maze/maze_part.c | 18 +++++++++---------
3 files changed, 24 insertions(+), 18 deletions(-)
diff --git a/CodeSamples/SMPdesign/maze/maze.h b/CodeSamples/SMPdesign/maze/maze.h
index a256278..f1d1343 100644
--- a/CodeSamples/SMPdesign/maze/maze.h
+++ b/CodeSamples/SMPdesign/maze/maze.h
@@ -79,6 +79,10 @@ struct maze {
};
#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
+#define READ_ONCE(x) \
+ ({ typeof(x) ___x = ACCESS_ONCE(x); ___x; })
+#define WRITE_ONCE(x, val) ({ ACCESS_ONCE(x) = (val); })
+
/* CLOCK_MONOTONIC_RAW prefered, but the older CLOCK_MONOTONIC will do. */
#ifdef CLOCK_MONOTONIC_RAW
@@ -190,8 +194,10 @@ int maze_same_tids(struct maze *mp, int r1, int c1, int r2, int c2);
void new_empty_maze_solve(struct maze *mp);
int maze_try_visit_cell(struct maze *mp, int cr, int cc, int tr, int tc,
int *nextrow, int *nextcol, int distance);
+int maze_row_col_frac(int rc, int num, int den);
int maze_solve(struct maze *mp, int startrow, int startcol,
int endrow, int endcol, unsigned long long *t);
+void usage(char *progname, const char *format, ...);
void maze_solve_usage(void);
int maze_solve_parse(int i, int argc, char *argv[]);
diff --git a/CodeSamples/SMPdesign/maze/maze_fg.c b/CodeSamples/SMPdesign/maze/maze_fg.c
index aa1bafa..69af238 100644
--- a/CodeSamples/SMPdesign/maze/maze_fg.c
+++ b/CodeSamples/SMPdesign/maze/maze_fg.c
@@ -75,7 +75,7 @@ int maze_try_visit_cell(struct maze *mp, int cr, int cc, int tr, int tc,
return -1;
tp = maze_get_cell_addr(mp, tr, tc);
do {
- t = ACCESS_ONCE(*tp);
+ t = READ_ONCE(*tp);
if (t & VISITED)
return 1;
} while (!__sync_bool_compare_and_swap(tp, t, t | VISITED | myid));
@@ -85,7 +85,7 @@ int maze_try_visit_cell(struct maze *mp, int cr, int cc, int tr, int tc,
vi = __sync_add_and_fetch(&mp->vi, 1);
mp->visited[vi].row = tr;
__sync_synchronize();
- ACCESS_ONCE(mp->visited[vi].col) = tc;
+ WRITE_ONCE(mp->visited[vi].col, tc);
return 0;
}
@@ -121,8 +121,8 @@ void *maze_solve_child(void *arg)
* cell to be added to the visited list, we
* are done and the maze has no solution.
*/
- if (vi >= ACCESS_ONCE(mp->vi) + nthreads) {
- ACCESS_ONCE(mcsp->done) = -1;
+ if (vi >= READ_ONCE(mp->vi) + nthreads) {
+ WRITE_ONCE(mcsp->done, -1);
return NULL;
}
@@ -131,11 +131,11 @@ void *maze_solve_child(void *arg)
* for one of the other threads to find the
* end cell.
*/
- while ((vi > ACCESS_ONCE(mp->vi) ||
- ACCESS_ONCE(mp->visited[vi].col) < 0) &&
- !ACCESS_ONCE(mcsp->done))
+ while ((vi > READ_ONCE(mp->vi) ||
+ READ_ONCE(mp->visited[vi].col) < 0) &&
+ !READ_ONCE(mcsp->done))
continue;
- if (ACCESS_ONCE(mcsp->done))
+ if (READ_ONCE(mcsp->done))
return NULL;
/* check of .col before read of .row. */
@@ -151,7 +151,7 @@ void *maze_solve_child(void *arg)
do {
/* Did we find the solution? */
if (nr == mcsp->endrow && nc == mcsp->endcol) {
- ACCESS_ONCE(mcsp->done) = 1;
+ WRITE_ONCE(mcsp->done, 1);
return NULL;
}
diff --git a/CodeSamples/SMPdesign/maze/maze_part.c b/CodeSamples/SMPdesign/maze/maze_part.c
index 16080d4..2f92e11 100644
--- a/CodeSamples/SMPdesign/maze/maze_part.c
+++ b/CodeSamples/SMPdesign/maze/maze_part.c
@@ -85,12 +85,12 @@ void maze_solve_propagate(struct maze_child *mcp1, struct maze_child *mcp2)
if (__sync_fetch_and_add(&mcp1->see_start, 0)) {
(void)__sync_lock_test_and_set(&mcp2->see_start, 1);
if (__sync_fetch_and_add(&mcp2->see_end, 0))
- ACCESS_ONCE(mcsp->done) = 1;
+ WRITE_ONCE(mcsp->done, 1);
}
if (__sync_fetch_and_add(&mcp1->see_end, 0)) {
(void)__sync_lock_test_and_set(&mcp2->see_end, 1);
if (__sync_fetch_and_add(&mcp2->see_start, 0))
- ACCESS_ONCE(mcsp->done) = 1;
+ WRITE_ONCE(mcsp->done, 1);
}
}
@@ -111,7 +111,7 @@ void record_encounter(struct maze *mp, int cr, int cc, int tr, int tc)
mymcp->adj[theirtid].tr = tr;
mymcp->adj[theirtid].tc = tc;
if (nthreads == 2)
- ACCESS_ONCE(mcsp->done) = 1;
+ WRITE_ONCE(mcsp->done, 1);
maze_solve_propagate(mymcp, &mymcp->mcp0[theirtid]);
maze_solve_propagate(&mymcp->mcp0[theirtid], mymcp);
}
@@ -128,7 +128,7 @@ int maze_try_visit_cell(struct maze *mp, int cr, int cc, int tr, int tc,
return -1;
tp = maze_get_cell_addr(mp, tr, tc);
do {
- t = ACCESS_ONCE(*tp);
+ t = READ_ONCE(*tp);
if (t & VISITED) {
record_encounter(mp, cr, cc, tr, tc);
return 1;
@@ -496,7 +496,7 @@ int maze_solve_child_done_check(void)
struct maze_child *theirmcp;
if (nthreads <= 2)
- return ACCESS_ONCE(mcsp->done);
+ return READ_ONCE(mcsp->done);
if (!mymcp->see_start_snap &&
__sync_fetch_and_add(&mymcp->see_start, 0)) {
mymcp->see_start_snap = 1;
@@ -508,7 +508,7 @@ int maze_solve_child_done_check(void)
need_propagate = 1;
}
if (!need_propagate)
- return ACCESS_ONCE(mcsp->done);
+ return READ_ONCE(mcsp->done);
for (i = 0; i < nthreads; i++) {
if (i == mymcp->myid)
continue;
@@ -517,7 +517,7 @@ int maze_solve_child_done_check(void)
__sync_fetch_and_add(&theirmcp->adj[myid].mr, 0) != -1)
maze_solve_propagate(mymcp, theirmcp);
}
- return ACCESS_ONCE(mcsp->done);
+ return READ_ONCE(mcsp->done);
}
/*
@@ -555,7 +555,7 @@ void *maze_solve_child(void *arg)
* go to the following loop to do the exploration.
*/
while (!maze_find_any_next_cell(mp, cr, cc, &nr, &nc)) {
- if (++vi >= mcp->vi || ACCESS_ONCE(mcsp->done))
+ if (++vi >= mcp->vi || READ_ONCE(mcsp->done))
goto done;
cr = mcp->visited[vi].row;
cc = mcp->visited[vi].col;
@@ -566,7 +566,7 @@ void *maze_solve_child(void *arg)
* the current path one cell.
*/
do {
- if (ACCESS_ONCE(mcsp->done))
+ if (READ_ONCE(mcsp->done))
goto done;
cr = nr;
cc = nc;
--
2.7.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign)
2018-11-04 0:07 [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign) Akira Yokosawa
` (6 preceding siblings ...)
2018-11-04 0:15 ` [PATCH 7/7] CodeSamples/SMPdesign/maze: Substitute {READ/WRITE}_ONCE() for ACCESS_ONCE() Akira Yokosawa
@ 2018-11-04 20:30 ` Paul E. McKenney
7 siblings, 0 replies; 9+ messages in thread
From: Paul E. McKenney @ 2018-11-04 20:30 UTC (permalink / raw)
To: Akira Yokosawa; +Cc: perfbook
On Sun, Nov 04, 2018 at 09:07:36AM +0900, Akira Yokosawa wrote:
> Hi Paul,
>
> This is another set of patches converting code snippets to new scheme.
> Notes on the changes other than simple conversion follow.
>
> Patch #2 contains somewhat ugly hack in lockhdeq.c to suppress
> "____cacheline_internodealigned_in_smp" in the resulting snippet.
> I'm afraid it could hurt your eyes.
;-)
> Patch #5 renames "percpu*" to "perthread*" to respect the names
> used in actual code.
>
> Patch #6 includes substitution of ACCESS_ONCE -> READ_ONCE.
>
> Patch #7 does the substitution under CodeSamples. It also adds
> a couple of function prototypes in maze.h.
The boxes for the inline code displays look good to me. I have queued
these, and will push them once I get back on-grid. Thank you!!!
Thanx, Paul
> Thanks, Akira
> --
> Akira Yokosawa (7):
> howto, cpu: Employ new scheme for command/code snippets
> SMPdesign: Employ new scheme for snippet of lockhdeq.c
> SMPdesign: Employ new scheme for snippet of lockhdeq.c and locktdeq.c
> SMPdesign: Employ new scheme for inline snippets
> SMPdesign: Employ new scheme for snippets from smpalloc.c
> SMPdesign/beyond: Employ new scheme for inline pseudocode snippets
> CodeSamples/SMPdesign/maze: Substitute {READ/WRITE}_ONCE() for
> ACCESS_ONCE()
>
> CodeSamples/SMPdesign/lockhdeq.c | 53 ++--
> CodeSamples/SMPdesign/locktdeq.c | 70 ++---
> CodeSamples/SMPdesign/maze/maze.h | 6 +
> CodeSamples/SMPdesign/maze/maze_fg.c | 18 +-
> CodeSamples/SMPdesign/maze/maze_part.c | 18 +-
> CodeSamples/SMPdesign/smpalloc.c | 57 ++--
> SMPdesign/SMPdesign.tex | 479 ++++++++++++++-------------------
> SMPdesign/beyond.tex | 319 +++++++++++-----------
> SMPdesign/partexercises.tex | 263 ++++++------------
> cpu/overview.tex | 15 +-
> howto/howto.tex | 60 ++---
> 11 files changed, 605 insertions(+), 753 deletions(-)
>
> --
> 2.7.4
>
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2018-11-05 8:16 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-11-04 0:07 [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign) Akira Yokosawa
2018-11-04 0:08 ` [PATCH 1/7] howto, cpu: Employ new scheme for command/code snippets Akira Yokosawa
2018-11-04 0:09 ` [PATCH 2/7] SMPdesign: Employ new scheme for snippet of lockhdeq.c Akira Yokosawa
2018-11-04 0:11 ` [PATCH 3/7] SMPdesign: Employ new scheme for snippet of lockhdeq.c and locktdeq.c Akira Yokosawa
2018-11-04 0:11 ` [PATCH 4/7] SMPdesign: Employ new scheme for inline snippets Akira Yokosawa
2018-11-04 0:13 ` [PATCH 5/7] SMPdesign: Employ new scheme for snippets from smpalloc.c Akira Yokosawa
2018-11-04 0:13 ` [PATCH 6/7] SMPdesign/beyond: Employ new scheme for inline pseudocode snippets Akira Yokosawa
2018-11-04 0:15 ` [PATCH 7/7] CodeSamples/SMPdesign/maze: Substitute {READ/WRITE}_ONCE() for ACCESS_ONCE() Akira Yokosawa
2018-11-04 20:30 ` [PATCH 0/7] Further conversion of code snippets (howto, cpu, SMPdesign) Paul E. McKenney
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox