* [PATCH tip/core/rcu 2/6] documentation: Update based on on-demand vmstat workers
2015-03-03 16:37 ` [PATCH tip/core/rcu 1/6] documentation: Update rcutree.kthread_prio for grace-period kthread use Paul E. McKenney
@ 2015-03-03 16:37 ` Paul E. McKenney
2015-03-03 16:37 ` [PATCH tip/core/rcu 3/6] documentation: Update NO_HZ_FULL interaction with POSIX timers Paul E. McKenney
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2015-03-03 16:37 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
bobby.prani, Paul E. McKenney
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Now that the on-demand vmstat workers commit is in mainline, it is
possible to eliminate vmstat_update()-induced OS jitter. This commit
updates the documentation accordingly.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
Documentation/kernel-per-CPU-kthreads.txt | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/Documentation/kernel-per-CPU-kthreads.txt b/Documentation/kernel-per-CPU-kthreads.txt
index f3cd299fcc41..81fe051c4447 100644
--- a/Documentation/kernel-per-CPU-kthreads.txt
+++ b/Documentation/kernel-per-CPU-kthreads.txt
@@ -190,14 +190,16 @@ To reduce its OS jitter, do any of the following:
on each CPU, including cs_dbs_timer() and od_dbs_timer().
WARNING: Please check your CPU specifications to
make sure that this is safe on your particular system.
- d. It is not possible to entirely get rid of OS jitter
- from vmstat_update() on CONFIG_SMP=y systems, but you
- can decrease its frequency by writing a large value
- to /proc/sys/vm/stat_interval. The default value is
- HZ, for an interval of one second. Of course, larger
- values will make your virtual-memory statistics update
- more slowly. Of course, you can also run your workload
- at a real-time priority, thus preempting vmstat_update(),
+ d. As of v3.18, Christoph Lameter's on-demand vmstat workers
+ commit prevents OS jitter due to vmstat_update() on
+ CONFIG_SMP=y systems. Before v3.18, is not possible
+ to entirely get rid of the OS jitter, but you can
+ decrease its frequency by writing a large value to
+ /proc/sys/vm/stat_interval. The default value is HZ,
+ for an interval of one second. Of course, larger values
+ will make your virtual-memory statistics update more
+ slowly. Of course, you can also run your workload at
+ a real-time priority, thus preempting vmstat_update(),
but if your workload is CPU-bound, this is a bad idea.
However, there is an RFC patch from Christoph Lameter
(based on an earlier one from Gilad Ben-Yossef) that
--
1.8.1.5
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH tip/core/rcu 3/6] documentation: Update NO_HZ_FULL interaction with POSIX timers
2015-03-03 16:37 ` [PATCH tip/core/rcu 1/6] documentation: Update rcutree.kthread_prio for grace-period kthread use Paul E. McKenney
2015-03-03 16:37 ` [PATCH tip/core/rcu 2/6] documentation: Update based on on-demand vmstat workers Paul E. McKenney
@ 2015-03-03 16:37 ` Paul E. McKenney
2015-03-03 16:37 ` [PATCH tip/core/rcu 4/6] documentation: Update per-CPU kthreads documentation Paul E. McKenney
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2015-03-03 16:37 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
bobby.prani, Paul E. McKenney
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
POSIX timers are no longer starved on adaptive-ticks CPUs. Instead, they
prevent affected CPUs from entering adaptive-ticks mode. This commit
therefore updates the NO_HZ.txt documentation.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
Documentation/timers/NO_HZ.txt | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/Documentation/timers/NO_HZ.txt b/Documentation/timers/NO_HZ.txt
index cca122f25120..6eaf576294f3 100644
--- a/Documentation/timers/NO_HZ.txt
+++ b/Documentation/timers/NO_HZ.txt
@@ -158,13 +158,9 @@ not come for free:
to the need to inform kernel subsystems (such as RCU) about
the change in mode.
-3. POSIX CPU timers on adaptive-tick CPUs may miss their deadlines
- (perhaps indefinitely) because they currently rely on
- scheduling-tick interrupts. This will likely be fixed in
- one of two ways: (1) Prevent CPUs with POSIX CPU timers from
- entering adaptive-tick mode, or (2) Use hrtimers or other
- adaptive-ticks-immune mechanism to cause the POSIX CPU timer to
- fire properly.
+3. POSIX CPU timers prevent CPUs from entering adaptive-tick mode.
+ Real-time applications needing to take actions based on CPU time
+ consumption need to use other means of doing so.
4. If there are more perf events pending than the hardware can
accommodate, they are normally round-robined so as to collect
--
1.8.1.5
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH tip/core/rcu 4/6] documentation: Update per-CPU kthreads documentation
2015-03-03 16:37 ` [PATCH tip/core/rcu 1/6] documentation: Update rcutree.kthread_prio for grace-period kthread use Paul E. McKenney
2015-03-03 16:37 ` [PATCH tip/core/rcu 2/6] documentation: Update based on on-demand vmstat workers Paul E. McKenney
2015-03-03 16:37 ` [PATCH tip/core/rcu 3/6] documentation: Update NO_HZ_FULL interaction with POSIX timers Paul E. McKenney
@ 2015-03-03 16:37 ` Paul E. McKenney
2015-03-03 16:37 ` [PATCH tip/core/rcu 5/6] documentation: Clarify memory-barrier semantics of atomic operations Paul E. McKenney
2015-03-03 16:37 ` [PATCH tip/core/rcu 6/6] documentation: Clarify control-dependency pairing Paul E. McKenney
4 siblings, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2015-03-03 16:37 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
bobby.prani, Paul E. McKenney
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
Documentation/kernel-per-CPU-kthreads.txt | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/Documentation/kernel-per-CPU-kthreads.txt b/Documentation/kernel-per-CPU-kthreads.txt
index 81fe051c4447..f4cbfe0ba108 100644
--- a/Documentation/kernel-per-CPU-kthreads.txt
+++ b/Documentation/kernel-per-CPU-kthreads.txt
@@ -205,7 +205,9 @@ To reduce its OS jitter, do any of the following:
(based on an earlier one from Gilad Ben-Yossef) that
reduces or even eliminates vmstat overhead for some
workloads at https://lkml.org/lkml/2013/9/4/379.
- e. If running on high-end powerpc servers, build with
+ e. Boot with "elevator=noop" to avoid workqueue use by
+ the block layer.
+ f. If running on high-end powerpc servers, build with
CONFIG_PPC_RTAS_DAEMON=n. This prevents the RTAS
daemon from running on each CPU every second or so.
(This will require editing Kconfig files and will defeat
@@ -213,12 +215,12 @@ To reduce its OS jitter, do any of the following:
due to the rtas_event_scan() function.
WARNING: Please check your CPU specifications to
make sure that this is safe on your particular system.
- f. If running on Cell Processor, build your kernel with
+ g. If running on Cell Processor, build your kernel with
CBE_CPUFREQ_SPU_GOVERNOR=n to avoid OS jitter from
spu_gov_work().
WARNING: Please check your CPU specifications to
make sure that this is safe on your particular system.
- g. If running on PowerMAC, build your kernel with
+ h. If running on PowerMAC, build your kernel with
CONFIG_PMAC_RACKMETER=n to disable the CPU-meter,
avoiding OS jitter from rackmeter_do_timer().
@@ -260,8 +262,12 @@ Purpose: Detect software lockups on each CPU.
To reduce its OS jitter, do at least one of the following:
1. Build with CONFIG_LOCKUP_DETECTOR=n, which will prevent these
kthreads from being created in the first place.
-2. Echo a zero to /proc/sys/kernel/watchdog to disable the
+2. Boot with "nosoftlockup=0", which will also prevent these kthreads
+ from being created. Other related watchdog and softlockup boot
+ parameters may be found in Documentation/kernel-parameters.txt
+ and Documentation/watchdog/watchdog-parameters.txt.
+3. Echo a zero to /proc/sys/kernel/watchdog to disable the
watchdog timer.
-3. Echo a large number of /proc/sys/kernel/watchdog_thresh in
+4. Echo a large number of /proc/sys/kernel/watchdog_thresh in
order to reduce the frequency of OS jitter due to the watchdog
timer down to a level that is acceptable for your workload.
--
1.8.1.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH tip/core/rcu 5/6] documentation: Clarify memory-barrier semantics of atomic operations
2015-03-03 16:37 ` [PATCH tip/core/rcu 1/6] documentation: Update rcutree.kthread_prio for grace-period kthread use Paul E. McKenney
` (2 preceding siblings ...)
2015-03-03 16:37 ` [PATCH tip/core/rcu 4/6] documentation: Update per-CPU kthreads documentation Paul E. McKenney
@ 2015-03-03 16:37 ` Paul E. McKenney
2015-03-03 16:37 ` [PATCH tip/core/rcu 6/6] documentation: Clarify control-dependency pairing Paul E. McKenney
4 siblings, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2015-03-03 16:37 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
bobby.prani, Paul E. McKenney
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
All value-returning atomic read-modify-write operations must provide full
memory-barrier semantics on both sides of the operation. This commit
clarifies the documentation to make it clear that these memory-barrier
semantics are provided by the operations themselves, not by their callers.
Reported-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
Documentation/atomic_ops.txt | 45 ++++++++++++++++++++++----------------------
1 file changed, 23 insertions(+), 22 deletions(-)
diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt
index 183e41bdcb69..dab6da3382d9 100644
--- a/Documentation/atomic_ops.txt
+++ b/Documentation/atomic_ops.txt
@@ -201,11 +201,11 @@ These routines add 1 and subtract 1, respectively, from the given
atomic_t and return the new counter value after the operation is
performed.
-Unlike the above routines, it is required that explicit memory
-barriers are performed before and after the operation. It must be
-done such that all memory operations before and after the atomic
-operation calls are strongly ordered with respect to the atomic
-operation itself.
+Unlike the above routines, it is required that these primitives
+include explicit memory barriers that are performed before and after
+the operation. It must be done such that all memory operations before
+and after the atomic operation calls are strongly ordered with respect
+to the atomic operation itself.
For example, it should behave as if a smp_mb() call existed both
before and after the atomic operation.
@@ -233,21 +233,21 @@ These two routines increment and decrement by 1, respectively, the
given atomic counter. They return a boolean indicating whether the
resulting counter value was zero or not.
-It requires explicit memory barrier semantics around the operation as
-above.
+Again, these primitives provide explicit memory barrier semantics around
+the atomic operation.
int atomic_sub_and_test(int i, atomic_t *v);
This is identical to atomic_dec_and_test() except that an explicit
-decrement is given instead of the implicit "1". It requires explicit
-memory barrier semantics around the operation.
+decrement is given instead of the implicit "1". This primitive must
+provide explicit memory barrier semantics around the operation.
int atomic_add_negative(int i, atomic_t *v);
-The given increment is added to the given atomic counter value. A
-boolean is return which indicates whether the resulting counter value
-is negative. It requires explicit memory barrier semantics around the
-operation.
+The given increment is added to the given atomic counter value. A boolean
+is return which indicates whether the resulting counter value is negative.
+This primitive must provide explicit memory barrier semantics around
+the operation.
Then:
@@ -257,7 +257,7 @@ This performs an atomic exchange operation on the atomic variable v, setting
the given new value. It returns the old value that the atomic variable v had
just before the operation.
-atomic_xchg requires explicit memory barriers around the operation.
+atomic_xchg must provide explicit memory barriers around the operation.
int atomic_cmpxchg(atomic_t *v, int old, int new);
@@ -266,7 +266,7 @@ with the given old and new values. Like all atomic_xxx operations,
atomic_cmpxchg will only satisfy its atomicity semantics as long as all
other accesses of *v are performed through atomic_xxx operations.
-atomic_cmpxchg requires explicit memory barriers around the operation.
+atomic_cmpxchg must provide explicit memory barriers around the operation.
The semantics for atomic_cmpxchg are the same as those defined for 'cas'
below.
@@ -279,8 +279,8 @@ If the atomic value v is not equal to u, this function adds a to v, and
returns non zero. If v is equal to u then it returns zero. This is done as
an atomic operation.
-atomic_add_unless requires explicit memory barriers around the operation
-unless it fails (returns 0).
+atomic_add_unless must provide explicit memory barriers around the
+operation unless it fails (returns 0).
atomic_inc_not_zero, equivalent to atomic_add_unless(v, 1, 0)
@@ -460,9 +460,9 @@ the return value into an int. There are other places where things
like this occur as well.
These routines, like the atomic_t counter operations returning values,
-require explicit memory barrier semantics around their execution. All
-memory operations before the atomic bit operation call must be made
-visible globally before the atomic bit operation is made visible.
+must provide explicit memory barrier semantics around their execution.
+All memory operations before the atomic bit operation call must be
+made visible globally before the atomic bit operation is made visible.
Likewise, the atomic bit operation must be visible globally before any
subsequent memory operation is made visible. For example:
@@ -536,8 +536,9 @@ except that two underscores are prefixed to the interface name.
These non-atomic variants also do not require any special memory
barrier semantics.
-The routines xchg() and cmpxchg() need the same exact memory barriers
-as the atomic and bit operations returning values.
+The routines xchg() and cmpxchg() must provide the same exact
+memory-barrier semantics as the atomic and bit operations returning
+values.
Spinlocks and rwlocks have memory barrier expectations as well.
The rule to follow is simple:
--
1.8.1.5
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH tip/core/rcu 6/6] documentation: Clarify control-dependency pairing
2015-03-03 16:37 ` [PATCH tip/core/rcu 1/6] documentation: Update rcutree.kthread_prio for grace-period kthread use Paul E. McKenney
` (3 preceding siblings ...)
2015-03-03 16:37 ` [PATCH tip/core/rcu 5/6] documentation: Clarify memory-barrier semantics of atomic operations Paul E. McKenney
@ 2015-03-03 16:37 ` Paul E. McKenney
4 siblings, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2015-03-03 16:37 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
bobby.prani, Paul E. McKenney
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
This commit explicitly states that control dependencies pair normally
with other barriers, and gives an example of such pairing.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
Documentation/memory-barriers.txt | 42 +++++++++++++++++++++++++++------------
1 file changed, 29 insertions(+), 13 deletions(-)
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index ca2387ef27ab..6974f1c2b4e1 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -592,9 +592,9 @@ See also the subsection on "Cache Coherency" for a more thorough example.
CONTROL DEPENDENCIES
--------------------
-A control dependency requires a full read memory barrier, not simply a data
-dependency barrier to make it work correctly. Consider the following bit of
-code:
+A load-load control dependency requires a full read memory barrier, not
+simply a data dependency barrier to make it work correctly. Consider the
+following bit of code:
q = ACCESS_ONCE(a);
if (q) {
@@ -615,14 +615,15 @@ case what's actually required is:
}
However, stores are not speculated. This means that ordering -is- provided
-in the following example:
+for load-store control dependencies, as in the following example:
q = ACCESS_ONCE(a);
if (q) {
ACCESS_ONCE(b) = p;
}
-Please note that ACCESS_ONCE() is not optional! Without the
+Control dependencies pair normally with other types of barriers.
+That said, please note that ACCESS_ONCE() is not optional! Without the
ACCESS_ONCE(), might combine the load from 'a' with other loads from
'a', and the store to 'b' with other stores to 'b', with possible highly
counterintuitive effects on ordering.
@@ -813,6 +814,8 @@ In summary:
barrier() can help to preserve your control dependency. Please
see the Compiler Barrier section for more information.
+ (*) Control dependencies pair normally with other types of barriers.
+
(*) Control dependencies do -not- provide transitivity. If you
need transitivity, use smp_mb().
@@ -823,14 +826,14 @@ SMP BARRIER PAIRING
When dealing with CPU-CPU interactions, certain types of memory barrier should
always be paired. A lack of appropriate pairing is almost certainly an error.
-General barriers pair with each other, though they also pair with
-most other types of barriers, albeit without transitivity. An acquire
-barrier pairs with a release barrier, but both may also pair with other
-barriers, including of course general barriers. A write barrier pairs
-with a data dependency barrier, an acquire barrier, a release barrier,
-a read barrier, or a general barrier. Similarly a read barrier or a
-data dependency barrier pairs with a write barrier, an acquire barrier,
-a release barrier, or a general barrier:
+General barriers pair with each other, though they also pair with most
+other types of barriers, albeit without transitivity. An acquire barrier
+pairs with a release barrier, but both may also pair with other barriers,
+including of course general barriers. A write barrier pairs with a data
+dependency barrier, a control dependency, an acquire barrier, a release
+barrier, a read barrier, or a general barrier. Similarly a read barrier,
+control dependency, or a data dependency barrier pairs with a write
+barrier, an acquire barrier, a release barrier, or a general barrier:
CPU 1 CPU 2
=============== ===============
@@ -850,6 +853,19 @@ Or:
<data dependency barrier>
y = *x;
+Or even:
+
+ CPU 1 CPU 2
+ =============== ===============================
+ r1 = ACCESS_ONCE(y);
+ <general barrier>
+ ACCESS_ONCE(y) = 1; if (r2 = ACCESS_ONCE(x)) {
+ <implicit control dependency>
+ ACCESS_ONCE(y) = 1;
+ }
+
+ assert(r1 == 0 || r2 == 0);
+
Basically, the read barrier always has to be there, even though it can be of
the "weaker" type.
--
1.8.1.5
^ permalink raw reply related [flat|nested] 7+ messages in thread