* [RFC 1/6] percpu-refcount: Add managed mode for RCU released objects
2024-09-16 5:08 [RFC 0/6] Managed Percpu Refcount Neeraj Upadhyay
@ 2024-09-16 5:08 ` Neeraj Upadhyay
2024-09-16 5:08 ` [RFC 2/6] percpu-refcount: Add torture test for percpu refcount Neeraj Upadhyay
` (4 subsequent siblings)
5 siblings, 0 replies; 12+ messages in thread
From: Neeraj Upadhyay @ 2024-09-16 5:08 UTC (permalink / raw)
To: linux-kernel
Cc: john.johansen, paul, jmorris, serge, linux-security-module,
gautham.shenoy, Santosh.Shukla, Ananth.Narayan,
Raghavendra.KodsaraThimmappa, paulmck, boqun.feng, vinicius.gomes,
mjguzik, dennis, tj, cl, linux-mm, rcu
Add a new "managed mode" to percpu refcounts, to track initial
reference drop for refs which use RCU grace period for their object
reclaims. Typical usage pattern for such refs is:
// Called with elevated refcount
get()
p = get_ptr();
kref_get(&p->count);
return p;
get()
rcu_read_lock();
p = get_ptr();
if (p && !kref_get_unless_zero(&p->count))
p = NULL;
rcu_read_unlock();
return p;
release()
remove_ptr(p);
call_rcu(&p->rcu, freep);
release()
remove_ptr(p);
kfree_rcu((p, rcu);
Currently, percpu ref requires users to call percpu_ref_kill() when
object usage enters a shutdown phase. Post killi operation, ref
increment/ decrement are performed on a atomic counter. For cases where
ref is actively acquired and released after percpu_ref_kill(),
percpu ref does not provide any performance benefits over using
an atomic reference counter. Managed mode offloads tracking of ref
kill to a manager thread, thereby not requiring users to explicitly
call percpu_ref_kill(). This helps avoid the problem of suboptimal
performance if a percpu ref is actively acquired and released after
percpu_ref_kill() operation.
A percpu ref can be initialized as managed either during
percpu_ref_init() by passing PERCPU_REF_REL_MANAGED flag or a
reinitable ref can be switched to managed mode using
percpu_ref_switch_to_managed() post its initialization. Deferred switch
to managed mode can be used for cases like module initialization
errors, where a inited percpu ref's initial reference is dropped before
the object becomes active and is referenced by other contexts. One such
case is Apparmor labels which are not associated yet with a namespace.
These labels are freed without waiting for a RCU grace period. So,
managed mode cannot be used for these labels until their initialization
has completed.
Following are the allowed initialization modes for managed ref:
Atomic Percpu Dead Reinit Managed
Managed-ref Y N Y Y Y
Following are the allowed transitions for managed ref:
To --> A P P(RI) M D D(RI) D(RI/M) KLL REI RES
A y n y y n y y y y y
P n n n n y n n y n n
M n n n y n n y n y y
P(RI) y n y y n y y y y y
D(RI) y n y y n y y - y y
D(RI/M) n n n y n n y - y y
Modes:
A - Atomic P - PerCPU M - Managed P(RI) - PerCPU with ReInit
D(RI) - Dead with ReInit D(RI/M) - Dead with ReInit and Managed
PerCPU Ref Ops:
KLL - Kill REI - Reinit RES - Resurrect
Once a percpu ref is switched to managed mode, it cannot be switched to
any other active mode. On reinit/resurrect, managed ref is reinitialized
in managed mode.
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
.../admin-guide/kernel-parameters.txt | 12 +
include/linux/percpu-refcount.h | 13 +
lib/percpu-refcount.c | 358 +++++++++++++++++-
3 files changed, 364 insertions(+), 19 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 09126bb8cc9f..0f02a1b04fe9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4665,6 +4665,18 @@
allocator. This parameter is primarily for debugging
and performance comparison.
+ percpu_refcount.max_scan_count= [KNL]
+ Specifies the maximum number of percpu ref nodes which
+ are processed in one run of percpu ref manager thread.
+
+ Default: 100
+
+ percpu_refcount.scan_interval= [KNL]
+ Specifies the duration (ms) between two runs of manager
+ thread.
+
+ Default: 5000 ms
+
pirq= [SMP,APIC] Manual mp-table setup
See Documentation/arch/x86/i386/IO-APIC.rst.
diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index d73a1c08c3e3..e6aea81b3d01 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -68,6 +68,11 @@ enum {
__PERCPU_REF_FLAG_BITS = 2,
};
+/* Auxiliary flags */
+enum {
+ __PERCPU_REL_MANAGED = 1LU << 0, /* operating in managed mode */
+};
+
/* @flags for percpu_ref_init() */
enum {
/*
@@ -90,6 +95,10 @@ enum {
* Allow switching from atomic mode to percpu mode.
*/
PERCPU_REF_ALLOW_REINIT = 1 << 2,
+ /*
+ * Manage release of the percpu ref.
+ */
+ PERCPU_REF_REL_MANAGED = 1 << 3,
};
struct percpu_ref_data {
@@ -100,6 +109,9 @@ struct percpu_ref_data {
bool allow_reinit:1;
struct rcu_head rcu;
struct percpu_ref *ref;
+ unsigned int aux_flags;
+ struct llist_node node;
+
};
struct percpu_ref {
@@ -126,6 +138,7 @@ void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
percpu_ref_func_t *confirm_switch);
void percpu_ref_switch_to_atomic_sync(struct percpu_ref *ref);
void percpu_ref_switch_to_percpu(struct percpu_ref *ref);
+int percpu_ref_switch_to_managed(struct percpu_ref *ref);
void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
percpu_ref_func_t *confirm_kill);
void percpu_ref_resurrect(struct percpu_ref *ref);
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 668f6aa6a75d..7b97f9728c5b 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -5,6 +5,9 @@
#include <linux/sched.h>
#include <linux/wait.h>
#include <linux/slab.h>
+#include <linux/llist.h>
+#include <linux/moduleparam.h>
+#include <linux/types.h>
#include <linux/mm.h>
#include <linux/percpu-refcount.h>
@@ -38,6 +41,7 @@
static DEFINE_SPINLOCK(percpu_ref_switch_lock);
static DECLARE_WAIT_QUEUE_HEAD(percpu_ref_switch_waitq);
+static LLIST_HEAD(percpu_ref_manage_head);
static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
{
@@ -45,6 +49,8 @@ static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
(ref->percpu_count_ptr & ~__PERCPU_REF_ATOMIC_DEAD);
}
+int percpu_ref_switch_to_managed(struct percpu_ref *ref);
+
/**
* percpu_ref_init - initialize a percpu refcount
* @ref: percpu_ref to initialize
@@ -80,6 +86,9 @@ int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
return -ENOMEM;
}
+ if (flags & PERCPU_REF_REL_MANAGED)
+ flags |= PERCPU_REF_ALLOW_REINIT;
+
data->force_atomic = flags & PERCPU_REF_INIT_ATOMIC;
data->allow_reinit = flags & PERCPU_REF_ALLOW_REINIT;
@@ -101,10 +110,73 @@ int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
data->confirm_switch = NULL;
data->ref = ref;
ref->data = data;
+ init_llist_node(&data->node);
+
+ if (flags & PERCPU_REF_REL_MANAGED)
+ percpu_ref_switch_to_managed(ref);
+
return 0;
}
EXPORT_SYMBOL_GPL(percpu_ref_init);
+static bool percpu_ref_is_managed(struct percpu_ref *ref)
+{
+ return (ref->data->aux_flags & __PERCPU_REL_MANAGED) != 0;
+}
+
+static void __percpu_ref_switch_mode(struct percpu_ref *ref,
+ percpu_ref_func_t *confirm_switch);
+
+static int __percpu_ref_switch_to_managed(struct percpu_ref *ref)
+{
+ unsigned long __percpu *percpu_count;
+ struct percpu_ref_data *data;
+ int ret = -1;
+
+ data = ref->data;
+
+ if (WARN_ONCE(!percpu_ref_tryget(ref), "Percpu ref is not active"))
+ return ret;
+
+ if (WARN_ONCE(!data->allow_reinit, "Percpu ref does not allow switch"))
+ goto err_switch_managed;
+
+ if (WARN_ONCE(percpu_ref_is_managed(ref), "Percpu ref is already managed"))
+ goto err_switch_managed;
+
+ data->aux_flags |= __PERCPU_REL_MANAGED;
+ data->force_atomic = false;
+ if (!__ref_is_percpu(ref, &percpu_count))
+ __percpu_ref_switch_mode(ref, NULL);
+ /* Ensure ordering of percpu mode switch and node scan */
+ smp_mb();
+ llist_add(&data->node, &percpu_ref_manage_head);
+
+ return 0;
+
+err_switch_managed:
+ percpu_ref_put(ref);
+ return ret;
+}
+
+/**
+ * percpu_ref_switch_to_managed - Switch an unmanaged ref to percpu mode.
+ *
+ * @ref: percpu_ref to switch to managed mode
+ *
+ */
+int percpu_ref_switch_to_managed(struct percpu_ref *ref)
+{
+ unsigned long flags;
+ int ret;
+
+ spin_lock_irqsave(&percpu_ref_switch_lock, flags);
+ ret = __percpu_ref_switch_to_managed(ref);
+ spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(percpu_ref_switch_to_managed);
+
static void __percpu_ref_exit(struct percpu_ref *ref)
{
unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
@@ -283,6 +355,27 @@ static void __percpu_ref_switch_mode(struct percpu_ref *ref,
__percpu_ref_switch_to_percpu(ref);
}
+static bool __percpu_ref_switch_to_atomic_checked(struct percpu_ref *ref,
+ percpu_ref_func_t *confirm_switch,
+ bool check_managed)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&percpu_ref_switch_lock, flags);
+ if (check_managed && WARN_ONCE(percpu_ref_is_managed(ref),
+ "Percpu ref is managed, cannot switch to atomic mode")) {
+ spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
+ return false;
+ }
+
+ ref->data->force_atomic = true;
+ __percpu_ref_switch_mode(ref, confirm_switch);
+
+ spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
+
+ return true;
+}
+
/**
* percpu_ref_switch_to_atomic - switch a percpu_ref to atomic mode
* @ref: percpu_ref to switch to atomic mode
@@ -306,17 +399,16 @@ static void __percpu_ref_switch_mode(struct percpu_ref *ref,
void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
percpu_ref_func_t *confirm_switch)
{
- unsigned long flags;
-
- spin_lock_irqsave(&percpu_ref_switch_lock, flags);
-
- ref->data->force_atomic = true;
- __percpu_ref_switch_mode(ref, confirm_switch);
-
- spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
+ (void)__percpu_ref_switch_to_atomic_checked(ref, confirm_switch, true);
}
EXPORT_SYMBOL_GPL(percpu_ref_switch_to_atomic);
+static void __percpu_ref_switch_to_atomic_sync_checked(struct percpu_ref *ref, bool check_managed)
+{
+ if (!__percpu_ref_switch_to_atomic_checked(ref, NULL, check_managed))
+ return;
+ wait_event(percpu_ref_switch_waitq, !ref->data->confirm_switch);
+}
/**
* percpu_ref_switch_to_atomic_sync - switch a percpu_ref to atomic mode
* @ref: percpu_ref to switch to atomic mode
@@ -327,11 +419,28 @@ EXPORT_SYMBOL_GPL(percpu_ref_switch_to_atomic);
*/
void percpu_ref_switch_to_atomic_sync(struct percpu_ref *ref)
{
- percpu_ref_switch_to_atomic(ref, NULL);
- wait_event(percpu_ref_switch_waitq, !ref->data->confirm_switch);
+ __percpu_ref_switch_to_atomic_sync_checked(ref, true);
}
EXPORT_SYMBOL_GPL(percpu_ref_switch_to_atomic_sync);
+static void __percpu_ref_switch_to_percpu_checked(struct percpu_ref *ref, bool check_managed)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&percpu_ref_switch_lock, flags);
+
+ if (check_managed && WARN_ONCE(percpu_ref_is_managed(ref),
+ "Percpu ref is managed, cannot switch to percpu mode")) {
+ spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
+ return;
+ }
+
+ ref->data->force_atomic = false;
+ __percpu_ref_switch_mode(ref, NULL);
+
+ spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
+}
+
/**
* percpu_ref_switch_to_percpu - switch a percpu_ref to percpu mode
* @ref: percpu_ref to switch to percpu mode
@@ -352,14 +461,7 @@ EXPORT_SYMBOL_GPL(percpu_ref_switch_to_atomic_sync);
*/
void percpu_ref_switch_to_percpu(struct percpu_ref *ref)
{
- unsigned long flags;
-
- spin_lock_irqsave(&percpu_ref_switch_lock, flags);
-
- ref->data->force_atomic = false;
- __percpu_ref_switch_mode(ref, NULL);
-
- spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
+ __percpu_ref_switch_to_percpu_checked(ref, true);
}
EXPORT_SYMBOL_GPL(percpu_ref_switch_to_percpu);
@@ -472,8 +574,226 @@ void percpu_ref_resurrect(struct percpu_ref *ref)
ref->percpu_count_ptr &= ~__PERCPU_REF_DEAD;
percpu_ref_get(ref);
- __percpu_ref_switch_mode(ref, NULL);
+ if (percpu_ref_is_managed(ref)) {
+ ref->data->aux_flags &= ~__PERCPU_REL_MANAGED;
+ __percpu_ref_switch_to_managed(ref);
+ } else {
+ __percpu_ref_switch_mode(ref, NULL);
+ }
spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
}
EXPORT_SYMBOL_GPL(percpu_ref_resurrect);
+
+#define DEFAULT_SCAN_INTERVAL_MS 5000
+/* Interval duration between two ref scans. */
+static ulong scan_interval = DEFAULT_SCAN_INTERVAL_MS;
+module_param(scan_interval, ulong, 0444);
+
+#define DEFAULT_MAX_SCAN_COUNT 100
+/* Number of percpu refs scanned in one iteration of worker execution. */
+static int max_scan_count = DEFAULT_MAX_SCAN_COUNT;
+module_param(max_scan_count, int, 0444);
+
+static void percpu_ref_release_work_fn(struct work_struct *work);
+
+/*
+ * Sentinel llist nodes for lockless list traveral and deletions by
+ * the pcpu ref release worker, while nodes are added from
+ * percpu_ref_init() and percpu_ref_switch_to_managed().
+ *
+ * Sentinel node marks the head of list traversal for the current
+ * iteration of kworker execution.
+ */
+struct percpu_ref_sen_node {
+ bool inuse;
+ struct llist_node node;
+};
+
+/*
+ * We need two sentinel nodes for lockless list manipulations from release
+ * worker - first node will be used in current reclaim iteration. The second
+ * node will be used in next iteration. Next iteration marks the first node
+ * as free, for use in subsequent iteration.
+ */
+#define PERCPU_REF_SEN_NODES_COUNT 2
+
+/* Track last processed percpu ref node */
+static struct llist_node *last_percpu_ref_node;
+
+static struct percpu_ref_sen_node
+ percpu_ref_sen_nodes[PERCPU_REF_SEN_NODES_COUNT];
+
+static DECLARE_DELAYED_WORK(percpu_ref_release_work, percpu_ref_release_work_fn);
+
+static bool percpu_ref_is_sen_node(struct llist_node *node)
+{
+ return &percpu_ref_sen_nodes[0].node <= node &&
+ node <= &percpu_ref_sen_nodes[PERCPU_REF_SEN_NODES_COUNT - 1].node;
+}
+
+static struct llist_node *percpu_ref_get_sen_node(void)
+{
+ int i;
+ struct percpu_ref_sen_node *sn;
+
+ for (i = 0; i < PERCPU_REF_SEN_NODES_COUNT; i++) {
+ sn = &percpu_ref_sen_nodes[i];
+ if (!sn->inuse) {
+ sn->inuse = true;
+ return &sn->node;
+ }
+ }
+
+ return NULL;
+}
+
+static void percpu_ref_put_sen_node(struct llist_node *node)
+{
+ struct percpu_ref_sen_node *sn = container_of(node, struct percpu_ref_sen_node, node);
+
+ sn->inuse = false;
+ init_llist_node(node);
+}
+
+static void percpu_ref_put_all_sen_nodes_except(struct llist_node *node)
+{
+ int i;
+
+ for (i = 0; i < PERCPU_REF_SEN_NODES_COUNT; i++) {
+ if (&percpu_ref_sen_nodes[i].node == node)
+ continue;
+ percpu_ref_sen_nodes[i].inuse = false;
+ init_llist_node(&percpu_ref_sen_nodes[i].node);
+ }
+}
+
+static struct workqueue_struct *percpu_ref_release_wq;
+
+static void percpu_ref_release_work_fn(struct work_struct *work)
+{
+ struct llist_node *pos, *first, *head, *prev, *next;
+ struct llist_node *sen_node;
+ struct percpu_ref *ref;
+ int count = 0;
+ bool held;
+
+ first = READ_ONCE(percpu_ref_manage_head.first);
+ if (!first)
+ goto queue_release_work;
+
+ /*
+ * Enqueue a dummy node to mark the start of scan. This dummy
+ * node is used as start point of scan and ensures that
+ * there is no additional synchronization required with new
+ * label node additions to the llist. Any new labels will
+ * be processed in next run of the kworker.
+ *
+ * SCAN START PTR
+ * |
+ * v
+ * +----------+ +------+ +------+ +------+
+ * | | | | | | | |
+ * | head ------> dummy|--->|label |--->| label|--->NULL
+ * | | | node | | | | |
+ * +----------+ +------+ +------+ +------+
+ *
+ *
+ * New label addition:
+ *
+ * SCAN START PTR
+ * |
+ * v
+ * +----------+ +------+ +------+ +------+ +------+
+ * | | | | | | | | | |
+ * | head |--> label|--> dummy|--->|label |--->| label|--->NULL
+ * | | | | | node | | | | |
+ * +----------+ +------+ +------+ +------+ +------+
+ *
+ */
+ if (last_percpu_ref_node == NULL || last_percpu_ref_node->next == NULL) {
+retry_sentinel_get:
+ sen_node = percpu_ref_get_sen_node();
+ /*
+ * All sentinel nodes are in use? This should not happen, as we
+ * require only one sentinel for the start of list traversal and
+ * other sentinel node is freed during the traversal.
+ */
+ if (WARN_ONCE(!sen_node, "All sentinel nodes are in use")) {
+ /* Use first node as the sentinel node */
+ head = first->next;
+ if (!head) {
+ struct llist_node *ign_node = NULL;
+ /*
+ * We exhausted sentinel nodes. However, there aren't
+ * enough nodes in the llist. So, we have leaked
+ * sentinel nodes. Reclaim sentinels and retry.
+ */
+ if (percpu_ref_is_sen_node(first))
+ ign_node = first;
+ percpu_ref_put_all_sen_nodes_except(ign_node);
+ goto retry_sentinel_get;
+ }
+ prev = first;
+ } else {
+ llist_add(sen_node, &percpu_ref_manage_head);
+ prev = sen_node;
+ head = prev->next;
+ }
+ } else {
+ prev = last_percpu_ref_node;
+ head = prev->next;
+ }
+
+ last_percpu_ref_node = NULL;
+ llist_for_each_safe(pos, next, head) {
+ /* Free sentinel node which is present in the list */
+ if (percpu_ref_is_sen_node(pos)) {
+ prev->next = pos->next;
+ percpu_ref_put_sen_node(pos);
+ continue;
+ }
+
+ ref = container_of(pos, struct percpu_ref_data, node)->ref;
+ __percpu_ref_switch_to_atomic_sync_checked(ref, false);
+ /*
+ * Drop the ref while in RCU read critical section to
+ * prevent obj free while we manipulating node.
+ */
+ rcu_read_lock();
+ percpu_ref_put(ref);
+ held = percpu_ref_tryget(ref);
+ if (!held) {
+ prev->next = pos->next;
+ init_llist_node(pos);
+ ref->percpu_count_ptr |= __PERCPU_REF_DEAD;
+ }
+ rcu_read_unlock();
+ if (!held)
+ continue;
+ __percpu_ref_switch_to_percpu_checked(ref, false);
+ count++;
+ if (count == max_scan_count) {
+ last_percpu_ref_node = pos;
+ break;
+ }
+ prev = pos;
+ }
+
+queue_release_work:
+ queue_delayed_work(percpu_ref_release_wq, &percpu_ref_release_work,
+ scan_interval);
+}
+
+static __init int percpu_ref_setup(void)
+{
+ percpu_ref_release_wq = alloc_workqueue("percpu_ref_release_wq",
+ WQ_UNBOUND | WQ_MEM_RECLAIM | WQ_FREEZABLE, 0);
+ if (!percpu_ref_release_wq)
+ return -ENOMEM;
+
+ queue_delayed_work(percpu_ref_release_wq, &percpu_ref_release_work,
+ scan_interval);
+ return 0;
+}
+early_initcall(percpu_ref_setup);
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* [RFC 2/6] percpu-refcount: Add torture test for percpu refcount
2024-09-16 5:08 [RFC 0/6] Managed Percpu Refcount Neeraj Upadhyay
2024-09-16 5:08 ` [RFC 1/6] percpu-refcount: Add managed mode for RCU released objects Neeraj Upadhyay
@ 2024-09-16 5:08 ` Neeraj Upadhyay
2024-09-19 0:47 ` kernel test robot
2024-09-16 5:08 ` [RFC 3/6] percpu-refcount: Extend managed mode to allow runtime switching Neeraj Upadhyay
` (3 subsequent siblings)
5 siblings, 1 reply; 12+ messages in thread
From: Neeraj Upadhyay @ 2024-09-16 5:08 UTC (permalink / raw)
To: linux-kernel
Cc: john.johansen, paul, jmorris, serge, linux-security-module,
gautham.shenoy, Santosh.Shukla, Ananth.Narayan,
Raghavendra.KodsaraThimmappa, paulmck, boqun.feng, vinicius.gomes,
mjguzik, dennis, tj, cl, linux-mm, rcu
Add torture test to verify percpu managed mode operations,
verifying that a percpu ref does not have non-zero count when
all users have dropped their reference and that there is no early
release of the ref while users hold references to it.
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
.../admin-guide/kernel-parameters.txt | 57 +++
lib/Kconfig.debug | 9 +
lib/Makefile | 1 +
lib/percpu-refcount-torture.c | 367 ++++++++++++++++++
lib/percpu-refcount.c | 49 ++-
lib/percpu-refcount.h | 6 +
6 files changed, 483 insertions(+), 6 deletions(-)
create mode 100644 lib/percpu-refcount-torture.c
create mode 100644 lib/percpu-refcount.h
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 0f02a1b04fe9..225f2dac294d 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4677,6 +4677,63 @@
Default: 5000 ms
+ percpu_refcount_torture.busted_early_ref_release= [KNL]
+ Enable testing buggy release of percpu ref while
+ there are active users. Used for testing failure
+ scenarios in the test.
+
+ Default: 0 (disabled)
+
+ percpu_refcount_torture.busted_late_ref_release= [KNL]
+ Enable testing buggy non-zero reference count after
+ all ref users have dropped their reference. Used for
+ testing failure scenarios in the test.
+
+ Default: 0 (disabled)
+
+ percpu_refcount_torture.delay_us= [KNL]
+ Delay (in us) between reference increment and decrement
+ operations of ref users.
+
+ Default: 10
+
+ percpu_refcount_torture.niterations= [KNL]
+ Number of iterations of ref increment and decrement by
+ ref users.
+
+ Default: 100
+
+ percpu_refcount_torture.nrefs= [KNL]
+ Number of percpu ref instances.
+
+ Default: 2
+
+ percpu_refcount_torture.nusers= [KNL]
+ Number of percpu ref user threads which increment and
+ decrement a percpu ref.
+
+ Default: 2
+
+ percpu_refcount_torture.onoff_holdoff= [KNL]
+ Set time (s) after boot for CPU-hotplug testing.
+
+ Default: 0
+
+ percpu_refcount_torture.onoff_interval= [KNL]
+ Set time (jiffies) between CPU-hotplug operations,
+ or zero to disable CPU-hotplug testing.
+
+ percpu_refcount_torture.stutter= [KNL]
+ Set wait time (jiffies) between two iterations of
+ percpu ref operations.
+
+ Default: 0
+
+ percpu_refcount_torture.verbose= [KNL]
+ Enable additional printk() statements.
+
+ Default: 0 (Disabled)
+
pirq= [SMP,APIC] Manual mp-table setup
See Documentation/arch/x86/i386/IO-APIC.rst.
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a30c03a66172..7e0117e01f05 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1611,6 +1611,15 @@ config SCF_TORTURE_TEST
module may be built after the fact on the running kernel to
be tested, if desired.
+config PERCPU_REFCOUNT_TORTURE_TEST
+ tristate "torture tests for percpu refcount"
+ select TORTURE_TEST
+ help
+ This options provides a kernel module that runs percpu
+ refcount torture tests for managed percpu refs. The kernel
+ module may be built after the fact on the running kernel
+ to be tested, if desired.
+
config CSD_LOCK_WAIT_DEBUG
bool "Debugging for csd_lock_wait(), called from smp_call_function*()"
depends on DEBUG_KERNEL
diff --git a/lib/Makefile b/lib/Makefile
index 322bb127b4dc..d0286f7dfb37 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -50,6 +50,7 @@ obj-y += bcd.o sort.o parser.o debug_locks.o random32.o \
once.o refcount.o rcuref.o usercopy.o errseq.o bucket_locks.o \
generic-radix-tree.o bitmap-str.o
obj-$(CONFIG_STRING_KUNIT_TEST) += string_kunit.o
+obj-$(CONFIG_PERCPU_REFCOUNT_TORTURE_TEST) += percpu-refcount-torture.o
obj-y += string_helpers.o
obj-$(CONFIG_STRING_HELPERS_KUNIT_TEST) += string_helpers_kunit.o
obj-y += hexdump.o
diff --git a/lib/percpu-refcount-torture.c b/lib/percpu-refcount-torture.c
new file mode 100644
index 000000000000..686f5a228b40
--- /dev/null
+++ b/lib/percpu-refcount-torture.c
@@ -0,0 +1,367 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/delay.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/percpu-refcount.h>
+#include <linux/torture.h>
+
+#include "percpu-refcount.h"
+
+static int busted_early_ref_release;
+module_param(busted_early_ref_release, int, 0444);
+MODULE_PARM_DESC(busted_early_ref_release,
+ "Enable busted premature release of ref (default = 0), 0 = disable");
+
+static int busted_late_ref_release;
+module_param(busted_late_ref_release, int, 0444);
+MODULE_PARM_DESC(busted_late_ref_release,
+ "Enable busted late release of ref (default = 0), 0 = disable");
+
+static long delay_us = 10;
+module_param(delay_us, long, 0444);
+MODULE_PARM_DESC(delay_us,
+ "delay between reader refcount operations in microseconds (default = 10)");
+
+static long nrefs = 2;
+module_param(nrefs, long, 0444);
+MODULE_PARM_DESC(nrefs, "Number of percpu refs (default = 2)");
+
+static long niterations = 100;
+module_param(niterations, long, 0444);
+MODULE_PARM_DESC(niterations,
+ "Number of iterations of ref increment and decrement (default = 100)");
+
+static long nusers = 2;
+module_param(nusers, long, 0444);
+MODULE_PARM_DESC(nusers, "Number of refcount users (default = 2)");
+
+static int onoff_holdoff;
+module_param(onoff_holdoff, int, 0444);
+MODULE_PARM_DESC(onoff_holdoff, "Time after boot before CPU hotplugs (seconds)");
+
+static int onoff_interval;
+module_param(onoff_interval, int, 0444);
+MODULE_PARM_DESC(onoff_interval, "Time between CPU hotplugs (jiffies), 0=disable");
+
+static int stutter;
+module_param(stutter, int, 0444);
+MODULE_PARM_DESC(stutter, "Stutter period in jiffies (default = 0), 0 = disable");
+
+static int verbose = 1;
+module_param(verbose, int, 0444);
+MODULE_PARM_DESC(verbose, "Enable verbose debugging printk()s");
+
+static struct task_struct **ref_user_tasks;
+static struct task_struct *ref_manager_task;
+static struct task_struct **busted_early_release_tasks;
+static struct task_struct **busted_late_release_tasks;
+
+static struct percpu_ref *refs;
+static long *num_per_ref_users;
+
+static atomic_t running;
+static atomic_t *ref_running;
+
+static char *torture_type = "percpu-refcount";
+
+static int percpu_ref_manager_thread(void *data)
+{
+ int i;
+
+ while (atomic_read(&running) != 0) {
+ percpu_ref_test_flush_release_work();
+ stutter_wait("percpu_ref_manager_thread");
+ }
+ /* Ensure ordering with ref users */
+ smp_mb();
+
+ percpu_ref_test_flush_release_work();
+
+ for (i = 0; i < nrefs; i++) {
+ WARN(percpu_ref_test_is_percpu(&refs[i]),
+ "!!! released ref %d should be in atomic mode", i);
+ WARN(!percpu_ref_is_zero(&refs[i]),
+ "!!! released ref %d should have 0 refcount", i);
+ }
+
+ do {
+ stutter_wait("percpu_ref_manager_thread");
+ } while (!torture_must_stop());
+
+ torture_kthread_stopping("percpu_ref_manager_thread");
+
+ return 0;
+}
+
+static int percpu_ref_test_thread(void *data)
+{
+ struct percpu_ref *ref = (struct percpu_ref *)data;
+ int i = 0;
+
+ percpu_ref_get(ref);
+
+ do {
+ percpu_ref_get(ref);
+ udelay(delay_us);
+ percpu_ref_put(ref);
+ stutter_wait("percpu_ref_test_thread");
+ i++;
+ } while (i < niterations);
+
+ atomic_dec(&ref_running[ref - refs]);
+ /* Order ref release with ref_running[ref_idx] == 0 */
+ smp_mb();
+ percpu_ref_put(ref);
+ /* Order ref decrement with running == 0 */
+ smp_mb();
+ atomic_dec(&running);
+
+ do {
+ stutter_wait("percpu_ref_test_thread");
+ } while (!torture_must_stop());
+
+ torture_kthread_stopping("percpu_ref_test_thread");
+
+ return 0;
+}
+
+static int percpu_ref_busted_early_thread(void *data)
+{
+ struct percpu_ref *ref = (struct percpu_ref *)data;
+ int ref_idx = ref - refs;
+ int i = 0, j;
+
+ do {
+ /* Extra ref put momemtarily */
+ for (j = 0; j < num_per_ref_users[ref_idx]; j++)
+ percpu_ref_put(ref);
+ stutter_wait("percpu_ref_busted_early_thread");
+ for (j = 0; j < num_per_ref_users[ref_idx]; j++)
+ percpu_ref_get(ref);
+ i++;
+ stutter_wait("percpu_ref_busted_early_thread");
+ } while (i < niterations * 10);
+
+ do {
+ stutter_wait("percpu_ref_busted_early_thread");
+ } while (!torture_must_stop());
+
+ torture_kthread_stopping("percpu_ref_busted_early_thread");
+
+ return 0;
+}
+
+static int percpu_ref_busted_late_thread(void *data)
+{
+ struct percpu_ref *ref = (struct percpu_ref *)data;
+ int i = 0;
+
+ do {
+ /* Extra ref get momemtarily */
+ percpu_ref_get(ref);
+ stutter_wait("percpu_ref_busted_late_thread");
+ percpu_ref_put(ref);
+ i++;
+ } while (i < niterations);
+
+ do {
+ stutter_wait("percpu_ref_busted_late_thread");
+ } while (!torture_must_stop());
+
+ torture_kthread_stopping("percpu_ref_busted_late_thread");
+
+ return 0;
+}
+
+static void percpu_ref_test_cleanup(void)
+{
+ int i;
+
+ if (torture_cleanup_begin())
+ return;
+
+ if (busted_late_release_tasks) {
+ for (i = 0; i < nrefs; i++)
+ torture_stop_kthread(busted_late_task, busted_late_release_tasks[i]);
+ kfree(busted_late_release_tasks);
+ busted_late_release_tasks = NULL;
+ }
+
+ if (busted_early_release_tasks) {
+ for (i = 0; i < nrefs; i++)
+ torture_stop_kthread(busted_early_task, busted_early_release_tasks[i]);
+ kfree(busted_early_release_tasks);
+ busted_early_release_tasks = NULL;
+ }
+
+ if (ref_manager_task) {
+ torture_stop_kthread(ref_manager, ref_manager_task);
+ ref_manager_task = NULL;
+ }
+
+ if (ref_user_tasks) {
+ for (i = 0; i < nusers; i++)
+ torture_stop_kthread(ref_user, ref_user_tasks[i]);
+ kfree(ref_user_tasks);
+ ref_user_tasks = NULL;
+ }
+
+ kfree(ref_running);
+ ref_running = NULL;
+
+ kfree(num_per_ref_users);
+ num_per_ref_users = NULL;
+
+ if (refs) {
+ for (i = 0; i < nrefs; i++)
+ percpu_ref_exit(&refs[i]);
+ kfree(refs);
+ refs = NULL;
+ }
+
+ torture_cleanup_end();
+}
+
+static void percpu_ref_test_release(struct percpu_ref *ref)
+{
+ WARN(!!atomic_add_return(0, &ref_running[ref-refs]), "!!! Premature ref release");
+}
+
+static int __init percpu_ref_torture_init(void)
+{
+ DEFINE_TORTURE_RANDOM(rand);
+ struct torture_random_state *trsp = &rand;
+ int flags;
+ int err;
+ int ref_idx;
+ int i;
+
+ if (!torture_init_begin("percpu-refcount", verbose))
+ return -EBUSY;
+
+ atomic_set(&running, nusers);
+ /* Order @running with later increment and decrement operations */
+ smp_mb();
+
+ refs = kcalloc(nrefs, sizeof(refs[0]), GFP_KERNEL);
+ if (!refs) {
+ TOROUT_ERRSTRING("out of memory");
+ err = -ENOMEM;
+ goto init_err;
+ }
+ for (i = 0; i < nrefs; i++) {
+ flags = torture_random(trsp) & 1 ? PERCPU_REF_INIT_ATOMIC : PERCPU_REF_REL_MANAGED;
+ err = percpu_ref_init(&refs[i], percpu_ref_test_release,
+ flags, GFP_KERNEL);
+ if (err)
+ goto init_err;
+ if (!(flags & PERCPU_REF_REL_MANAGED))
+ percpu_ref_switch_to_managed(&refs[i]);
+ }
+
+ num_per_ref_users = kcalloc(nrefs, sizeof(num_per_ref_users[0]), GFP_KERNEL);
+ if (!num_per_ref_users) {
+ TOROUT_ERRSTRING("out of memory");
+ err = -ENOMEM;
+ goto init_err;
+ }
+ for (i = 0; i < nrefs; i++)
+ num_per_ref_users[i] = 0;
+
+ ref_user_tasks = kcalloc(nusers, sizeof(ref_user_tasks[0]), GFP_KERNEL);
+ if (!ref_user_tasks) {
+ TOROUT_ERRSTRING("out of memory");
+ err = -ENOMEM;
+ goto init_err;
+ }
+
+ ref_running = kcalloc(nrefs, sizeof(ref_running[0]), GFP_KERNEL);
+ if (!ref_running) {
+ TOROUT_ERRSTRING("out of memory");
+ err = -ENOMEM;
+ goto init_err;
+ }
+
+ for (i = 0; i < nusers; i++) {
+ ref_idx = torture_random(trsp) % nrefs;
+ atomic_inc(&ref_running[ref_idx]);
+ num_per_ref_users[ref_idx]++;
+ /* Order increments with subquent reads */
+ smp_mb();
+ err = torture_create_kthread(percpu_ref_test_thread,
+ &refs[ref_idx], ref_user_tasks[i]);
+ if (torture_init_error(err))
+ goto init_err;
+ }
+
+ err = torture_create_kthread(percpu_ref_manager_thread, NULL, ref_manager_task);
+ if (torture_init_error(err))
+ goto init_err;
+
+ /* Drop initial reference, after test threads have started running */
+ udelay(1);
+ for (i = 0; i < nrefs; i++)
+ percpu_ref_put(&refs[i]);
+
+
+ if (busted_early_ref_release) {
+ busted_early_release_tasks = kcalloc(nrefs,
+ sizeof(busted_early_release_tasks[0]),
+ GFP_KERNEL);
+ if (!busted_early_release_tasks) {
+ TOROUT_ERRSTRING("out of memory");
+ err = -ENOMEM;
+ goto init_err;
+ }
+ for (i = 0; i < nrefs; i++) {
+ err = torture_create_kthread(percpu_ref_busted_early_thread,
+ &refs[i], busted_early_release_tasks[i]);
+ if (torture_init_error(err))
+ goto init_err;
+ }
+ }
+
+ if (busted_late_ref_release) {
+ busted_late_release_tasks = kcalloc(nrefs, sizeof(busted_late_release_tasks[0]),
+ GFP_KERNEL);
+ if (!busted_late_release_tasks) {
+ TOROUT_ERRSTRING("out of memory");
+ err = -ENOMEM;
+ goto init_err;
+ }
+ for (i = 0; i < nrefs; i++) {
+ err = torture_create_kthread(percpu_ref_busted_late_thread,
+ &refs[i], busted_late_release_tasks[i]);
+ if (torture_init_error(err))
+ goto init_err;
+ }
+ }
+ if (stutter) {
+ err = torture_stutter_init(stutter, stutter);
+ if (torture_init_error(err))
+ goto init_err;
+ }
+
+ err = torture_onoff_init(onoff_holdoff * HZ, onoff_interval, NULL);
+ if (torture_init_error(err))
+ goto init_err;
+
+ torture_init_end();
+ return 0;
+init_err:
+ torture_init_end();
+ percpu_ref_test_cleanup();
+ return err;
+}
+
+static void __exit percpu_ref_torture_exit(void)
+{
+ percpu_ref_test_cleanup();
+}
+
+module_init(percpu_ref_torture_init);
+module_exit(percpu_ref_torture_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("percpu refcount torture test");
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 7b97f9728c5b..7d0c85c7ce57 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -11,6 +11,8 @@
#include <linux/mm.h>
#include <linux/percpu-refcount.h>
+#include "percpu-refcount.h"
+
/*
* Initially, a percpu refcount is just a set of percpu counters. Initially, we
* don't try to detect the ref hitting 0 - which means that get/put can just
@@ -677,6 +679,7 @@ static void percpu_ref_release_work_fn(struct work_struct *work)
struct percpu_ref *ref;
int count = 0;
bool held;
+ struct llist_node *last_node = READ_ONCE(last_percpu_ref_node);
first = READ_ONCE(percpu_ref_manage_head.first);
if (!first)
@@ -711,7 +714,7 @@ static void percpu_ref_release_work_fn(struct work_struct *work)
* +----------+ +------+ +------+ +------+ +------+
*
*/
- if (last_percpu_ref_node == NULL || last_percpu_ref_node->next == NULL) {
+ if (last_node == NULL || last_node->next == NULL) {
retry_sentinel_get:
sen_node = percpu_ref_get_sen_node();
/*
@@ -741,11 +744,10 @@ static void percpu_ref_release_work_fn(struct work_struct *work)
head = prev->next;
}
} else {
- prev = last_percpu_ref_node;
+ prev = last_node;
head = prev->next;
}
- last_percpu_ref_node = NULL;
llist_for_each_safe(pos, next, head) {
/* Free sentinel node which is present in the list */
if (percpu_ref_is_sen_node(pos)) {
@@ -773,18 +775,53 @@ static void percpu_ref_release_work_fn(struct work_struct *work)
continue;
__percpu_ref_switch_to_percpu_checked(ref, false);
count++;
- if (count == max_scan_count) {
- last_percpu_ref_node = pos;
- break;
+ if (count == READ_ONCE(max_scan_count)) {
+ WRITE_ONCE(last_percpu_ref_node, pos);
+ goto queue_release_work;
}
prev = pos;
}
+ WRITE_ONCE(last_percpu_ref_node, NULL);
queue_release_work:
queue_delayed_work(percpu_ref_release_wq, &percpu_ref_release_work,
scan_interval);
}
+bool percpu_ref_test_is_percpu(struct percpu_ref *ref)
+{
+ unsigned long __percpu *percpu_count;
+
+ return __ref_is_percpu(ref, &percpu_count);
+}
+EXPORT_SYMBOL_GPL(percpu_ref_test_is_percpu);
+
+void percpu_ref_test_flush_release_work(void)
+{
+ int max_flush = READ_ONCE(max_scan_count);
+ int max_count = 1000;
+
+ /* Complete any executing release work */
+ flush_delayed_work(&percpu_ref_release_work);
+ /* Scan till the end of the llist */
+ WRITE_ONCE(max_scan_count, INT_MAX);
+ /* max scan count update visible to release work */
+ smp_mb();
+ flush_delayed_work(&percpu_ref_release_work);
+ /* max scan count update visible to release work */
+ smp_mb();
+ WRITE_ONCE(max_scan_count, 1);
+ /* max scan count update visible to work */
+ smp_mb();
+ flush_delayed_work(&percpu_ref_release_work);
+ while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
+ flush_delayed_work(&percpu_ref_release_work);
+ /* max scan count update visible to work */
+ smp_mb();
+ WRITE_ONCE(max_scan_count, max_flush);
+}
+EXPORT_SYMBOL_GPL(percpu_ref_test_flush_release_work);
+
static __init int percpu_ref_setup(void)
{
percpu_ref_release_wq = alloc_workqueue("percpu_ref_release_wq",
diff --git a/lib/percpu-refcount.h b/lib/percpu-refcount.h
new file mode 100644
index 000000000000..be2ac0411194
--- /dev/null
+++ b/lib/percpu-refcount.h
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+#ifndef __LIB_REFCOUNT_H
+#define __LIB_REFCOUNT_H
+bool percpu_ref_test_is_percpu(struct percpu_ref *ref);
+void percpu_ref_test_flush_release_work(void);
+#endif
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [RFC 2/6] percpu-refcount: Add torture test for percpu refcount
2024-09-16 5:08 ` [RFC 2/6] percpu-refcount: Add torture test for percpu refcount Neeraj Upadhyay
@ 2024-09-19 0:47 ` kernel test robot
0 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2024-09-19 0:47 UTC (permalink / raw)
To: Neeraj Upadhyay; +Cc: oe-kbuild-all
Hi Neeraj,
[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-nonmm-unstable]
[also build test ERROR on linus/master dennis-percpu/for-next v6.11 next-20240918]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Neeraj-Upadhyay/percpu-refcount-Add-managed-mode-for-RCU-released-objects/20240916-131210
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-nonmm-unstable
patch link: https://lore.kernel.org/r/20240916050811.473556-3-Neeraj.Upadhyay%40amd.com
patch subject: [RFC 2/6] percpu-refcount: Add torture test for percpu refcount
config: i386-allyesconfig (https://download.01.org/0day-ci/archive/20240919/202409190803.wpIBcLGg-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240919/202409190803.wpIBcLGg-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409190803.wpIBcLGg-lkp@intel.com/
All error/warnings (new ones prefixed by >>):
lib/percpu-refcount-torture.c: In function 'percpu_ref_test_cleanup':
>> lib/percpu-refcount-torture.c:187:17: error: implicit declaration of function 'kfree' [-Werror=implicit-function-declaration]
187 | kfree(busted_late_release_tasks);
| ^~~~~
lib/percpu-refcount-torture.c: In function 'percpu_ref_torture_init':
>> lib/percpu-refcount-torture.c:247:16: error: implicit declaration of function 'kcalloc' [-Werror=implicit-function-declaration]
247 | refs = kcalloc(nrefs, sizeof(refs[0]), GFP_KERNEL);
| ^~~~~~~
>> lib/percpu-refcount-torture.c:247:14: warning: assignment to 'struct percpu_ref *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
247 | refs = kcalloc(nrefs, sizeof(refs[0]), GFP_KERNEL);
| ^
>> lib/percpu-refcount-torture.c:263:27: warning: assignment to 'long int *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
263 | num_per_ref_users = kcalloc(nrefs, sizeof(num_per_ref_users[0]), GFP_KERNEL);
| ^
>> lib/percpu-refcount-torture.c:272:24: warning: assignment to 'struct task_struct **' from 'int' makes pointer from integer without a cast [-Wint-conversion]
272 | ref_user_tasks = kcalloc(nusers, sizeof(ref_user_tasks[0]), GFP_KERNEL);
| ^
>> lib/percpu-refcount-torture.c:279:21: warning: assignment to 'atomic_t *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
279 | ref_running = kcalloc(nrefs, sizeof(ref_running[0]), GFP_KERNEL);
| ^
lib/percpu-refcount-torture.c:309:44: warning: assignment to 'struct task_struct **' from 'int' makes pointer from integer without a cast [-Wint-conversion]
309 | busted_early_release_tasks = kcalloc(nrefs,
| ^
lib/percpu-refcount-torture.c:326:43: warning: assignment to 'struct task_struct **' from 'int' makes pointer from integer without a cast [-Wint-conversion]
326 | busted_late_release_tasks = kcalloc(nrefs, sizeof(busted_late_release_tasks[0]),
| ^
cc1: some warnings being treated as errors
vim +/kfree +187 lib/percpu-refcount-torture.c
176
177 static void percpu_ref_test_cleanup(void)
178 {
179 int i;
180
181 if (torture_cleanup_begin())
182 return;
183
184 if (busted_late_release_tasks) {
185 for (i = 0; i < nrefs; i++)
186 torture_stop_kthread(busted_late_task, busted_late_release_tasks[i]);
> 187 kfree(busted_late_release_tasks);
188 busted_late_release_tasks = NULL;
189 }
190
191 if (busted_early_release_tasks) {
192 for (i = 0; i < nrefs; i++)
193 torture_stop_kthread(busted_early_task, busted_early_release_tasks[i]);
194 kfree(busted_early_release_tasks);
195 busted_early_release_tasks = NULL;
196 }
197
198 if (ref_manager_task) {
199 torture_stop_kthread(ref_manager, ref_manager_task);
200 ref_manager_task = NULL;
201 }
202
203 if (ref_user_tasks) {
204 for (i = 0; i < nusers; i++)
205 torture_stop_kthread(ref_user, ref_user_tasks[i]);
206 kfree(ref_user_tasks);
207 ref_user_tasks = NULL;
208 }
209
210 kfree(ref_running);
211 ref_running = NULL;
212
213 kfree(num_per_ref_users);
214 num_per_ref_users = NULL;
215
216 if (refs) {
217 for (i = 0; i < nrefs; i++)
218 percpu_ref_exit(&refs[i]);
219 kfree(refs);
220 refs = NULL;
221 }
222
223 torture_cleanup_end();
224 }
225
226 static void percpu_ref_test_release(struct percpu_ref *ref)
227 {
228 WARN(!!atomic_add_return(0, &ref_running[ref-refs]), "!!! Premature ref release");
229 }
230
231 static int __init percpu_ref_torture_init(void)
232 {
233 DEFINE_TORTURE_RANDOM(rand);
234 struct torture_random_state *trsp = &rand;
235 int flags;
236 int err;
237 int ref_idx;
238 int i;
239
240 if (!torture_init_begin("percpu-refcount", verbose))
241 return -EBUSY;
242
243 atomic_set(&running, nusers);
244 /* Order @running with later increment and decrement operations */
245 smp_mb();
246
> 247 refs = kcalloc(nrefs, sizeof(refs[0]), GFP_KERNEL);
248 if (!refs) {
249 TOROUT_ERRSTRING("out of memory");
250 err = -ENOMEM;
251 goto init_err;
252 }
253 for (i = 0; i < nrefs; i++) {
254 flags = torture_random(trsp) & 1 ? PERCPU_REF_INIT_ATOMIC : PERCPU_REF_REL_MANAGED;
255 err = percpu_ref_init(&refs[i], percpu_ref_test_release,
256 flags, GFP_KERNEL);
257 if (err)
258 goto init_err;
259 if (!(flags & PERCPU_REF_REL_MANAGED))
260 percpu_ref_switch_to_managed(&refs[i]);
261 }
262
> 263 num_per_ref_users = kcalloc(nrefs, sizeof(num_per_ref_users[0]), GFP_KERNEL);
264 if (!num_per_ref_users) {
265 TOROUT_ERRSTRING("out of memory");
266 err = -ENOMEM;
267 goto init_err;
268 }
269 for (i = 0; i < nrefs; i++)
270 num_per_ref_users[i] = 0;
271
> 272 ref_user_tasks = kcalloc(nusers, sizeof(ref_user_tasks[0]), GFP_KERNEL);
273 if (!ref_user_tasks) {
274 TOROUT_ERRSTRING("out of memory");
275 err = -ENOMEM;
276 goto init_err;
277 }
278
> 279 ref_running = kcalloc(nrefs, sizeof(ref_running[0]), GFP_KERNEL);
280 if (!ref_running) {
281 TOROUT_ERRSTRING("out of memory");
282 err = -ENOMEM;
283 goto init_err;
284 }
285
286 for (i = 0; i < nusers; i++) {
287 ref_idx = torture_random(trsp) % nrefs;
288 atomic_inc(&ref_running[ref_idx]);
289 num_per_ref_users[ref_idx]++;
290 /* Order increments with subquent reads */
291 smp_mb();
292 err = torture_create_kthread(percpu_ref_test_thread,
293 &refs[ref_idx], ref_user_tasks[i]);
294 if (torture_init_error(err))
295 goto init_err;
296 }
297
298 err = torture_create_kthread(percpu_ref_manager_thread, NULL, ref_manager_task);
299 if (torture_init_error(err))
300 goto init_err;
301
302 /* Drop initial reference, after test threads have started running */
303 udelay(1);
304 for (i = 0; i < nrefs; i++)
305 percpu_ref_put(&refs[i]);
306
307
308 if (busted_early_ref_release) {
309 busted_early_release_tasks = kcalloc(nrefs,
310 sizeof(busted_early_release_tasks[0]),
311 GFP_KERNEL);
312 if (!busted_early_release_tasks) {
313 TOROUT_ERRSTRING("out of memory");
314 err = -ENOMEM;
315 goto init_err;
316 }
317 for (i = 0; i < nrefs; i++) {
318 err = torture_create_kthread(percpu_ref_busted_early_thread,
319 &refs[i], busted_early_release_tasks[i]);
320 if (torture_init_error(err))
321 goto init_err;
322 }
323 }
324
325 if (busted_late_ref_release) {
326 busted_late_release_tasks = kcalloc(nrefs, sizeof(busted_late_release_tasks[0]),
327 GFP_KERNEL);
328 if (!busted_late_release_tasks) {
329 TOROUT_ERRSTRING("out of memory");
330 err = -ENOMEM;
331 goto init_err;
332 }
333 for (i = 0; i < nrefs; i++) {
334 err = torture_create_kthread(percpu_ref_busted_late_thread,
335 &refs[i], busted_late_release_tasks[i]);
336 if (torture_init_error(err))
337 goto init_err;
338 }
339 }
340 if (stutter) {
341 err = torture_stutter_init(stutter, stutter);
342 if (torture_init_error(err))
343 goto init_err;
344 }
345
346 err = torture_onoff_init(onoff_holdoff * HZ, onoff_interval, NULL);
347 if (torture_init_error(err))
348 goto init_err;
349
350 torture_init_end();
351 return 0;
352 init_err:
353 torture_init_end();
354 percpu_ref_test_cleanup();
355 return err;
356 }
357
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 12+ messages in thread
* [RFC 3/6] percpu-refcount: Extend managed mode to allow runtime switching
2024-09-16 5:08 [RFC 0/6] Managed Percpu Refcount Neeraj Upadhyay
2024-09-16 5:08 ` [RFC 1/6] percpu-refcount: Add managed mode for RCU released objects Neeraj Upadhyay
2024-09-16 5:08 ` [RFC 2/6] percpu-refcount: Add torture test for percpu refcount Neeraj Upadhyay
@ 2024-09-16 5:08 ` Neeraj Upadhyay
2024-09-16 12:04 ` kernel test robot
2024-09-16 19:07 ` kernel test robot
2024-09-16 5:08 ` [RFC 4/6] percpu-refcount-torture: Extend test with runtime mode switches Neeraj Upadhyay
` (2 subsequent siblings)
5 siblings, 2 replies; 12+ messages in thread
From: Neeraj Upadhyay @ 2024-09-16 5:08 UTC (permalink / raw)
To: linux-kernel
Cc: john.johansen, paul, jmorris, serge, linux-security-module,
gautham.shenoy, Santosh.Shukla, Ananth.Narayan,
Raghavendra.KodsaraThimmappa, paulmck, boqun.feng, vinicius.gomes,
mjguzik, dennis, tj, cl, linux-mm, rcu
Provide more flexibility in terms of runtime mode switch for a managed
percpu ref. This can be useful for cases when in some scenarios, a
managed ref's object enters shutdown phase. Instead of waiting for
manager thread to process the ref, user can dirctly invoke
percpu_ref_kill() for the ref.
The init modes are same as in existing code. Runtime mode switching
allows switching back a managed ref to unmanaged mode, which allows
transitions to all reinit modes from managed mode.
To --> A P P(RI) M D D(RI) D(RI/M) EX REI RES
A y n y y n y y y y y
P n n n n y n n y n n
M y* n y* y n y* y y* y y
P(RI) y n y y n y y y y y
D(RI) y n y y n y y - y y
D(RI/M) y* n y* y n y* y - y y
Modes:
A - Atomic P - PerCPU M - Managed P(RI) - PerCPU with ReInit
D(RI) - Dead with ReInit D(RI/M) - Dead with ReInit and Managed
PerCPU Ref Ops:
KLL - Kill REI - Reinit RES - Resurrect
(RI) is for modes which are initialized with PERCPU_REF_ALLOW_REINIT.
The transitions shown above are the allowed transitions and can be
indirect transitions. For example, managed ref switches to P(RI) mode
when percpu_ref_switch_to_unmanaged() is called for it. P(RI) mode
can be directly switched to A mode using percpu_ref_switch_to_atomic().
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
include/linux/percpu-refcount.h | 3 +-
lib/percpu-refcount.c | 248 +++++++++++---------------------
2 files changed, 88 insertions(+), 163 deletions(-)
diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index e6aea81b3d01..fe967db431a6 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -110,7 +110,7 @@ struct percpu_ref_data {
struct rcu_head rcu;
struct percpu_ref *ref;
unsigned int aux_flags;
- struct llist_node node;
+ struct list_head node;
};
@@ -139,6 +139,7 @@ void percpu_ref_switch_to_atomic(struct percpu_ref *ref,
void percpu_ref_switch_to_atomic_sync(struct percpu_ref *ref);
void percpu_ref_switch_to_percpu(struct percpu_ref *ref);
int percpu_ref_switch_to_managed(struct percpu_ref *ref);
+void percpu_ref_switch_to_unmanaged(struct percpu_ref *ref);
void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
percpu_ref_func_t *confirm_kill);
void percpu_ref_resurrect(struct percpu_ref *ref);
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 7d0c85c7ce57..b79e36905aa4 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -5,7 +5,7 @@
#include <linux/sched.h>
#include <linux/wait.h>
#include <linux/slab.h>
-#include <linux/llist.h>
+#include <linux/list.h>
#include <linux/moduleparam.h>
#include <linux/types.h>
#include <linux/mm.h>
@@ -43,7 +43,12 @@
static DEFINE_SPINLOCK(percpu_ref_switch_lock);
static DECLARE_WAIT_QUEUE_HEAD(percpu_ref_switch_waitq);
-static LLIST_HEAD(percpu_ref_manage_head);
+static struct list_head percpu_ref_manage_head = LIST_HEAD_INIT(percpu_ref_manage_head);
+/* Spinlock protects node additions/deletions */
+static DEFINE_SPINLOCK(percpu_ref_manage_lock);
+/* Mutex synchronizes node deletions with the node being scanned */
+static DEFINE_MUTEX(percpu_ref_active_switch_mutex);
+static struct list_head *next_percpu_ref_node = &percpu_ref_manage_head;
static unsigned long __percpu *percpu_count_ptr(struct percpu_ref *ref)
{
@@ -112,7 +117,7 @@ int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
data->confirm_switch = NULL;
data->ref = ref;
ref->data = data;
- init_llist_node(&data->node);
+ INIT_LIST_HEAD(&data->node);
if (flags & PERCPU_REF_REL_MANAGED)
percpu_ref_switch_to_managed(ref);
@@ -150,9 +155,9 @@ static int __percpu_ref_switch_to_managed(struct percpu_ref *ref)
data->force_atomic = false;
if (!__ref_is_percpu(ref, &percpu_count))
__percpu_ref_switch_mode(ref, NULL);
- /* Ensure ordering of percpu mode switch and node scan */
- smp_mb();
- llist_add(&data->node, &percpu_ref_manage_head);
+ spin_lock(&percpu_ref_manage_lock);
+ list_add(&data->node, &percpu_ref_manage_head);
+ spin_unlock(&percpu_ref_manage_lock);
return 0;
@@ -162,7 +167,7 @@ static int __percpu_ref_switch_to_managed(struct percpu_ref *ref)
}
/**
- * percpu_ref_switch_to_managed - Switch an unmanaged ref to percpu mode.
+ * percpu_ref_switch_to_managed - Switch an unmanaged ref to percpu managed mode.
*
* @ref: percpu_ref to switch to managed mode
*
@@ -179,6 +184,47 @@ int percpu_ref_switch_to_managed(struct percpu_ref *ref)
}
EXPORT_SYMBOL_GPL(percpu_ref_switch_to_managed);
+/**
+ * percpu_ref_switch_to_unmanaged - Switch a managed ref to percpu mode.
+ *
+ * @ref: percpu_ref to switch back to unmanaged percpu mode
+ *
+ * Must only be called with elevated refcount.
+ */
+void percpu_ref_switch_to_unmanaged(struct percpu_ref *ref)
+{
+ bool mutex_taken = false;
+ struct list_head *node;
+ unsigned long flags;
+
+ might_sleep();
+
+ WARN_ONCE(!percpu_ref_is_managed(ref), "Percpu ref is not managed");
+
+ node = &ref->data->node;
+ spin_lock(&percpu_ref_manage_lock);
+ if (list_empty(node)) {
+ spin_unlock(&percpu_ref_manage_lock);
+ mutex_taken = true;
+ mutex_lock(&percpu_ref_active_switch_mutex);
+ spin_lock(&percpu_ref_manage_lock);
+ }
+
+ if (next_percpu_ref_node == node)
+ next_percpu_ref_node = next_percpu_ref_node->next;
+ list_del_init(node);
+ spin_unlock(&percpu_ref_manage_lock);
+ if (mutex_taken)
+ mutex_unlock(&percpu_ref_active_switch_mutex);
+
+ /* Drop the pseudo-init reference */
+ percpu_ref_put(ref);
+ spin_lock_irqsave(&percpu_ref_switch_lock, flags);
+ ref->data->aux_flags &= ~__PERCPU_REL_MANAGED;
+ spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
+}
+EXPORT_SYMBOL_GPL(percpu_ref_switch_to_unmanaged);
+
static void __percpu_ref_exit(struct percpu_ref *ref)
{
unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
@@ -599,164 +645,35 @@ module_param(max_scan_count, int, 0444);
static void percpu_ref_release_work_fn(struct work_struct *work);
-/*
- * Sentinel llist nodes for lockless list traveral and deletions by
- * the pcpu ref release worker, while nodes are added from
- * percpu_ref_init() and percpu_ref_switch_to_managed().
- *
- * Sentinel node marks the head of list traversal for the current
- * iteration of kworker execution.
- */
-struct percpu_ref_sen_node {
- bool inuse;
- struct llist_node node;
-};
-
-/*
- * We need two sentinel nodes for lockless list manipulations from release
- * worker - first node will be used in current reclaim iteration. The second
- * node will be used in next iteration. Next iteration marks the first node
- * as free, for use in subsequent iteration.
- */
-#define PERCPU_REF_SEN_NODES_COUNT 2
-
-/* Track last processed percpu ref node */
-static struct llist_node *last_percpu_ref_node;
-
-static struct percpu_ref_sen_node
- percpu_ref_sen_nodes[PERCPU_REF_SEN_NODES_COUNT];
-
static DECLARE_DELAYED_WORK(percpu_ref_release_work, percpu_ref_release_work_fn);
-static bool percpu_ref_is_sen_node(struct llist_node *node)
-{
- return &percpu_ref_sen_nodes[0].node <= node &&
- node <= &percpu_ref_sen_nodes[PERCPU_REF_SEN_NODES_COUNT - 1].node;
-}
-
-static struct llist_node *percpu_ref_get_sen_node(void)
-{
- int i;
- struct percpu_ref_sen_node *sn;
-
- for (i = 0; i < PERCPU_REF_SEN_NODES_COUNT; i++) {
- sn = &percpu_ref_sen_nodes[i];
- if (!sn->inuse) {
- sn->inuse = true;
- return &sn->node;
- }
- }
-
- return NULL;
-}
-
-static void percpu_ref_put_sen_node(struct llist_node *node)
-{
- struct percpu_ref_sen_node *sn = container_of(node, struct percpu_ref_sen_node, node);
-
- sn->inuse = false;
- init_llist_node(node);
-}
-
-static void percpu_ref_put_all_sen_nodes_except(struct llist_node *node)
-{
- int i;
-
- for (i = 0; i < PERCPU_REF_SEN_NODES_COUNT; i++) {
- if (&percpu_ref_sen_nodes[i].node == node)
- continue;
- percpu_ref_sen_nodes[i].inuse = false;
- init_llist_node(&percpu_ref_sen_nodes[i].node);
- }
-}
-
static struct workqueue_struct *percpu_ref_release_wq;
static void percpu_ref_release_work_fn(struct work_struct *work)
{
- struct llist_node *pos, *first, *head, *prev, *next;
- struct llist_node *sen_node;
+ struct list_head *node;
struct percpu_ref *ref;
int count = 0;
bool held;
- struct llist_node *last_node = READ_ONCE(last_percpu_ref_node);
- first = READ_ONCE(percpu_ref_manage_head.first);
- if (!first)
+ mutex_lock(&percpu_ref_active_switch_mutex);
+ spin_lock(&percpu_ref_manage_lock);
+ if (list_empty(&percpu_ref_manage_head)) {
+ next_percpu_ref_node = &percpu_ref_manage_head;
+ spin_unlock(&percpu_ref_manage_lock);
+ mutex_unlock(&percpu_ref_active_switch_mutex);
goto queue_release_work;
-
- /*
- * Enqueue a dummy node to mark the start of scan. This dummy
- * node is used as start point of scan and ensures that
- * there is no additional synchronization required with new
- * label node additions to the llist. Any new labels will
- * be processed in next run of the kworker.
- *
- * SCAN START PTR
- * |
- * v
- * +----------+ +------+ +------+ +------+
- * | | | | | | | |
- * | head ------> dummy|--->|label |--->| label|--->NULL
- * | | | node | | | | |
- * +----------+ +------+ +------+ +------+
- *
- *
- * New label addition:
- *
- * SCAN START PTR
- * |
- * v
- * +----------+ +------+ +------+ +------+ +------+
- * | | | | | | | | | |
- * | head |--> label|--> dummy|--->|label |--->| label|--->NULL
- * | | | | | node | | | | |
- * +----------+ +------+ +------+ +------+ +------+
- *
- */
- if (last_node == NULL || last_node->next == NULL) {
-retry_sentinel_get:
- sen_node = percpu_ref_get_sen_node();
- /*
- * All sentinel nodes are in use? This should not happen, as we
- * require only one sentinel for the start of list traversal and
- * other sentinel node is freed during the traversal.
- */
- if (WARN_ONCE(!sen_node, "All sentinel nodes are in use")) {
- /* Use first node as the sentinel node */
- head = first->next;
- if (!head) {
- struct llist_node *ign_node = NULL;
- /*
- * We exhausted sentinel nodes. However, there aren't
- * enough nodes in the llist. So, we have leaked
- * sentinel nodes. Reclaim sentinels and retry.
- */
- if (percpu_ref_is_sen_node(first))
- ign_node = first;
- percpu_ref_put_all_sen_nodes_except(ign_node);
- goto retry_sentinel_get;
- }
- prev = first;
- } else {
- llist_add(sen_node, &percpu_ref_manage_head);
- prev = sen_node;
- head = prev->next;
- }
- } else {
- prev = last_node;
- head = prev->next;
}
+ if (next_percpu_ref_node == &percpu_ref_manage_head)
+ node = percpu_ref_manage_head.next;
+ else
+ node = next_percpu_ref_node;
+ next_percpu_ref_node = node->next;
+ list_del_init(node);
+ spin_unlock(&percpu_ref_manage_lock);
- llist_for_each_safe(pos, next, head) {
- /* Free sentinel node which is present in the list */
- if (percpu_ref_is_sen_node(pos)) {
- prev->next = pos->next;
- percpu_ref_put_sen_node(pos);
- continue;
- }
-
- ref = container_of(pos, struct percpu_ref_data, node)->ref;
+ while (!list_is_head(node, &percpu_ref_manage_head)) {
+ ref = container_of(node, struct percpu_ref_data, node)->ref;
__percpu_ref_switch_to_atomic_sync_checked(ref, false);
/*
* Drop the ref while in RCU read critical section to
@@ -765,24 +682,31 @@ static void percpu_ref_release_work_fn(struct work_struct *work)
rcu_read_lock();
percpu_ref_put(ref);
held = percpu_ref_tryget(ref);
- if (!held) {
- prev->next = pos->next;
- init_llist_node(pos);
+ if (held) {
+ spin_lock(&percpu_ref_manage_lock);
+ list_add(node, &percpu_ref_manage_head);
+ spin_unlock(&percpu_ref_manage_lock);
+ __percpu_ref_switch_to_percpu_checked(ref, false);
+ } else {
ref->percpu_count_ptr |= __PERCPU_REF_DEAD;
}
rcu_read_unlock();
- if (!held)
- continue;
- __percpu_ref_switch_to_percpu_checked(ref, false);
+ mutex_unlock(&percpu_ref_active_switch_mutex);
count++;
- if (count == READ_ONCE(max_scan_count)) {
- WRITE_ONCE(last_percpu_ref_node, pos);
+ if (count == READ_ONCE(max_scan_count))
goto queue_release_work;
+ mutex_lock(&percpu_ref_active_switch_mutex);
+ spin_lock(&percpu_ref_manage_lock);
+ node = next_percpu_ref_node;
+ if (!list_is_head(next_percpu_ref_node, &percpu_ref_manage_head)) {
+ next_percpu_ref_node = next_percpu_ref_node->next;
+ list_del_init(node);
}
- prev = pos;
+ spin_unlock(&percpu_ref_manage_lock);
}
- WRITE_ONCE(last_percpu_ref_node, NULL);
+ mutex_unlock(&percpu_ref_active_switch_mutex);
+
queue_release_work:
queue_delayed_work(percpu_ref_release_wq, &percpu_ref_release_work,
scan_interval);
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [RFC 3/6] percpu-refcount: Extend managed mode to allow runtime switching
2024-09-16 5:08 ` [RFC 3/6] percpu-refcount: Extend managed mode to allow runtime switching Neeraj Upadhyay
@ 2024-09-16 12:04 ` kernel test robot
2024-09-16 19:07 ` kernel test robot
1 sibling, 0 replies; 12+ messages in thread
From: kernel test robot @ 2024-09-16 12:04 UTC (permalink / raw)
To: Neeraj Upadhyay; +Cc: oe-kbuild-all
Hi Neeraj,
[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-nonmm-unstable]
[also build test ERROR on linus/master dennis-percpu/for-next v6.11 next-20240916]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Neeraj-Upadhyay/percpu-refcount-Add-managed-mode-for-RCU-released-objects/20240916-131210
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-nonmm-unstable
patch link: https://lore.kernel.org/r/20240916050811.473556-4-Neeraj.Upadhyay%40amd.com
patch subject: [RFC 3/6] percpu-refcount: Extend managed mode to allow runtime switching
config: x86_64-rhel-8.3-kunit (https://download.01.org/0day-ci/archive/20240916/202409161915.iIfJbiNy-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240916/202409161915.iIfJbiNy-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409161915.iIfJbiNy-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from <command-line>:
lib/percpu-refcount.c: In function 'percpu_ref_test_flush_release_work':
>> lib/percpu-refcount.c:741:26: error: 'last_percpu_ref_node' undeclared (first use in this function); did you mean 'next_percpu_ref_node'?
741 | while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
| ^~~~~~~~~~~~~~~~~~~~
include/linux/compiler_types.h:490:23: note: in definition of macro '__compiletime_assert'
490 | if (!(condition)) \
| ^~~~~~~~~
include/linux/compiler_types.h:510:9: note: in expansion of macro '_compiletime_assert'
510 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^~~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:36:9: note: in expansion of macro 'compiletime_assert'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:36:28: note: in expansion of macro '__native_word'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^~~~~~~~~~~~~
include/asm-generic/rwonce.h:49:9: note: in expansion of macro 'compiletime_assert_rwonce_type'
49 | compiletime_assert_rwonce_type(x); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
lib/percpu-refcount.c:741:16: note: in expansion of macro 'READ_ONCE'
741 | while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
| ^~~~~~~~~
lib/percpu-refcount.c:741:26: note: each undeclared identifier is reported only once for each function it appears in
741 | while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
| ^~~~~~~~~~~~~~~~~~~~
include/linux/compiler_types.h:490:23: note: in definition of macro '__compiletime_assert'
490 | if (!(condition)) \
| ^~~~~~~~~
include/linux/compiler_types.h:510:9: note: in expansion of macro '_compiletime_assert'
510 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^~~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:36:9: note: in expansion of macro 'compiletime_assert'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:36:28: note: in expansion of macro '__native_word'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^~~~~~~~~~~~~
include/asm-generic/rwonce.h:49:9: note: in expansion of macro 'compiletime_assert_rwonce_type'
49 | compiletime_assert_rwonce_type(x); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
lib/percpu-refcount.c:741:16: note: in expansion of macro 'READ_ONCE'
741 | while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
| ^~~~~~~~~
vim +741 lib/percpu-refcount.c
377193f4f16a1b Neeraj Upadhyay 2024-09-16 722
377193f4f16a1b Neeraj Upadhyay 2024-09-16 723 void percpu_ref_test_flush_release_work(void)
377193f4f16a1b Neeraj Upadhyay 2024-09-16 724 {
377193f4f16a1b Neeraj Upadhyay 2024-09-16 725 int max_flush = READ_ONCE(max_scan_count);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 726 int max_count = 1000;
377193f4f16a1b Neeraj Upadhyay 2024-09-16 727
377193f4f16a1b Neeraj Upadhyay 2024-09-16 728 /* Complete any executing release work */
377193f4f16a1b Neeraj Upadhyay 2024-09-16 729 flush_delayed_work(&percpu_ref_release_work);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 730 /* Scan till the end of the llist */
377193f4f16a1b Neeraj Upadhyay 2024-09-16 731 WRITE_ONCE(max_scan_count, INT_MAX);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 732 /* max scan count update visible to release work */
377193f4f16a1b Neeraj Upadhyay 2024-09-16 733 smp_mb();
377193f4f16a1b Neeraj Upadhyay 2024-09-16 734 flush_delayed_work(&percpu_ref_release_work);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 735 /* max scan count update visible to release work */
377193f4f16a1b Neeraj Upadhyay 2024-09-16 736 smp_mb();
377193f4f16a1b Neeraj Upadhyay 2024-09-16 737 WRITE_ONCE(max_scan_count, 1);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 738 /* max scan count update visible to work */
377193f4f16a1b Neeraj Upadhyay 2024-09-16 739 smp_mb();
377193f4f16a1b Neeraj Upadhyay 2024-09-16 740 flush_delayed_work(&percpu_ref_release_work);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 @741 while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
377193f4f16a1b Neeraj Upadhyay 2024-09-16 742 flush_delayed_work(&percpu_ref_release_work);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 743 /* max scan count update visible to work */
377193f4f16a1b Neeraj Upadhyay 2024-09-16 744 smp_mb();
377193f4f16a1b Neeraj Upadhyay 2024-09-16 745 WRITE_ONCE(max_scan_count, max_flush);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 746 }
377193f4f16a1b Neeraj Upadhyay 2024-09-16 747 EXPORT_SYMBOL_GPL(percpu_ref_test_flush_release_work);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 748
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [RFC 3/6] percpu-refcount: Extend managed mode to allow runtime switching
2024-09-16 5:08 ` [RFC 3/6] percpu-refcount: Extend managed mode to allow runtime switching Neeraj Upadhyay
2024-09-16 12:04 ` kernel test robot
@ 2024-09-16 19:07 ` kernel test robot
1 sibling, 0 replies; 12+ messages in thread
From: kernel test robot @ 2024-09-16 19:07 UTC (permalink / raw)
To: Neeraj Upadhyay; +Cc: llvm, oe-kbuild-all
Hi Neeraj,
[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-nonmm-unstable]
[also build test ERROR on linus/master dennis-percpu/for-next v6.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Neeraj-Upadhyay/percpu-refcount-Add-managed-mode-for-RCU-released-objects/20240916-131210
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-nonmm-unstable
patch link: https://lore.kernel.org/r/20240916050811.473556-4-Neeraj.Upadhyay%40amd.com
patch subject: [RFC 3/6] percpu-refcount: Extend managed mode to allow runtime switching
config: x86_64-rhel-8.3-rust (https://download.01.org/0day-ci/archive/20240917/202409170250.ukGxk6pH-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240917/202409170250.ukGxk6pH-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409170250.ukGxk6pH-lkp@intel.com/
All errors (new ones prefixed by >>):
>> lib/percpu-refcount.c:741:19: error: use of undeclared identifier 'last_percpu_ref_node'; did you mean 'next_percpu_ref_node'?
741 | while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
| ^~~~~~~~~~~~~~~~~~~~
| next_percpu_ref_node
include/asm-generic/rwonce.h:49:33: note: expanded from macro 'READ_ONCE'
49 | compiletime_assert_rwonce_type(x); \
| ^
include/asm-generic/rwonce.h:36:35: note: expanded from macro 'compiletime_assert_rwonce_type'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^
include/linux/compiler_types.h:477:10: note: expanded from macro '__native_word'
477 | (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
| ^
include/linux/compiler_types.h:510:22: note: expanded from macro 'compiletime_assert'
510 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
include/linux/compiler_types.h:498:23: note: expanded from macro '_compiletime_assert'
498 | __compiletime_assert(condition, msg, prefix, suffix)
| ^
include/linux/compiler_types.h:490:9: note: expanded from macro '__compiletime_assert'
490 | if (!(condition)) \
| ^
lib/percpu-refcount.c:51:26: note: 'next_percpu_ref_node' declared here
51 | static struct list_head *next_percpu_ref_node = &percpu_ref_manage_head;
| ^
>> lib/percpu-refcount.c:741:19: error: use of undeclared identifier 'last_percpu_ref_node'; did you mean 'next_percpu_ref_node'?
741 | while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
| ^~~~~~~~~~~~~~~~~~~~
| next_percpu_ref_node
include/asm-generic/rwonce.h:49:33: note: expanded from macro 'READ_ONCE'
49 | compiletime_assert_rwonce_type(x); \
| ^
include/asm-generic/rwonce.h:36:35: note: expanded from macro 'compiletime_assert_rwonce_type'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^
include/linux/compiler_types.h:477:39: note: expanded from macro '__native_word'
477 | (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
| ^
include/linux/compiler_types.h:510:22: note: expanded from macro 'compiletime_assert'
510 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
include/linux/compiler_types.h:498:23: note: expanded from macro '_compiletime_assert'
498 | __compiletime_assert(condition, msg, prefix, suffix)
| ^
include/linux/compiler_types.h:490:9: note: expanded from macro '__compiletime_assert'
490 | if (!(condition)) \
| ^
lib/percpu-refcount.c:51:26: note: 'next_percpu_ref_node' declared here
51 | static struct list_head *next_percpu_ref_node = &percpu_ref_manage_head;
| ^
>> lib/percpu-refcount.c:741:19: error: use of undeclared identifier 'last_percpu_ref_node'; did you mean 'next_percpu_ref_node'?
741 | while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
| ^~~~~~~~~~~~~~~~~~~~
| next_percpu_ref_node
include/asm-generic/rwonce.h:49:33: note: expanded from macro 'READ_ONCE'
49 | compiletime_assert_rwonce_type(x); \
| ^
include/asm-generic/rwonce.h:36:35: note: expanded from macro 'compiletime_assert_rwonce_type'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^
include/linux/compiler_types.h:478:10: note: expanded from macro '__native_word'
478 | sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
| ^
include/linux/compiler_types.h:510:22: note: expanded from macro 'compiletime_assert'
510 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
include/linux/compiler_types.h:498:23: note: expanded from macro '_compiletime_assert'
498 | __compiletime_assert(condition, msg, prefix, suffix)
| ^
include/linux/compiler_types.h:490:9: note: expanded from macro '__compiletime_assert'
490 | if (!(condition)) \
| ^
lib/percpu-refcount.c:51:26: note: 'next_percpu_ref_node' declared here
51 | static struct list_head *next_percpu_ref_node = &percpu_ref_manage_head;
| ^
>> lib/percpu-refcount.c:741:19: error: use of undeclared identifier 'last_percpu_ref_node'; did you mean 'next_percpu_ref_node'?
741 | while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
| ^~~~~~~~~~~~~~~~~~~~
| next_percpu_ref_node
include/asm-generic/rwonce.h:49:33: note: expanded from macro 'READ_ONCE'
49 | compiletime_assert_rwonce_type(x); \
| ^
include/asm-generic/rwonce.h:36:35: note: expanded from macro 'compiletime_assert_rwonce_type'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^
include/linux/compiler_types.h:478:38: note: expanded from macro '__native_word'
478 | sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
| ^
include/linux/compiler_types.h:510:22: note: expanded from macro 'compiletime_assert'
510 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
include/linux/compiler_types.h:498:23: note: expanded from macro '_compiletime_assert'
498 | __compiletime_assert(condition, msg, prefix, suffix)
| ^
include/linux/compiler_types.h:490:9: note: expanded from macro '__compiletime_assert'
490 | if (!(condition)) \
| ^
lib/percpu-refcount.c:51:26: note: 'next_percpu_ref_node' declared here
51 | static struct list_head *next_percpu_ref_node = &percpu_ref_manage_head;
| ^
>> lib/percpu-refcount.c:741:19: error: use of undeclared identifier 'last_percpu_ref_node'; did you mean 'next_percpu_ref_node'?
741 | while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
| ^~~~~~~~~~~~~~~~~~~~
| next_percpu_ref_node
include/asm-generic/rwonce.h:49:33: note: expanded from macro 'READ_ONCE'
49 | compiletime_assert_rwonce_type(x); \
| ^
include/asm-generic/rwonce.h:36:48: note: expanded from macro 'compiletime_assert_rwonce_type'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^
include/linux/compiler_types.h:510:22: note: expanded from macro 'compiletime_assert'
510 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
include/linux/compiler_types.h:498:23: note: expanded from macro '_compiletime_assert'
498 | __compiletime_assert(condition, msg, prefix, suffix)
| ^
include/linux/compiler_types.h:490:9: note: expanded from macro '__compiletime_assert'
490 | if (!(condition)) \
| ^
lib/percpu-refcount.c:51:26: note: 'next_percpu_ref_node' declared here
51 | static struct list_head *next_percpu_ref_node = &percpu_ref_manage_head;
| ^
>> lib/percpu-refcount.c:741:19: error: use of undeclared identifier 'last_percpu_ref_node'; did you mean 'next_percpu_ref_node'?
741 | while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
| ^~~~~~~~~~~~~~~~~~~~
| next_percpu_ref_node
include/asm-generic/rwonce.h:50:14: note: expanded from macro 'READ_ONCE'
50 | __READ_ONCE(x); \
| ^
include/asm-generic/rwonce.h:44:65: note: expanded from macro '__READ_ONCE'
44 | #define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
| ^
include/linux/compiler_types.h:466:13: note: expanded from macro '__unqual_scalar_typeof'
466 | _Generic((x), \
| ^
lib/percpu-refcount.c:51:26: note: 'next_percpu_ref_node' declared here
51 | static struct list_head *next_percpu_ref_node = &percpu_ref_manage_head;
| ^
>> lib/percpu-refcount.c:741:19: error: use of undeclared identifier 'last_percpu_ref_node'; did you mean 'next_percpu_ref_node'?
741 | while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
| ^~~~~~~~~~~~~~~~~~~~
| next_percpu_ref_node
include/asm-generic/rwonce.h:50:14: note: expanded from macro 'READ_ONCE'
50 | __READ_ONCE(x); \
| ^
include/asm-generic/rwonce.h:44:65: note: expanded from macro '__READ_ONCE'
44 | #define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
| ^
include/linux/compiler_types.h:473:15: note: expanded from macro '__unqual_scalar_typeof'
473 | default: (x)))
| ^
lib/percpu-refcount.c:51:26: note: 'next_percpu_ref_node' declared here
51 | static struct list_head *next_percpu_ref_node = &percpu_ref_manage_head;
| ^
>> lib/percpu-refcount.c:741:19: error: use of undeclared identifier 'last_percpu_ref_node'; did you mean 'next_percpu_ref_node'?
741 | while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
| ^~~~~~~~~~~~~~~~~~~~
| next_percpu_ref_node
include/asm-generic/rwonce.h:50:14: note: expanded from macro 'READ_ONCE'
50 | __READ_ONCE(x); \
| ^
include/asm-generic/rwonce.h:44:72: note: expanded from macro '__READ_ONCE'
44 | #define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
| ^
lib/percpu-refcount.c:51:26: note: 'next_percpu_ref_node' declared here
51 | static struct list_head *next_percpu_ref_node = &percpu_ref_manage_head;
| ^
8 errors generated.
vim +741 lib/percpu-refcount.c
377193f4f16a1b Neeraj Upadhyay 2024-09-16 722
377193f4f16a1b Neeraj Upadhyay 2024-09-16 723 void percpu_ref_test_flush_release_work(void)
377193f4f16a1b Neeraj Upadhyay 2024-09-16 724 {
377193f4f16a1b Neeraj Upadhyay 2024-09-16 725 int max_flush = READ_ONCE(max_scan_count);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 726 int max_count = 1000;
377193f4f16a1b Neeraj Upadhyay 2024-09-16 727
377193f4f16a1b Neeraj Upadhyay 2024-09-16 728 /* Complete any executing release work */
377193f4f16a1b Neeraj Upadhyay 2024-09-16 729 flush_delayed_work(&percpu_ref_release_work);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 730 /* Scan till the end of the llist */
377193f4f16a1b Neeraj Upadhyay 2024-09-16 731 WRITE_ONCE(max_scan_count, INT_MAX);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 732 /* max scan count update visible to release work */
377193f4f16a1b Neeraj Upadhyay 2024-09-16 733 smp_mb();
377193f4f16a1b Neeraj Upadhyay 2024-09-16 734 flush_delayed_work(&percpu_ref_release_work);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 735 /* max scan count update visible to release work */
377193f4f16a1b Neeraj Upadhyay 2024-09-16 736 smp_mb();
377193f4f16a1b Neeraj Upadhyay 2024-09-16 737 WRITE_ONCE(max_scan_count, 1);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 738 /* max scan count update visible to work */
377193f4f16a1b Neeraj Upadhyay 2024-09-16 739 smp_mb();
377193f4f16a1b Neeraj Upadhyay 2024-09-16 740 flush_delayed_work(&percpu_ref_release_work);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 @741 while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
377193f4f16a1b Neeraj Upadhyay 2024-09-16 742 flush_delayed_work(&percpu_ref_release_work);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 743 /* max scan count update visible to work */
377193f4f16a1b Neeraj Upadhyay 2024-09-16 744 smp_mb();
377193f4f16a1b Neeraj Upadhyay 2024-09-16 745 WRITE_ONCE(max_scan_count, max_flush);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 746 }
377193f4f16a1b Neeraj Upadhyay 2024-09-16 747 EXPORT_SYMBOL_GPL(percpu_ref_test_flush_release_work);
377193f4f16a1b Neeraj Upadhyay 2024-09-16 748
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 12+ messages in thread
* [RFC 4/6] percpu-refcount-torture: Extend test with runtime mode switches
2024-09-16 5:08 [RFC 0/6] Managed Percpu Refcount Neeraj Upadhyay
` (2 preceding siblings ...)
2024-09-16 5:08 ` [RFC 3/6] percpu-refcount: Extend managed mode to allow runtime switching Neeraj Upadhyay
@ 2024-09-16 5:08 ` Neeraj Upadhyay
2024-09-19 2:11 ` kernel test robot
2024-09-16 5:08 ` [RFC 5/6] apparmor: Switch labels to percpu refcount in atomic mode Neeraj Upadhyay
2024-09-16 5:08 ` [RFC 6/6] apparmor: Switch labels to percpu ref managed mode Neeraj Upadhyay
5 siblings, 1 reply; 12+ messages in thread
From: Neeraj Upadhyay @ 2024-09-16 5:08 UTC (permalink / raw)
To: linux-kernel
Cc: john.johansen, paul, jmorris, serge, linux-security-module,
gautham.shenoy, Santosh.Shukla, Ananth.Narayan,
Raghavendra.KodsaraThimmappa, paulmck, boqun.feng, vinicius.gomes,
mjguzik, dennis, tj, cl, linux-mm, rcu
Extend the test to exercise runtime switching from managed
mode to other reinitable active modes.
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
lib/percpu-refcount-torture.c | 41 +++++++++++++++++++++++++++++++++--
lib/percpu-refcount.c | 12 +++++++++-
2 files changed, 50 insertions(+), 3 deletions(-)
diff --git a/lib/percpu-refcount-torture.c b/lib/percpu-refcount-torture.c
index 686f5a228b40..cb2700b16517 100644
--- a/lib/percpu-refcount-torture.c
+++ b/lib/percpu-refcount-torture.c
@@ -3,6 +3,7 @@
#include <linux/jiffies.h>
#include <linux/module.h>
#include <linux/moduleparam.h>
+#include <linux/mutex.h>
#include <linux/percpu-refcount.h>
#include <linux/torture.h>
@@ -59,6 +60,7 @@ static struct task_struct **busted_late_release_tasks;
static struct percpu_ref *refs;
static long *num_per_ref_users;
+static struct mutex *ref_switch_mutexes;
static atomic_t running;
static atomic_t *ref_running;
@@ -97,19 +99,36 @@ static int percpu_ref_manager_thread(void *data)
static int percpu_ref_test_thread(void *data)
{
struct percpu_ref *ref = (struct percpu_ref *)data;
+ DEFINE_TORTURE_RANDOM(rand);
+ int ref_idx = ref - refs;
+ int do_switch;
int i = 0;
percpu_ref_get(ref);
do {
percpu_ref_get(ref);
+ /* Perform checks once per 256 iterations */
+ do_switch = (torture_random(&rand) & 0xff);
udelay(delay_us);
+ if (do_switch) {
+ mutex_lock(&ref_switch_mutexes[ref_idx]);
+ percpu_ref_switch_to_unmanaged(ref);
+ udelay(delay_us);
+ percpu_ref_switch_to_atomic_sync(ref);
+ if (do_switch & 1)
+ percpu_ref_switch_to_percpu(ref);
+ udelay(delay_us);
+ percpu_ref_switch_to_managed(ref);
+ mutex_unlock(&ref_switch_mutexes[ref_idx]);
+ udelay(delay_us);
+ }
percpu_ref_put(ref);
stutter_wait("percpu_ref_test_thread");
i++;
} while (i < niterations);
- atomic_dec(&ref_running[ref - refs]);
+ atomic_dec(&ref_running[ref_idx]);
/* Order ref release with ref_running[ref_idx] == 0 */
smp_mb();
percpu_ref_put(ref);
@@ -213,6 +232,13 @@ static void percpu_ref_test_cleanup(void)
kfree(num_per_ref_users);
num_per_ref_users = NULL;
+ if (ref_switch_mutexes) {
+ for (i = 0; i < nrefs; i++)
+ mutex_destroy(&ref_switch_mutexes[i]);
+ kfree(ref_switch_mutexes);
+ ref_switch_mutexes = NULL;
+ }
+
if (refs) {
for (i = 0; i < nrefs; i++)
percpu_ref_exit(&refs[i]);
@@ -251,7 +277,8 @@ static int __init percpu_ref_torture_init(void)
goto init_err;
}
for (i = 0; i < nrefs; i++) {
- flags = torture_random(trsp) & 1 ? PERCPU_REF_INIT_ATOMIC : PERCPU_REF_REL_MANAGED;
+ flags = (torture_random(trsp) & 1) ? PERCPU_REF_INIT_ATOMIC :
+ PERCPU_REF_REL_MANAGED;
err = percpu_ref_init(&refs[i], percpu_ref_test_release,
flags, GFP_KERNEL);
if (err)
@@ -269,6 +296,16 @@ static int __init percpu_ref_torture_init(void)
for (i = 0; i < nrefs; i++)
num_per_ref_users[i] = 0;
+ ref_switch_mutexes = kcalloc(nrefs, sizeof(ref_switch_mutexes[0]), GFP_KERNEL);
+ if (!ref_switch_mutexes) {
+ TOROUT_ERRSTRING("out of memory");
+ err = -ENOMEM;
+ goto init_err;
+ }
+
+ for (i = 0; i < nrefs; i++)
+ mutex_init(&ref_switch_mutexes[i]);
+
ref_user_tasks = kcalloc(nusers, sizeof(ref_user_tasks[0]), GFP_KERNEL);
if (!ref_user_tasks) {
TOROUT_ERRSTRING("out of memory");
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index b79e36905aa4..4e0a453bd51f 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -723,6 +723,7 @@ EXPORT_SYMBOL_GPL(percpu_ref_test_is_percpu);
void percpu_ref_test_flush_release_work(void)
{
int max_flush = READ_ONCE(max_scan_count);
+ struct list_head *next;
int max_count = 1000;
/* Complete any executing release work */
@@ -738,8 +739,17 @@ void percpu_ref_test_flush_release_work(void)
/* max scan count update visible to work */
smp_mb();
flush_delayed_work(&percpu_ref_release_work);
- while (READ_ONCE(last_percpu_ref_node) != NULL && max_count--)
+
+ while (true) {
+ if (!max_count--)
+ break;
+ spin_lock(&percpu_ref_manage_lock);
+ next = next_percpu_ref_node;
+ spin_unlock(&percpu_ref_manage_lock);
+ if (list_is_head(next, &percpu_ref_manage_head))
+ break;
flush_delayed_work(&percpu_ref_release_work);
+ }
/* max scan count update visible to work */
smp_mb();
WRITE_ONCE(max_scan_count, max_flush);
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [RFC 4/6] percpu-refcount-torture: Extend test with runtime mode switches
2024-09-16 5:08 ` [RFC 4/6] percpu-refcount-torture: Extend test with runtime mode switches Neeraj Upadhyay
@ 2024-09-19 2:11 ` kernel test robot
0 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2024-09-19 2:11 UTC (permalink / raw)
To: Neeraj Upadhyay; +Cc: oe-kbuild-all
Hi Neeraj,
[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:
[auto build test WARNING on akpm-mm/mm-nonmm-unstable]
[also build test WARNING on linus/master dennis-percpu/for-next v6.11 next-20240918]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Neeraj-Upadhyay/percpu-refcount-Add-managed-mode-for-RCU-released-objects/20240916-131210
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-nonmm-unstable
patch link: https://lore.kernel.org/r/20240916050811.473556-5-Neeraj.Upadhyay%40amd.com
patch subject: [RFC 4/6] percpu-refcount-torture: Extend test with runtime mode switches
config: i386-allyesconfig (https://download.01.org/0day-ci/archive/20240919/202409190905.Cy8vmSdT-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240919/202409190905.Cy8vmSdT-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409190905.Cy8vmSdT-lkp@intel.com/
All warnings (new ones prefixed by >>):
lib/percpu-refcount-torture.c: In function 'percpu_ref_test_cleanup':
lib/percpu-refcount-torture.c:206:17: error: implicit declaration of function 'kfree' [-Werror=implicit-function-declaration]
206 | kfree(busted_late_release_tasks);
| ^~~~~
lib/percpu-refcount-torture.c: In function 'percpu_ref_torture_init':
lib/percpu-refcount-torture.c:273:16: error: implicit declaration of function 'kcalloc' [-Werror=implicit-function-declaration]
273 | refs = kcalloc(nrefs, sizeof(refs[0]), GFP_KERNEL);
| ^~~~~~~
lib/percpu-refcount-torture.c:273:14: warning: assignment to 'struct percpu_ref *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
273 | refs = kcalloc(nrefs, sizeof(refs[0]), GFP_KERNEL);
| ^
lib/percpu-refcount-torture.c:290:27: warning: assignment to 'long int *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
290 | num_per_ref_users = kcalloc(nrefs, sizeof(num_per_ref_users[0]), GFP_KERNEL);
| ^
>> lib/percpu-refcount-torture.c:299:28: warning: assignment to 'struct mutex *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
299 | ref_switch_mutexes = kcalloc(nrefs, sizeof(ref_switch_mutexes[0]), GFP_KERNEL);
| ^
lib/percpu-refcount-torture.c:309:24: warning: assignment to 'struct task_struct **' from 'int' makes pointer from integer without a cast [-Wint-conversion]
309 | ref_user_tasks = kcalloc(nusers, sizeof(ref_user_tasks[0]), GFP_KERNEL);
| ^
lib/percpu-refcount-torture.c:316:21: warning: assignment to 'atomic_t *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
316 | ref_running = kcalloc(nrefs, sizeof(ref_running[0]), GFP_KERNEL);
| ^
lib/percpu-refcount-torture.c:346:44: warning: assignment to 'struct task_struct **' from 'int' makes pointer from integer without a cast [-Wint-conversion]
346 | busted_early_release_tasks = kcalloc(nrefs,
| ^
lib/percpu-refcount-torture.c:363:43: warning: assignment to 'struct task_struct **' from 'int' makes pointer from integer without a cast [-Wint-conversion]
363 | busted_late_release_tasks = kcalloc(nrefs, sizeof(busted_late_release_tasks[0]),
| ^
cc1: some warnings being treated as errors
vim +299 lib/percpu-refcount-torture.c
256
257 static int __init percpu_ref_torture_init(void)
258 {
259 DEFINE_TORTURE_RANDOM(rand);
260 struct torture_random_state *trsp = &rand;
261 int flags;
262 int err;
263 int ref_idx;
264 int i;
265
266 if (!torture_init_begin("percpu-refcount", verbose))
267 return -EBUSY;
268
269 atomic_set(&running, nusers);
270 /* Order @running with later increment and decrement operations */
271 smp_mb();
272
> 273 refs = kcalloc(nrefs, sizeof(refs[0]), GFP_KERNEL);
274 if (!refs) {
275 TOROUT_ERRSTRING("out of memory");
276 err = -ENOMEM;
277 goto init_err;
278 }
279 for (i = 0; i < nrefs; i++) {
280 flags = (torture_random(trsp) & 1) ? PERCPU_REF_INIT_ATOMIC :
281 PERCPU_REF_REL_MANAGED;
282 err = percpu_ref_init(&refs[i], percpu_ref_test_release,
283 flags, GFP_KERNEL);
284 if (err)
285 goto init_err;
286 if (!(flags & PERCPU_REF_REL_MANAGED))
287 percpu_ref_switch_to_managed(&refs[i]);
288 }
289
290 num_per_ref_users = kcalloc(nrefs, sizeof(num_per_ref_users[0]), GFP_KERNEL);
291 if (!num_per_ref_users) {
292 TOROUT_ERRSTRING("out of memory");
293 err = -ENOMEM;
294 goto init_err;
295 }
296 for (i = 0; i < nrefs; i++)
297 num_per_ref_users[i] = 0;
298
> 299 ref_switch_mutexes = kcalloc(nrefs, sizeof(ref_switch_mutexes[0]), GFP_KERNEL);
300 if (!ref_switch_mutexes) {
301 TOROUT_ERRSTRING("out of memory");
302 err = -ENOMEM;
303 goto init_err;
304 }
305
306 for (i = 0; i < nrefs; i++)
307 mutex_init(&ref_switch_mutexes[i]);
308
309 ref_user_tasks = kcalloc(nusers, sizeof(ref_user_tasks[0]), GFP_KERNEL);
310 if (!ref_user_tasks) {
311 TOROUT_ERRSTRING("out of memory");
312 err = -ENOMEM;
313 goto init_err;
314 }
315
316 ref_running = kcalloc(nrefs, sizeof(ref_running[0]), GFP_KERNEL);
317 if (!ref_running) {
318 TOROUT_ERRSTRING("out of memory");
319 err = -ENOMEM;
320 goto init_err;
321 }
322
323 for (i = 0; i < nusers; i++) {
324 ref_idx = torture_random(trsp) % nrefs;
325 atomic_inc(&ref_running[ref_idx]);
326 num_per_ref_users[ref_idx]++;
327 /* Order increments with subquent reads */
328 smp_mb();
329 err = torture_create_kthread(percpu_ref_test_thread,
330 &refs[ref_idx], ref_user_tasks[i]);
331 if (torture_init_error(err))
332 goto init_err;
333 }
334
335 err = torture_create_kthread(percpu_ref_manager_thread, NULL, ref_manager_task);
336 if (torture_init_error(err))
337 goto init_err;
338
339 /* Drop initial reference, after test threads have started running */
340 udelay(1);
341 for (i = 0; i < nrefs; i++)
342 percpu_ref_put(&refs[i]);
343
344
345 if (busted_early_ref_release) {
346 busted_early_release_tasks = kcalloc(nrefs,
347 sizeof(busted_early_release_tasks[0]),
348 GFP_KERNEL);
349 if (!busted_early_release_tasks) {
350 TOROUT_ERRSTRING("out of memory");
351 err = -ENOMEM;
352 goto init_err;
353 }
354 for (i = 0; i < nrefs; i++) {
355 err = torture_create_kthread(percpu_ref_busted_early_thread,
356 &refs[i], busted_early_release_tasks[i]);
357 if (torture_init_error(err))
358 goto init_err;
359 }
360 }
361
362 if (busted_late_ref_release) {
363 busted_late_release_tasks = kcalloc(nrefs, sizeof(busted_late_release_tasks[0]),
364 GFP_KERNEL);
365 if (!busted_late_release_tasks) {
366 TOROUT_ERRSTRING("out of memory");
367 err = -ENOMEM;
368 goto init_err;
369 }
370 for (i = 0; i < nrefs; i++) {
371 err = torture_create_kthread(percpu_ref_busted_late_thread,
372 &refs[i], busted_late_release_tasks[i]);
373 if (torture_init_error(err))
374 goto init_err;
375 }
376 }
377 if (stutter) {
378 err = torture_stutter_init(stutter, stutter);
379 if (torture_init_error(err))
380 goto init_err;
381 }
382
383 err = torture_onoff_init(onoff_holdoff * HZ, onoff_interval, NULL);
384 if (torture_init_error(err))
385 goto init_err;
386
387 torture_init_end();
388 return 0;
389 init_err:
390 torture_init_end();
391 percpu_ref_test_cleanup();
392 return err;
393 }
394
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 12+ messages in thread
* [RFC 5/6] apparmor: Switch labels to percpu refcount in atomic mode
2024-09-16 5:08 [RFC 0/6] Managed Percpu Refcount Neeraj Upadhyay
` (3 preceding siblings ...)
2024-09-16 5:08 ` [RFC 4/6] percpu-refcount-torture: Extend test with runtime mode switches Neeraj Upadhyay
@ 2024-09-16 5:08 ` Neeraj Upadhyay
2024-09-16 5:08 ` [RFC 6/6] apparmor: Switch labels to percpu ref managed mode Neeraj Upadhyay
5 siblings, 0 replies; 12+ messages in thread
From: Neeraj Upadhyay @ 2024-09-16 5:08 UTC (permalink / raw)
To: linux-kernel
Cc: john.johansen, paul, jmorris, serge, linux-security-module,
gautham.shenoy, Santosh.Shukla, Ananth.Narayan,
Raghavendra.KodsaraThimmappa, paulmck, boqun.feng, vinicius.gomes,
mjguzik, dennis, tj, cl, linux-mm, rcu
In preparation of using percpu refcount for labels, replace label kref
with percpu ref. The percpu ref is initialized to atomic mode, as
using percpu mode requires tracking ref kill points. As the atomic
counter is in a different cacheline now, rearrange some of the fields
in aa_label struct - flags, proxy; to optimize some of the fast paths
for unconfined labels.
In addition to the requirement to cleanup the percpu ref using
percpu_ref_exit() in label destruction path, other potential impact
from this patch could be:
- Increase in memory requirement (for per cpu counters) for each label.
- Displacement of aa_label struct members to different cacheline, as
percpu ref takes 2 pointers space.
- Moving of the atomic counter outside of the cacheline of the aa_label
struct.
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
security/apparmor/include/label.h | 16 ++++++++--------
security/apparmor/include/policy.h | 8 ++++----
security/apparmor/label.c | 11 ++++++++---
3 files changed, 20 insertions(+), 15 deletions(-)
diff --git a/security/apparmor/include/label.h b/security/apparmor/include/label.h
index 2a72e6b17d68..4b29a4679c74 100644
--- a/security/apparmor/include/label.h
+++ b/security/apparmor/include/label.h
@@ -121,12 +121,12 @@ struct label_it {
* @ent: set of profiles for label, actual size determined by @size
*/
struct aa_label {
- struct kref count;
+ struct percpu_ref count;
+ long flags;
+ struct aa_proxy *proxy;
struct rb_node node;
struct rcu_head rcu;
- struct aa_proxy *proxy;
__counted char *hname;
- long flags;
u32 secid;
int size;
struct aa_profile *vec[];
@@ -276,7 +276,7 @@ void __aa_labelset_update_subtree(struct aa_ns *ns);
void aa_label_destroy(struct aa_label *label);
void aa_label_free(struct aa_label *label);
-void aa_label_kref(struct kref *kref);
+void aa_label_percpu_ref(struct percpu_ref *ref);
bool aa_label_init(struct aa_label *label, int size, gfp_t gfp);
struct aa_label *aa_label_alloc(int size, struct aa_proxy *proxy, gfp_t gfp);
@@ -373,7 +373,7 @@ int aa_label_match(struct aa_profile *profile, struct aa_ruleset *rules,
*/
static inline struct aa_label *__aa_get_label(struct aa_label *l)
{
- if (l && kref_get_unless_zero(&l->count))
+ if (l && percpu_ref_tryget(&l->count))
return l;
return NULL;
@@ -382,7 +382,7 @@ static inline struct aa_label *__aa_get_label(struct aa_label *l)
static inline struct aa_label *aa_get_label(struct aa_label *l)
{
if (l)
- kref_get(&(l->count));
+ percpu_ref_get(&(l->count));
return l;
}
@@ -402,7 +402,7 @@ static inline struct aa_label *aa_get_label_rcu(struct aa_label __rcu **l)
rcu_read_lock();
do {
c = rcu_dereference(*l);
- } while (c && !kref_get_unless_zero(&c->count));
+ } while (c && !percpu_ref_tryget(&c->count));
rcu_read_unlock();
return c;
@@ -442,7 +442,7 @@ static inline struct aa_label *aa_get_newest_label(struct aa_label *l)
static inline void aa_put_label(struct aa_label *l)
{
if (l)
- kref_put(&l->count, aa_label_kref);
+ percpu_ref_put(&l->count);
}
diff --git a/security/apparmor/include/policy.h b/security/apparmor/include/policy.h
index 75088cc310b6..5849b6b94cea 100644
--- a/security/apparmor/include/policy.h
+++ b/security/apparmor/include/policy.h
@@ -329,7 +329,7 @@ static inline aa_state_t ANY_RULE_MEDIATES(struct list_head *head,
static inline struct aa_profile *aa_get_profile(struct aa_profile *p)
{
if (p)
- kref_get(&(p->label.count));
+ percpu_ref_get(&(p->label.count));
return p;
}
@@ -343,7 +343,7 @@ static inline struct aa_profile *aa_get_profile(struct aa_profile *p)
*/
static inline struct aa_profile *aa_get_profile_not0(struct aa_profile *p)
{
- if (p && kref_get_unless_zero(&p->label.count))
+ if (p && percpu_ref_tryget(&p->label.count))
return p;
return NULL;
@@ -363,7 +363,7 @@ static inline struct aa_profile *aa_get_profile_rcu(struct aa_profile __rcu **p)
rcu_read_lock();
do {
c = rcu_dereference(*p);
- } while (c && !kref_get_unless_zero(&c->label.count));
+ } while (c && !percpu_ref_tryget(&c->label.count));
rcu_read_unlock();
return c;
@@ -376,7 +376,7 @@ static inline struct aa_profile *aa_get_profile_rcu(struct aa_profile __rcu **p)
static inline void aa_put_profile(struct aa_profile *p)
{
if (p)
- kref_put(&p->label.count, aa_label_kref);
+ percpu_ref_put(&p->label.count);
}
static inline int AUDIT_MODE(struct aa_profile *profile)
diff --git a/security/apparmor/label.c b/security/apparmor/label.c
index c71e4615dd46..aa9e6eac3ecc 100644
--- a/security/apparmor/label.c
+++ b/security/apparmor/label.c
@@ -336,6 +336,7 @@ void aa_label_destroy(struct aa_label *label)
rcu_assign_pointer(label->proxy->label, NULL);
aa_put_proxy(label->proxy);
}
+ percpu_ref_exit(&label->count);
aa_free_secid(label->secid);
label->proxy = (struct aa_proxy *) PROXY_POISON + 1;
@@ -369,9 +370,9 @@ static void label_free_rcu(struct rcu_head *head)
label_free_switch(label);
}
-void aa_label_kref(struct kref *kref)
+void aa_label_percpu_ref(struct percpu_ref *ref)
{
- struct aa_label *label = container_of(kref, struct aa_label, count);
+ struct aa_label *label = container_of(ref, struct aa_label, count);
struct aa_ns *ns = labels_ns(label);
if (!ns) {
@@ -408,7 +409,11 @@ bool aa_label_init(struct aa_label *label, int size, gfp_t gfp)
label->size = size; /* doesn't include null */
label->vec[size] = NULL; /* null terminate */
- kref_init(&label->count);
+ if (percpu_ref_init(&label->count, aa_label_percpu_ref, PERCPU_REF_INIT_ATOMIC, gfp)) {
+ aa_free_secid(label->secid);
+ return false;
+ }
+
RB_CLEAR_NODE(&label->node);
return true;
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* [RFC 6/6] apparmor: Switch labels to percpu ref managed mode
2024-09-16 5:08 [RFC 0/6] Managed Percpu Refcount Neeraj Upadhyay
` (4 preceding siblings ...)
2024-09-16 5:08 ` [RFC 5/6] apparmor: Switch labels to percpu refcount in atomic mode Neeraj Upadhyay
@ 2024-09-16 5:08 ` Neeraj Upadhyay
2024-09-18 5:44 ` kernel test robot
5 siblings, 1 reply; 12+ messages in thread
From: Neeraj Upadhyay @ 2024-09-16 5:08 UTC (permalink / raw)
To: linux-kernel
Cc: john.johansen, paul, jmorris, serge, linux-security-module,
gautham.shenoy, Santosh.Shukla, Ananth.Narayan,
Raghavendra.KodsaraThimmappa, paulmck, boqun.feng, vinicius.gomes,
mjguzik, dennis, tj, cl, linux-mm, rcu
Nginx performance testing with Apparmor enabled (with Nginx running in
unconfined profile), on kernel versions 6.1 and 6.5 show significant
drop in throughput scalability when Nginx workers are scaled to use
higher number of CPUs across various L3 cache domains.
Below is one sample data on the throughput scalability loss, based on
results on AMD Zen4 system with 96 CPUs with SMT core count 2:
Config Cache Domains apparmor=off apparmor=on
scaling eff (%) scaling eff (%)
8C16T 1 100% 100%
16C32T 2 95% 94%
24C48T 3 94% 93%
48C96T 6 92% 88%
96C192T 12 85% 68%
There is a significant drop in scaling efficiency for 96 cores/192 SMT
threads.
Perf tool shows most of the contention coming from below places:
6.56% nginx [kernel.vmlinux] [k] apparmor_current_getsecid_subj
6.22% nginx [kernel.vmlinux] [k] apparmor_file_open
The majority of the CPU cycles is found to be due to memory contention
in atomic_fetch_add and atomic_fetch_sub operations from kref_get() and
kref_put() operations on AppArmor labels.
A part of the contention was fixed with commit 2516fde1fa00 ("apparmor:
Optimize retrieving current task secid"). After including this commit, the
scaling efficiency improved to below:
Config Cache Domains apparmor=on apparmor=on (patched)
scaling eff (%) scaling eff (%)
8C16T 1 100% 100%
16C32T 2 97% 93%
24C48T 3 94% 92%
48C96T 6 88% 88%
96C192T 12 65% 79%
However, the scaling efficiency impact is still significant even after
including the commit. Also, the performance impact is even higher for
>192 CPUs. In addition, the memory contention impact would increase
when there is a high frequency of label update operations and labels
are marked stale more frequently.
Use the new percpu managed mode for tracking release of all Apparmor
labels. Using percpu refcount for Apparmor label's refcounting improves
throughput scalability for Nginx:
Config Cache Domains apparmor=on (percpuref)
scaling eff (%)
8C16T 1 100%
16C32T 2 96%
24C48T 3 94%
48C96T 6 93%
96C192T 12 90%
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
The apparmor_file_open() refcount contention has been resolved recently
with commit f4fee216df7d ("apparmor: try to avoid refing the label in
apparmor_file_open"). I have posted this series to get feedback on the
approach to improve refcount scalability within apparmor subsystem.
security/apparmor/label.c | 1 +
security/apparmor/policy_ns.c | 2 ++
2 files changed, 3 insertions(+)
diff --git a/security/apparmor/label.c b/security/apparmor/label.c
index aa9e6eac3ecc..016a45a180b1 100644
--- a/security/apparmor/label.c
+++ b/security/apparmor/label.c
@@ -710,6 +710,7 @@ static struct aa_label *__label_insert(struct aa_labelset *ls,
rb_link_node(&label->node, parent, new);
rb_insert_color(&label->node, &ls->root);
label->flags |= FLAG_IN_TREE;
+ percpu_ref_switch_to_managed(&label->count);
return aa_get_label(label);
}
diff --git a/security/apparmor/policy_ns.c b/security/apparmor/policy_ns.c
index 1f02cfe1d974..18eb58b68a60 100644
--- a/security/apparmor/policy_ns.c
+++ b/security/apparmor/policy_ns.c
@@ -124,6 +124,7 @@ static struct aa_ns *alloc_ns(const char *prefix, const char *name)
goto fail_unconfined;
/* ns and ns->unconfined share ns->unconfined refcount */
ns->unconfined->ns = ns;
+ percpu_ref_switch_to_managed(&ns->unconfined->label.count);
atomic_set(&ns->uniq_null, 0);
@@ -377,6 +378,7 @@ int __init aa_alloc_root_ns(void)
}
kernel_t = &kernel_p->label;
root_ns->unconfined->ns = aa_get_ns(root_ns);
+ percpu_ref_switch_to_managed(&root_ns->unconfined->label.count);
return 0;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [RFC 6/6] apparmor: Switch labels to percpu ref managed mode
2024-09-16 5:08 ` [RFC 6/6] apparmor: Switch labels to percpu ref managed mode Neeraj Upadhyay
@ 2024-09-18 5:44 ` kernel test robot
0 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2024-09-18 5:44 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: oe-lkp, lkp, apparmor, linux-kernel, john.johansen, paul, jmorris,
serge, linux-security-module, gautham.shenoy, Santosh.Shukla,
Ananth.Narayan, Raghavendra.KodsaraThimmappa, paulmck, boqun.feng,
vinicius.gomes, mjguzik, dennis, tj, cl, linux-mm, rcu,
oliver.sang
Hello,
kernel test robot noticed "WARNING:at_lib/percpu-refcount.c:#__percpu_ref_switch_to_managed" on:
commit: 59b177cbdc4908a728329b8eec742969a2285979 ("[RFC 6/6] apparmor: Switch labels to percpu ref managed mode")
url: https://github.com/intel-lab-lkp/linux/commits/Neeraj-Upadhyay/percpu-refcount-Add-managed-mode-for-RCU-released-objects/20240916-131210
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-nonmm-unstable
patch link: https://lore.kernel.org/all/20240916050811.473556-7-Neeraj.Upadhyay@amd.com/
patch subject: [RFC 6/6] apparmor: Switch labels to percpu ref managed mode
in testcase: boot
compiler: gcc-12
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
(please refer to attached dmesg/kmsg for entire log/backtrace)
+------------------------------------------------------------------+------------+------------+
| | 124f137c55 | 59b177cbdc |
+------------------------------------------------------------------+------------+------------+
| WARNING:at_lib/percpu-refcount.c:#__percpu_ref_switch_to_managed | 0 | 13 |
| RIP:__percpu_ref_switch_to_managed | 0 | 13 |
+------------------------------------------------------------------+------------+------------+
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202409181358.c07681be-oliver.sang@intel.com
[ 13.041399][ T0] ------------[ cut here ]------------
[ 13.042302][ T0] Percpu ref is already managed
[ 13.042302][ T0] WARNING: CPU: 0 PID: 0 at lib/percpu-refcount.c:151 __percpu_ref_switch_to_managed (lib/percpu-refcount.c:151 (discriminator 3))
[ 13.042302][ T0] Modules linked in:
[ 13.042302][ T0] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.11.0-rc6-00143-g59b177cbdc49 #12
[ 13.042302][ T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 13.042302][ T0] RIP: 0010:__percpu_ref_switch_to_managed (lib/percpu-refcount.c:151 (discriminator 3))
[ 13.042302][ T0] Code: 03 75 58 65 48 ff 08 e8 c1 7c ee fe eb b1 80 3d 5d a2 ca 03 00 75 c5 48 c7 c7 c0 61 8f 91 c6 05 4d a2 ca 03 01 e8 03 af ce fe <0f> 0b eb ae 31 f6 48 89 df e8 05 f7 ff ff e9 54 fe ff ff e8 4b aa
All code
========
0: 03 75 58 add 0x58(%rbp),%esi
3: 65 48 ff 08 decq %gs:(%rax)
7: e8 c1 7c ee fe callq 0xfffffffffeee7ccd
c: eb b1 jmp 0xffffffffffffffbf
e: 80 3d 5d a2 ca 03 00 cmpb $0x0,0x3caa25d(%rip) # 0x3caa272
15: 75 c5 jne 0xffffffffffffffdc
17: 48 c7 c7 c0 61 8f 91 mov $0xffffffff918f61c0,%rdi
1e: c6 05 4d a2 ca 03 01 movb $0x1,0x3caa24d(%rip) # 0x3caa272
25: e8 03 af ce fe callq 0xfffffffffeceaf2d
2a:* 0f 0b ud2 <-- trapping instruction
2c: eb ae jmp 0xffffffffffffffdc
2e: 31 f6 xor %esi,%esi
30: 48 89 df mov %rbx,%rdi
33: e8 05 f7 ff ff callq 0xfffffffffffff73d
38: e9 54 fe ff ff jmpq 0xfffffffffffffe91
3d: e8 .byte 0xe8
3e: 4b aa rex.WXB stos %al,%es:(%rdi)
Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: eb ae jmp 0xffffffffffffffb2
4: 31 f6 xor %esi,%esi
6: 48 89 df mov %rbx,%rdi
9: e8 05 f7 ff ff callq 0xfffffffffffff713
e: e9 54 fe ff ff jmpq 0xfffffffffffffe67
13: e8 .byte 0xe8
14: 4b aa rex.WXB stos %al,%es:(%rdi)
[ 13.042302][ T0] RSP: 0000:ffffffff92407e00 EFLAGS: 00010082
[ 13.042302][ T0] RAX: 0000000000000000 RBX: ffff8881008a6500 RCX: 1ffffffff24a3780
[ 13.042302][ T0] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001
[ 13.042302][ T0] RBP: ffff8881008a6508 R08: 0000000000000000 R09: fffffbfff24a3780
[ 13.042302][ T0] R10: ffffffff9251bc03 R11: 0000000000000001 R12: ffff8881001f3500
[ 13.042302][ T0] R13: ffff8881001f3500 R14: 0000000000000005 R15: 00000000000147b0
[ 13.042302][ T0] FS: 0000000000000000(0000) GS:ffff8883a8400000(0000) knlGS:0000000000000000
[ 13.042302][ T0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 13.042302][ T0] CR2: ffff88843ffff000 CR3: 00000003b6e62000 CR4: 00000000000006f0
[ 13.042302][ T0] Call Trace:
[ 13.042302][ T0] <TASK>
[ 13.042302][ T0] ? __warn (kernel/panic.c:741)
[ 13.042302][ T0] ? __percpu_ref_switch_to_managed (lib/percpu-refcount.c:151 (discriminator 3))
[ 13.042302][ T0] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 13.042302][ T0] ? handle_bug (arch/x86/kernel/traps.c:239)
[ 13.042302][ T0] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1))
[ 13.042302][ T0] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621)
[ 13.042302][ T0] ? __percpu_ref_switch_to_managed (lib/percpu-refcount.c:151 (discriminator 3))
[ 13.042302][ T0] percpu_ref_switch_to_managed (include/linux/spinlock.h:406 lib/percpu-refcount.c:182)
[ 13.042302][ T0] aa_alloc_root_ns (security/apparmor/policy_ns.c:383)
[ 13.042302][ T0] ? aa_setup_dfa_engine (security/apparmor/lsm.c:2194)
[ 13.042302][ T0] apparmor_init (security/apparmor/lsm.c:2235)
[ 13.042302][ T0] initialize_lsm (security/security.c:263 (discriminator 3))
[ 13.042302][ T0] ordered_lsm_init (security/security.c:422 (discriminator 3))
[ 13.042302][ T0] security_init (security/security.c:475)
[ 13.042302][ T0] start_kernel (init/main.c:1085)
[ 13.042302][ T0] x86_64_start_reservations (arch/x86/kernel/head64.c:495)
[ 13.042302][ T0] x86_64_start_kernel (arch/x86/kernel/head64.c:437 (discriminator 17))
[ 13.042302][ T0] common_startup_64 (arch/x86/kernel/head_64.S:421)
[ 13.042302][ T0] </TASK>
[ 13.042302][ T0] ---[ end trace 0000000000000000 ]---
[ 13.044823][ T0] AppArmor: AppArmor initialized
[ 13.062176][ T0] Mount-cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
[ 13.065846][ T0] Mountpoint-cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240918/202409181358.c07681be-oliver.sang@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 12+ messages in thread