From: Uladzislau Rezki <urezki@gmail.com>
To: Joel Fernandes <joel@joelfernandes.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
"Paul E. McKenney" <paulmck@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
rcu@vger.kernel.org, Josh Triplett <josh@joshtriplett.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Lai Jiangshan <jiangshanlai@gmail.com>,
Thomas Gleixner <tglx@linutronix.de>,
Mike Galbraith <efault@gmx.de>,
urezki@gmail.com
Subject: Re: [PATCH 1/3] rcu: Use static initializer for krc.lock
Date: Fri, 17 Apr 2020 10:47:07 +0200 [thread overview]
Message-ID: <20200417084707.GA13555@pc636> (raw)
In-Reply-To: <20200417030515.GE176663@google.com>
On Thu, Apr 16, 2020 at 11:05:15PM -0400, Joel Fernandes wrote:
> On Thu, Apr 16, 2020 at 11:34:44PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2020-04-16 14:00:57 [-0700], Paul E. McKenney wrote:
> > >
> > > We might need different calling-context restrictions for the two variants
> > > of kfree_rcu(). And we might need to come up with some sort of lockdep
> > > check for "safe to use normal spinlock in -rt".
> >
> > Oh. We do have this already, it is called CONFIG_PROVE_RAW_LOCK_NESTING.
> > This one will scream if you do
> > raw_spin_lock();
> > spin_lock();
> >
> > Sadly, as of today, there is code triggering this which needs to be
> > addressed first (but it is one list of things to do).
> >
> > Given the thread so far, is it okay if I repost the series with
> > migrate_disable() instead of accepting a possible migration before
> > grabbing the lock? I would prefer to avoid the extra RT case (avoiding
> > memory allocations in a possible atomic context) until we get there.
>
> I prefer something like the following to make it possible to invoke
> kfree_rcu() from atomic context considering call_rcu() is already callable
> from such contexts. Thoughts?
>
> (Only build tested)
> ---8<-----------------------
>
> From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> Subject: [PATCH] rcu/tree: Avoid allocating in non-preemptible context for
> PREEMPT_RT kernels
>
> Per recent discussions, kfree_rcu() is a low-level facility which should be
> callable in atomic context (raw spinlock sections, IRQ disable sections etc).
>
> However, it depends on page allocation which acquires sleeping locks on
> PREEMPT_RT.
>
> In order to support all usecases, avoid the allocation of pages for
> PREEMPT_RT. The page allocation is just an optimization which does not
> break functionality. Further, in future patches the pages will be
> pre-allocated reducing the likelihood that page allocations will be
> needed.
>
> We also convert the spinlock_t to raw_spinlock_t so that does not sleep
> in PREEMPT_RT's raw atomic critical sections.
>
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
> kernel/rcu/tree.c | 42 +++++++++++++++++++++++++-----------------
> 1 file changed, 25 insertions(+), 17 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index f288477ee1c26..ba831712fb307 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2905,7 +2905,7 @@ struct kfree_rcu_cpu {
> struct kfree_rcu_bulk_data *bhead;
> struct kfree_rcu_bulk_data *bcached;
> struct kfree_rcu_cpu_work krw_arr[KFREE_N_BATCHES];
> - spinlock_t lock;
> + raw_spinlock_t lock;
> struct delayed_work monitor_work;
> bool monitor_todo;
> bool initialized;
> @@ -2939,12 +2939,12 @@ static void kfree_rcu_work(struct work_struct *work)
> krwp = container_of(to_rcu_work(work),
> struct kfree_rcu_cpu_work, rcu_work);
> krcp = krwp->krcp;
> - spin_lock_irqsave(&krcp->lock, flags);
> + raw_spin_lock_irqsave(&krcp->lock, flags);
> head = krwp->head_free;
> krwp->head_free = NULL;
> bhead = krwp->bhead_free;
> krwp->bhead_free = NULL;
> - spin_unlock_irqrestore(&krcp->lock, flags);
> + raw_spin_unlock_irqrestore(&krcp->lock, flags);
>
> /* "bhead" is now private, so traverse locklessly. */
> for (; bhead; bhead = bnext) {
> @@ -3047,14 +3047,14 @@ static inline void kfree_rcu_drain_unlock(struct kfree_rcu_cpu *krcp,
> krcp->monitor_todo = false;
> if (queue_kfree_rcu_work(krcp)) {
> // Success! Our job is done here.
> - spin_unlock_irqrestore(&krcp->lock, flags);
> + raw_spin_unlock_irqrestore(&krcp->lock, flags);
> return;
> }
>
> // Previous RCU batch still in progress, try again later.
> krcp->monitor_todo = true;
> schedule_delayed_work(&krcp->monitor_work, KFREE_DRAIN_JIFFIES);
> - spin_unlock_irqrestore(&krcp->lock, flags);
> + raw_spin_unlock_irqrestore(&krcp->lock, flags);
> }
>
> /*
> @@ -3067,16 +3067,16 @@ static void kfree_rcu_monitor(struct work_struct *work)
> struct kfree_rcu_cpu *krcp = container_of(work, struct kfree_rcu_cpu,
> monitor_work.work);
>
> - spin_lock_irqsave(&krcp->lock, flags);
> + raw_spin_lock_irqsave(&krcp->lock, flags);
> if (krcp->monitor_todo)
> kfree_rcu_drain_unlock(krcp, flags);
> else
> - spin_unlock_irqrestore(&krcp->lock, flags);
> + raw_spin_unlock_irqrestore(&krcp->lock, flags);
> }
>
> static inline bool
> kfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp,
> - struct rcu_head *head, rcu_callback_t func)
> + struct rcu_head *head, rcu_callback_t func, bool alloc)
> {
> struct kfree_rcu_bulk_data *bnode;
>
> @@ -3092,6 +3092,10 @@ kfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp,
> if (!bnode) {
> WARN_ON_ONCE(sizeof(struct kfree_rcu_bulk_data) > PAGE_SIZE);
>
> + /* If allocation is not allowed, don't do it. */
> + if (!alloc)
> + return false;
> +
> bnode = (struct kfree_rcu_bulk_data *)
> __get_free_page(GFP_NOWAIT | __GFP_NOWARN);
> }
> @@ -3138,11 +3142,15 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> {
> unsigned long flags;
> struct kfree_rcu_cpu *krcp;
> + bool alloc = true;
> +
> + if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
> + alloc = false;
>
> local_irq_save(flags); // For safely calling this_cpu_ptr().
> krcp = this_cpu_ptr(&krc);
> if (krcp->initialized)
> - spin_lock(&krcp->lock);
> + raw_spin_lock(&krcp->lock);
>
> // Queue the object but don't yet schedule the batch.
> if (debug_rcu_head_queue(head)) {
> @@ -3156,7 +3164,7 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> * Under high memory pressure GFP_NOWAIT can fail,
> * in that case the emergency path is maintained.
> */
> - if (unlikely(!kfree_call_rcu_add_ptr_to_bulk(krcp, head, func))) {
> + if (unlikely(!kfree_call_rcu_add_ptr_to_bulk(krcp, head, func, alloc))) {
> head->func = func;
> head->next = krcp->head;
> krcp->head = head;
> @@ -3173,7 +3181,7 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>
> unlock_return:
> if (krcp->initialized)
> - spin_unlock(&krcp->lock);
> + raw_spin_unlock(&krcp->lock);
> local_irq_restore(flags);
> }
> EXPORT_SYMBOL_GPL(kfree_call_rcu);
> @@ -3205,11 +3213,11 @@ kfree_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
>
> count = krcp->count;
> - spin_lock_irqsave(&krcp->lock, flags);
> + raw_spin_lock_irqsave(&krcp->lock, flags);
> if (krcp->monitor_todo)
> kfree_rcu_drain_unlock(krcp, flags);
> else
> - spin_unlock_irqrestore(&krcp->lock, flags);
> + raw_spin_unlock_irqrestore(&krcp->lock, flags);
>
> sc->nr_to_scan -= count;
> freed += count;
> @@ -3236,15 +3244,15 @@ void __init kfree_rcu_scheduler_running(void)
> for_each_online_cpu(cpu) {
> struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
>
> - spin_lock_irqsave(&krcp->lock, flags);
> + raw_spin_lock_irqsave(&krcp->lock, flags);
> if (!krcp->head || krcp->monitor_todo) {
> - spin_unlock_irqrestore(&krcp->lock, flags);
> + raw_spin_unlock_irqrestore(&krcp->lock, flags);
> continue;
> }
> krcp->monitor_todo = true;
> schedule_delayed_work_on(cpu, &krcp->monitor_work,
> KFREE_DRAIN_JIFFIES);
> - spin_unlock_irqrestore(&krcp->lock, flags);
> + raw_spin_unlock_irqrestore(&krcp->lock, flags);
> }
> }
>
> @@ -4140,7 +4148,7 @@ static void __init kfree_rcu_batch_init(void)
> for_each_possible_cpu(cpu) {
> struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
>
> - spin_lock_init(&krcp->lock);
> + raw_spin_lock_init(&krcp->lock);
> for (i = 0; i < KFREE_N_BATCHES; i++) {
> INIT_RCU_WORK(&krcp->krw_arr[i].rcu_work, kfree_rcu_work);
> krcp->krw_arr[i].krcp = krcp;
I have the same view on it how to handle it in PREEMPT_RT. Basically if
we can store the pointer in array we do it, if a current context is not
preemptible just build the list using rcu_head and queuing it for
farther process. If context is preemptible we will utilize our "array
pointer" approach, so the performance optimization will be in place
on CONFIG_PREEMPT_RT kernel.
I think this is an easiest way of making it PREEMPT_RT friendly. Also
we need to add static initialize of "raw spin lock". Split this patch
to:
- converting to raw spinlocks;
- make it statically initialized;
- bypass the page allocator if RT and not preemptable.
--
Vlad Rezki
next prev parent reply other threads:[~2020-04-17 8:47 UTC|newest]
Thread overview: 85+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-15 16:00 [PATCH 0/3] rcu: Static initializer + misc Sebastian Andrzej Siewior
2020-04-15 16:00 ` [PATCH 1/3] rcu: Use static initializer for krc.lock Sebastian Andrzej Siewior
2020-04-16 14:42 ` Joel Fernandes
2020-04-16 15:01 ` Uladzislau Rezki
2020-04-16 15:20 ` Sebastian Andrzej Siewior
2020-04-16 15:38 ` Uladzislau Rezki
2020-04-16 15:46 ` Sebastian Andrzej Siewior
2020-04-16 16:01 ` Uladzislau Rezki
2020-04-16 16:11 ` Sebastian Andrzej Siewior
2020-04-16 16:18 ` Uladzislau Rezki
2020-04-16 16:33 ` Sebastian Andrzej Siewior
2020-04-16 17:29 ` Paul E. McKenney
2020-04-16 18:23 ` Sebastian Andrzej Siewior
2020-04-16 18:29 ` Paul E. McKenney
2020-04-16 18:43 ` Joel Fernandes
2020-04-16 20:56 ` Sebastian Andrzej Siewior
2020-04-16 21:04 ` Joel Fernandes
2020-04-16 21:07 ` Sebastian Andrzej Siewior
2020-04-16 18:40 ` Steven Rostedt
2020-04-16 18:53 ` Joel Fernandes
2020-04-16 19:24 ` Steven Rostedt
2020-04-16 20:41 ` Joel Fernandes
2020-04-16 21:05 ` Sebastian Andrzej Siewior
2020-04-16 17:28 ` Paul E. McKenney
2020-04-16 15:18 ` Sebastian Andrzej Siewior
2020-04-16 18:41 ` Joel Fernandes
2020-04-16 18:59 ` Joel Fernandes
2020-04-16 19:26 ` Steven Rostedt
2020-04-16 19:53 ` Paul E. McKenney
2020-04-16 20:05 ` Uladzislau Rezki
2020-04-16 20:25 ` Paul E. McKenney
2020-04-16 21:02 ` Joel Fernandes
2020-04-16 21:18 ` Uladzislau Rezki
2020-04-16 21:26 ` Uladzislau Rezki
2020-04-16 21:28 ` Sebastian Andrzej Siewior
2020-04-16 20:36 ` Joel Fernandes
2020-04-16 21:00 ` Paul E. McKenney
2020-04-16 21:34 ` Sebastian Andrzej Siewior
2020-04-17 3:05 ` Joel Fernandes
2020-04-17 8:47 ` Uladzislau Rezki [this message]
2020-04-17 15:04 ` Sebastian Andrzej Siewior
2020-04-17 18:26 ` Joel Fernandes
2020-04-17 18:54 ` Paul E. McKenney
2020-04-18 12:37 ` Uladzislau Rezki
2020-04-19 14:58 ` Paul E. McKenney
2020-04-20 0:27 ` Joel Fernandes
2020-04-20 1:17 ` Joel Fernandes
2020-04-20 1:44 ` Paul E. McKenney
2020-04-20 12:13 ` Uladzislau Rezki
2020-04-20 12:36 ` joel
2020-04-20 13:00 ` Uladzislau Rezki
2020-04-20 13:26 ` Paul E. McKenney
2020-04-20 16:08 ` Uladzislau Rezki
2020-04-20 16:25 ` Paul E. McKenney
2020-04-20 16:29 ` Uladzislau Rezki
2020-04-20 16:46 ` Paul E. McKenney
2020-04-20 16:59 ` Uladzislau Rezki
2020-04-20 17:21 ` Paul E. McKenney
2020-04-20 17:40 ` Uladzislau Rezki
2020-04-20 17:57 ` Joel Fernandes
2020-04-20 18:13 ` Paul E. McKenney
2020-04-20 17:59 ` Paul E. McKenney
2020-04-20 19:06 ` Uladzislau Rezki
2020-04-20 20:17 ` Uladzislau Rezki
2020-04-20 22:16 ` Paul E. McKenney
2020-04-21 1:22 ` Steven Rostedt
2020-04-21 5:18 ` Uladzislau Rezki
2020-04-21 13:30 ` Steven Rostedt
2020-04-21 13:45 ` Uladzislau Rezki
2020-04-21 13:39 ` Sebastian Andrzej Siewior
2020-04-21 15:41 ` Paul E. McKenney
2020-04-21 17:05 ` Sebastian Andrzej Siewior
2020-04-21 18:09 ` Paul E. McKenney
2020-04-22 11:13 ` Sebastian Andrzej Siewior
2020-04-22 13:33 ` Paul E. McKenney
2020-04-22 15:46 ` Sebastian Andrzej Siewior
2020-04-22 16:19 ` Paul E. McKenney
2020-04-22 16:35 ` Paul E. McKenney
2020-04-20 3:02 ` Mike Galbraith
2020-04-20 12:30 ` joel
2020-04-17 16:11 ` Uladzislau Rezki
2020-04-19 12:15 ` Uladzislau Rezki
2020-04-15 16:00 ` [PATCH 2/3] rcu: Use consistent locking around kfree_rcu_drain_unlock() Sebastian Andrzej Siewior
2020-04-15 16:00 ` [PATCH 3/3] rcu: Avoid using xchg() in kfree_call_rcu_add_ptr_to_bulk() Sebastian Andrzej Siewior
2020-04-20 15:23 ` Joel Fernandes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200417084707.GA13555@pc636 \
--to=urezki@gmail.com \
--cc=bigeasy@linutronix.de \
--cc=efault@gmx.de \
--cc=jiangshanlai@gmail.com \
--cc=joel@joelfernandes.org \
--cc=josh@joshtriplett.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=paulmck@kernel.org \
--cc=rcu@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.