From: Uladzislau Rezki <urezki@gmail.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>,
LKML <linux-kernel@vger.kernel.org>, RCU <rcu@vger.kernel.org>,
Frederic Weisbecker <frederic@kernel.org>,
Neeraj Upadhyay <neeraj.iitr10@gmail.com>,
Joel Fernandes <joel@joelfernandes.org>,
Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Subject: Re: [PATCH v2 4/4] rcu/kvfree: Use a polled API to speedup a reclaim process
Date: Fri, 2 Dec 2022 13:54:17 +0100
Message-ID: <Y4n1eWpLMkaEGxtP@pc636>
In-Reply-To: <20221201234559.GA1520591@paulmck-ThinkPad-P17-Gen-1>
>
> A couple more questions interspersed below upon further reflection.
>
> Thoughts?
>
Please see my thoughts below:
> Thanx, Paul
>
> > ---
> > kernel/rcu/tree.c | 47 +++++++++++++++++++++++++++++++++++++++--------
> > 1 file changed, 39 insertions(+), 8 deletions(-)
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index c94c17194299..44279ca488ef 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2741,11 +2741,13 @@ EXPORT_SYMBOL_GPL(call_rcu);
> > /**
> > * struct kvfree_rcu_bulk_data - single block to store kvfree_rcu() pointers
> > * @list: List node. All blocks are linked between each other
> > + * @gp_snap: Snapshot of RCU state for objects placed to this bulk
> > * @nr_records: Number of active pointers in the array
> > * @records: Array of the kvfree_rcu() pointers
> > */
> > struct kvfree_rcu_bulk_data {
> > struct list_head list;
> > + unsigned long gp_snap;
> > unsigned long nr_records;
> > void *records[];
> > };
> > @@ -2762,13 +2764,15 @@ struct kvfree_rcu_bulk_data {
> > * struct kfree_rcu_cpu_work - single batch of kfree_rcu() requests
> > * @rcu_work: Let queue_rcu_work() invoke workqueue handler after grace period
> > * @head_free: List of kfree_rcu() objects waiting for a grace period
> > + * @head_free_gp_snap: Snapshot of RCU state for objects placed to "@head_free"
> > * @bulk_head_free: Bulk-List of kvfree_rcu() objects waiting for a grace period
> > * @krcp: Pointer to @kfree_rcu_cpu structure
> > */
> >
> > struct kfree_rcu_cpu_work {
> > - struct rcu_work rcu_work;
> > + struct work_struct rcu_work;
> > struct rcu_head *head_free;
> > + unsigned long head_free_gp_snap;
> > struct list_head bulk_head_free[FREE_N_CHANNELS];
> > struct kfree_rcu_cpu *krcp;
> > };
> > @@ -2964,10 +2968,11 @@ static void kfree_rcu_work(struct work_struct *work)
> > struct rcu_head *head;
> > struct kfree_rcu_cpu *krcp;
> > struct kfree_rcu_cpu_work *krwp;
> > + unsigned long head_free_gp_snap;
> > int i;
> >
> > - krwp = container_of(to_rcu_work(work),
> > - struct kfree_rcu_cpu_work, rcu_work);
> > + krwp = container_of(work,
> > + struct kfree_rcu_cpu_work, rcu_work);
> > krcp = krwp->krcp;
> >
> > raw_spin_lock_irqsave(&krcp->lock, flags);
> > @@ -2978,12 +2983,29 @@ static void kfree_rcu_work(struct work_struct *work)
> > // Channel 3.
> > head = krwp->head_free;
> > krwp->head_free = NULL;
> > + head_free_gp_snap = krwp->head_free_gp_snap;
> > raw_spin_unlock_irqrestore(&krcp->lock, flags);
> >
> > // Handle the first two channels.
> > - for (i = 0; i < FREE_N_CHANNELS; i++)
> > + for (i = 0; i < FREE_N_CHANNELS; i++) {
> > + // Start from the tail page, so a GP is likely passed for it.
> > + list_for_each_entry_safe_reverse(bnode, n, &bulk_head[i], list) {
> > + // Not yet ready? Bail out since we need one more GP.
> > + if (!poll_state_synchronize_rcu(bnode->gp_snap))
> > + break;
> > +
> > + list_del_init(&bnode->list);
> > + kvfree_rcu_bulk(krcp, bnode, i);
> > + }
> > +
> > + // Please note a request for one more extra GP can
> > + // occur only once for all objects in this batch.
> > + if (!list_empty(&bulk_head[i]))
> > + synchronize_rcu();
>
> Does directly invoking synchronize_rcu() instead of using queue_rcu_work()
> provide benefits, for example, reduced memory footprint?
>
queue_rcu_work() would delay the freeing of all objects in a batch. We
can make use of it, but only for the objects that still require a grace
period. Both the memory footprint and the reclaim latency depend on when
our callback is invoked by the RCU core to queue the reclaim work.
That time can be long, because it depends on many factors:
- scheduling delays in waking the GP kthread;
- scheduling delays in kicking the nocb kthread;
- delays while waiting in a "cblist";
- dequeuing and invoking f(rhp);
- delays in waking our final reclaim work and giving it CPU time.
This patch combines both: it reclaims ASAP the objects that have already
passed a grace period and requests one more GP for the ones that have
not passed it yet.
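
To illustrate the combined approach, here is a minimal user-space sketch.
It is not the kernel code: the toy_* helpers are hypothetical stand-ins
for get_state_synchronize_rcu(), poll_state_synchronize_rcu() and
synchronize_rcu(), and the grace-period counter is a plain integer rather
than the kernel's real encoding. The array is assumed to be ordered
oldest block first.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy grace-period counter (assumption: not the kernel's encoding). */
static unsigned long toy_gp_completed;

/* Stand-in for get_state_synchronize_rcu(): the GP that must complete. */
static unsigned long toy_get_state(void)
{
	return toy_gp_completed + 1;
}

/* Stand-in for poll_state_synchronize_rcu(): has that GP elapsed? */
static bool toy_poll_state(unsigned long snap)
{
	return toy_gp_completed >= snap;
}

/* Stand-in for synchronize_rcu(): one more full GP elapses. */
static void toy_synchronize(void)
{
	toy_gp_completed++;
}

struct toy_block {
	unsigned long gp_snap;	/* snapshot taken when the block was filled */
	bool freed;
};

/*
 * Free blocks oldest first. Returns how many blocks were freed without
 * waiting; *waited tells whether one extra grace period was needed.
 */
static size_t toy_reclaim(struct toy_block *blk, size_t n, bool *waited)
{
	size_t i, early = 0;

	*waited = false;

	/* Free every block whose grace period has already elapsed. */
	for (i = 0; i < n; i++) {
		if (!toy_poll_state(blk[i].gp_snap))
			break;
		blk[i].freed = true;
		early++;
	}

	/* At most one extra grace period covers everything that is left. */
	if (i < n) {
		toy_synchronize();
		*waited = true;
		for (; i < n; i++)
			blk[i].freed = true;
	}

	return early;
}
```

In this model the old blocks are released immediately, and only the
newer ones pay for the extra wait.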
>
> If not, it would be good to instead use queue_rcu_work() in order
> to avoid an unnecessary context switch in this workqueue handler.
>
I took the easiest route from a code perspective, since I do not see
problems with the current approach, either from testing or from my own
point of view.
If we go that way, I need to add extra logic to split the pointers into
ready and not-ready ones: direct reclaim for the former and
queue_rcu_work() for the rest.
I can check how it goes.
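
A rough user-space sketch of what such a split could look like is given
below. It is only an illustration under stated assumptions: the
toy_ptr_block type, the toy_gp_elapsed() check and the integer
"completed" counter are all hypothetical and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_ptr_block {
	unsigned long gp_snap;	/* grace-period snapshot for this block */
	int id;			/* only here to make the result visible */
};

/* Hypothetical check: ready once "completed" has reached the snapshot. */
static bool toy_gp_elapsed(unsigned long completed, unsigned long gp_snap)
{
	return completed >= gp_snap;
}

/*
 * Partition blk[] in place: ready blocks are moved to the front and keep
 * their relative order, the rest may get reordered. Returns the number
 * of ready blocks, so blk[0..ready) can be reclaimed directly and
 * blk[ready..n) would be handed over to a deferred path.
 */
static size_t toy_split(struct toy_ptr_block *blk, size_t n,
			unsigned long completed)
{
	size_t i, ready = 0;

	for (i = 0; i < n; i++) {
		if (toy_gp_elapsed(completed, blk[i].gp_snap)) {
			struct toy_ptr_block tmp = blk[ready];

			blk[ready] = blk[i];
			blk[i] = tmp;
			ready++;
		}
	}

	return ready;
}
```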
>
> My concern is that an RCU CPU stall might otherwise end up tying up more
> workqueue kthreads as well as more memory.
>
There is a limit. We have two batches, one work for each. Suppose the
reclaim kthread is stuck in synchronize_rcu() so it does not make any
progress. In this case the same work can only be in the pending state
and nothing more, no matter how many times queue_work() is invoked,
so the number of works is bounded by:
2 * num_possible_cpus();
If we end up in an RCU stall, we will not be able to reclaim anyway.
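
The bound can be modelled with a small user-space sketch. This is a toy
model, not the workqueue implementation: toy_work, toy_queue_work() and
TOY_N_BATCHES are hypothetical names, and the only property taken from
the real queue_work() is that it returns false when the work is already
pending.

```c
#include <stdbool.h>

#define TOY_N_BATCHES 2	/* "two batches, one work for each" */

struct toy_work {
	bool pending;
};

/* Queueing an already pending work is a no-op and returns false. */
static bool toy_queue_work(struct toy_work *w)
{
	if (w->pending)
		return false;

	w->pending = true;
	return true;
}

/* Upper bound on reclaim works for a given number of possible CPUs. */
static unsigned int toy_max_works(unsigned int nr_cpus)
{
	return TOY_N_BATCHES * nr_cpus;
}
```

However many times the same work is queued while it is pending, it is
counted once, so the total cannot grow beyond the per-CPU batch count.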
--
Uladzislau Rezki