From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Lai Jiangshan <jiangshanlai+lkml@gmail.com>
Cc: Tejun Heo <tj@kernel.org>,
torvalds@linux-foundation.org, Jann Horn <jannh@google.com>,
bcrl@kvack.org, viro@zeniv.linux.org.uk,
Kent Overstreet <kent.overstreet@gmail.com>,
security@kernel.org, LKML <linux-kernel@vger.kernel.org>,
kernel-team@fb.com
Subject: Re: [PATCH 7/7] RCU, workqueue: Implement rcu_work
Date: Thu, 8 Mar 2018 09:28:18 -0800
Message-ID: <20180308172818.GN3918@linux.vnet.ibm.com>
In-Reply-To: <CAJhGHyDEpXG88uDHiogxUH0Yp5sU3v1p-kUSvJbGYCZ+HGucvQ@mail.gmail.com>
On Thu, Mar 08, 2018 at 08:29:53AM +0800, Lai Jiangshan wrote:
> On Wed, Mar 7, 2018 at 10:54 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Wed, Mar 07, 2018 at 10:49:49AM +0800, Lai Jiangshan wrote:
> >> On Wed, Mar 7, 2018 at 1:33 AM, Tejun Heo <tj@kernel.org> wrote:
> >>
> >> > +/**
> >> > + * queue_rcu_work_on - queue work on specific CPU after a RCU grace period
> >> > + * @cpu: CPU number to execute work on
> >> > + * @wq: workqueue to use
> >> > + * @rwork: work to queue
> >>
> >> For many people, "RCU grace period" is clear enough, but not for all of them.
> >>
> >> So please make it a little clearer that this just queues work after
> >> a *normal* RCU grace period; it supports only that one RCU variant.
> >>
> >>
> >> > + *
> >> > + * Return: %false if @work was already on a queue, %true otherwise.
> >> > + */
> >>
> >> I'm afraid this will be a hard-to-use API.
> >>
> >> The user has no plan B when it returns false, especially when
> >> the user expects the work to be called at least once more
> >> after an RCU grace period.
> >>
> >> And the error-prone part is that, like the other queue_work() functions,
> >> its return value is often ignored, which makes the problem worse.
> >>
> >> So, since workqueue.c provides this API, it should handle this
> >> problem itself, for example by calling call_rcu() again in this case.
> >> But everything would then become much more complex: synchronization
> >> is needed around "calling call_rcu() again", and allowing the work
> >> item to run twice after the last queue_rcu_work() is not workqueue style.
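To make the concern concrete, the pattern in question is roughly the
following sketch (illustrative only; my_rwork is made up, and I am
assuming the patchset's unbound queue_rcu_work() wrapper):

	if (!queue_rcu_work(system_wq, &my_rwork)) {
		/*
		 * Already pending: the pending execution was set up before
		 * this call, so it is not guaranteed to be separated from
		 * "now" by a full grace period, and the PENDING bit keeps
		 * us from queueing it again here.  There is no obvious
		 * plan B for a caller that needs "one more execution after
		 * this point plus a grace period".
		 */
	}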
> >
> > I confess that I had not thought of this aspect, but doesn't the
> > rcu_barrier() in v2 of this patchset guarantee that it has passed
> > the RCU portion of the overall wait time? Given that, what I am
> > missing is how this differs from flush_work() on a normal work item.
> >
> > So could you please show me the sequence of events that leads to this
> > problem? (Again, against v2 of this patch, which replaces the
> > synchronize_rcu() with rcu_barrier().)
>
> I mentioned a subtle use case that a user might think is supported,
> since the comment doesn't disallow it.
>
> It is clear that the user expects:
> the work must be called at least once after the API returns
I would have said "before the API returns" rather than "after the
API returns". What am I missing here?
> the work must be called after an RCU grace period
>
> But in the case where the user expects the work to be called
> at least once more after "queue_rcu_work() + an RCU grace period",
> the API cannot provide that if the work is already queued.
> Although the user can detect this from the return value of
> queue_rcu_work(), the user can hardly make that expectation
> hold by adding appropriate code.
Beyond a certain point, I need to defer to Tejun on this, but I thought
that a false return meant that the work executed before the flush call.
Here I am assuming that the caller knows that the queue_rcu_work()
already happened and that another queue_rcu_work() won't be invoked on
that same work item until the flush completes.  Without this sort of
assumption, I agree that there could be problems, but then it seems to
me that you would have exactly the same problems with the non-RCU APIs.
What am I missing?
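For concreteness, the caller-side protocol I am assuming looks roughly
like this (a sketch only: it assumes the unbound queue_rcu_work()
wrapper and that the v2 flush helper -- call it flush_rcu_work() --
does rcu_barrier() followed by flush_work(); my_rwork and my_teardown()
are made up):

	static struct rcu_work my_rwork;	/* INIT_RCU_WORK() done elsewhere */

	static void my_teardown(struct workqueue_struct *wq)
	{
		/*
		 * The caller guarantees that no other queue_rcu_work() is
		 * invoked on &my_rwork until the flush below returns.
		 */
		queue_rcu_work(wq, &my_rwork);

		/*
		 * rcu_barrier() waits for the call_rcu() callback that
		 * queues the work, and flush_work() then waits for the work
		 * itself, so the work function has run by the time this
		 * returns.
		 */
		flush_rcu_work(&my_rwork);
	}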
Thanx, Paul
> >> Some would argue that delayed_work has the same problem
> >> when the user expects the work to be called at least once more
> >> after a period of time. But a time interval is easy to detect: the user
> >> can check the time and call queue_delayed_work() again
> >> when needed, which is also a frequent design pattern. And
> >> for RCU it is hard to use this design pattern, since it is hard
> >> to detect a (new) RCU grace period without using call_rcu().
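(For reference, the delayed_work re-arm pattern being described is
roughly the following sketch; my_timeout_fn() and my_deadline are made
up for the example:)

	static unsigned long my_deadline;	/* maintained by the producer side */

	static void my_timeout_fn(struct work_struct *work)
	{
		if (time_before(jiffies, my_deadline)) {
			/* Not enough time has passed yet: check the clock and requeue. */
			queue_delayed_work(system_wq, to_delayed_work(work),
					   my_deadline - jiffies);
			return;
		}
		/* ... enough time really has passed, do the expiry handling ... */
	}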
> >>
> >> I would not provide this API. It is not a NACK; I'm just trying
> >> to express my thinking about the API. I'd rather RCU be changed
> >> so that RCU callbacks become sleepable. But a complete
> >> overhaul of the whole source tree would be needed first for
> >> compatibility, an even more complex job.
> >
> > One downside of allowing RCU callback functions to sleep is that
> > one poorly written callback can block a bunch of other ones.
> > One advantage of Tejun's approach is that such delays only affect
> > the workqueues, which are already designed to handle such delays.
> >
> > Thanx, Paul
> >
> >> > +bool queue_rcu_work_on(int cpu, struct workqueue_struct *wq,
> >> > + struct rcu_work *rwork)
> >> > +{
> >> > + struct work_struct *work = &rwork->work;
> >> > +
> >> > + if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
> >> > + rwork->wq = wq;
> >> > + rwork->cpu = cpu;
> >> > + call_rcu(&rwork->rcu, rcu_work_rcufn);
> >> > + return true;
> >> > + }
> >> > +
> >> > + return false;
> >> > +}
> >> > +EXPORT_SYMBOL(queue_rcu_work_on);
> >> > +
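(rcu_work_rcufn() itself is not quoted above; presumably it is the
call_rcu() callback that queues the work once the grace period has
elapsed, along the lines of the following sketch -- a guess at its
shape, not the exact patch:)

	static void rcu_work_rcufn(struct rcu_head *rcu)
	{
		struct rcu_work *rwork = container_of(rcu, struct rcu_work, rcu);

		/* Grace period has elapsed; hand the work off to the workqueue. */
		local_irq_disable();
		__queue_work(rwork->cpu, rwork->wq, &rwork->work);
		local_irq_enable();
	}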
> >>
> >
>
Thread overview: 24+ messages
2018-03-06 17:26 [PATCHSET] percpu_ref, RCU: Audit RCU usages in percpu_ref users Tejun Heo
2018-03-06 17:33 ` [PATCH 1/7] fs/aio: Add explicit RCU grace period when freeing kioctx Tejun Heo
2018-03-06 17:33 ` [PATCH 2/7] fs/aio: Use RCU accessors for kioctx_table->table[] Tejun Heo
2018-03-06 17:33 ` [PATCH 3/7] RDMAVT: Fix synchronization around percpu_ref Tejun Heo
2018-03-07 15:39 ` Dennis Dalessandro
2018-03-06 17:33 ` [PATCH 4/7] HMM: Remove superflous RCU protection around radix tree lookup Tejun Heo
2018-03-06 17:59 ` Jerome Glisse
2018-03-06 17:33 ` [PATCH 5/7] block: Remove superflous rcu_read_[un]lock_sched() in blk_queue_enter() Tejun Heo
2018-03-06 17:52 ` Bart Van Assche
2018-03-14 18:46 ` tj
2018-03-14 20:05 ` Bart Van Assche
2018-03-14 20:08 ` Peter Zijlstra
2018-03-14 20:14 ` Bart Van Assche
2018-03-06 17:33 ` [PATCH 6/7] percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods Tejun Heo
2018-03-06 17:33 ` [PATCH 7/7] RCU, workqueue: Implement rcu_work Tejun Heo
2018-03-06 18:30 ` Linus Torvalds
2018-03-09 15:37 ` Tejun Heo
2018-03-07 2:49 ` Lai Jiangshan
2018-03-07 14:54 ` Paul E. McKenney
2018-03-07 16:23 ` Peter Zijlstra
2018-03-07 17:58 ` Paul E. McKenney
2018-03-08 0:29 ` Lai Jiangshan
2018-03-08 17:28 ` Paul E. McKenney [this message]
2018-03-09 16:21 ` Tejun Heo