From: Tejun Heo <tj@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org, torvalds@linux-foundation.org,
mingo@redhat.com, akpm@linux-foundation.org,
peterz@infradead.org
Subject: Re: [PATCHSET] timer: clean up initializers and implement irqsafe timers
Date: Tue, 14 Aug 2012 12:15:49 -0700 [thread overview]
Message-ID: <20120814191549.GX25632@google.com> (raw)
In-Reply-To: <alpine.LFD.2.02.1208131720070.32033@ionos>
Hello, Thomas.
On Tue, Aug 14, 2012 at 08:55:16PM +0200, Thomas Gleixner wrote:
> > * mod_delayed_work() can't be used from IRQ handlers.
>
> This function does not exist. So what?
It makes the workqueue users messy. It's difficult to get completely
correct and subtle errors are difficult to detect / verify.
> > * __cancel_delayed_work() can't use the usual try_to_grab_pending()
> > which handles all three states but instead only deals with the first
> > state using a separate implementation. There's no way to make a
> > delayed_work not pending from IRQ handlers.
>
> And why is that desired and justifies the mess you are trying to
> create in the timer code?
Because API forcing its users to be messy is stupid.
> > * The context / behavior differences among cancel_delayed_work(),
> > __cancel_delayed_work(), cancel_delayed_work_sync() are subtle and
> > confusing (the first two are mostly historical tho).
>
> We have a lot of subtle differences in interfaces for similar
> reasons.
And we should work to make them better.
> > This patchset implements irqsafe timers. For an irqsafe timer, IRQ is
> > not enabled from dispatch till the end of its execution making it safe
> > to drain the timer regardless of context. This will enable cleaning
> > up delayed_work interface.
>
> By burdening crap on the timer code. We had a similar context case
> handling in the original hrtimers code and Linus went berserk on it.
> There is no real good reason to reinvent it in a different flavour.
>
> Your general approach about workqueues seems to be adding hacks into
> other sensitive code paths [ see __schedule() ]. Can we please stop
> that? workqueues are not so special to justify that.
The schedule thing worked out pretty well, didn't it? If it improves
the kernel in general, I don't see why timer shouldn't participate in
it. Timer ain't that special either. However, it does suck to add
one-off feature which isn't used by anyone else but I couldn't find a
better way.
So, if you can think of something better, sure. Let's see.
> Right now delayed work arms a timer, whose callback enqueues the work
> and wakes the worker thread, which then executes the work.
>
> So what about changing delayed_work into:
>
> struct delayed_work {
> struct work_struct work;
> unsigned long expires;
> };
>
> Now when delayed work gets scheduled it gets enqueued into a separate
> list in the workqueue core with the proper worker lock held. Then
> check the expiry time of the new work against the current expiry time
> of a timer in the worker itself.
Work items aren't assigned to worker on queue. It's a shared worker
pool. Workers take work items when they can.
> If the new expiry time is after the
> current expiry time, nothing to do. If the new expiry is before the
> current expiry time or the timer is not armed, then (re)arm the timer.
>
> When the timer expires it wakes the worker and that evaluates the
> delayed list for expired works and executes them and rearms the timer
> if necessary.
How are you gonna decide which worker a delayed work item should be
queued on? What if the work item before it takes a very long time to
finish? Do we migrate those work items to a different worker?
> To cancel delayed work you don't have to worry about the timer
> callback being executed at all, simply because the timer callback is
> just a wakeup of the worker and not fiddling with the work itself. If
> the work is removed before the worker thread runs, life goes on as
> usual.
>
> So all you have to do is to remove the work from the delayed list. If
> the timer is armed, just leave it alone and let it fire. Canceling
> delayed work is probably not a high frequency operation.
>
> In fact that would make cancel_delayed_work and cancel_work basically
> the same operation.
>
> I have no idea how many concurrent delayed works are on the fly, so I
> can't tell whether a simple ordered list is sufficient or if you need
> a tree which is better suited for a large number of sorted items. But
> that's a trivial to solve detail.
Aside from work <-> worker association confusion, you're basically
suggesting for workqueue to implement its own tvec_base in suboptimal
way. Doesn't make much sense to me.
Thanks.
--
tejun
next prev parent reply other threads:[~2012-08-14 19:15 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-08 18:10 Tejun Heo
2012-08-08 18:10 ` [PATCH 1/4] timer: generalize timer->base flags handling Tejun Heo
2012-08-21 16:40 ` [tip:timers/core] timer: Generalize " tip-bot for Tejun Heo
2012-08-08 18:10 ` [PATCH 2/4] timer: relocate declarations of init_timer_on_stack_key() Tejun Heo
2012-08-21 16:41 ` [tip:timers/core] timer: Relocate " tip-bot for Tejun Heo
2012-08-08 18:10 ` [PATCH 3/4] timer: clean up timer initializers Tejun Heo
2012-08-21 16:42 ` [tip:timers/core] timer: Clean " tip-bot for Tejun Heo
2012-08-08 18:10 ` [PATCH 4/4] timer: implement TIMER_IRQSAFE Tejun Heo
2012-08-21 16:43 ` [tip:timers/core] timer: Implement TIMER_IRQSAFE tip-bot for Tejun Heo
2012-08-21 19:26 ` Tejun Heo
2012-08-08 18:13 ` $SUBJ should have been "[PATCHSET] timer: clean up initializers and implement irqsafe timers" Tejun Heo
2012-08-13 23:35 ` [PATCHSET] timer: clean up initializers and implement irqsafe timers Tejun Heo
2012-08-14 8:32 ` Thomas Gleixner
2012-08-14 19:16 ` Thomas Gleixner
2012-08-14 19:22 ` Tejun Heo
2012-08-14 21:03 ` Thomas Gleixner
2012-08-14 21:56 ` Tejun Heo
2012-08-14 22:45 ` Thomas Gleixner
2012-08-14 23:01 ` Tejun Heo
2012-08-14 23:33 ` Thomas Gleixner
2012-08-15 0:18 ` Tejun Heo
2012-08-15 10:58 ` Peter Zijlstra
2012-08-16 19:36 ` Tejun Heo
2012-08-14 18:55 ` Thomas Gleixner
2012-08-14 19:15 ` Tejun Heo [this message]
2012-08-14 20:43 ` Thomas Gleixner
2012-08-14 21:40 ` Tejun Heo
2012-08-14 23:12 ` Thomas Gleixner
2012-08-14 23:27 ` Tejun Heo
2012-08-14 23:46 ` Thomas Gleixner
2012-08-14 23:52 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120814191549.GX25632@google.com \
--to=tj@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.