From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Jarek Poplawski <jarkao2@o2.pl>,
Max Krasnyansky <maxk@qualcomm.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] workqueues: insert_work: use "list_head *" instead of "int tail"
Date: Fri, 13 Jun 2008 17:32:12 +0200 [thread overview]
Message-ID: <1213371132.16944.49.camel@twins> (raw)
In-Reply-To: <20080613151725.GA9194@tv-sign.ru>
On Fri, 2008-06-13 at 19:17 +0400, Oleg Nesterov wrote:
> On 06/13, Peter Zijlstra wrote:
> >
> > On Fri, 2008-06-13 at 18:26 +0400, Oleg Nesterov wrote:
> > >
> > > Yes, we have a free bit... but afaics we can do better.
> > >
> > > struct context_barrier {
> > > struct work_struct work;
> > > struct flush_context *fc;
> > > ...
> > > }
> > >
> > > void context_barrier_barrier_func(struct work_struct *work)
> > > {
> > > struct flush_context *fc = container_of();
> > > if (atomic_dec_and_test())
> > > ...
> > > }
> > >
> > > void insert_context_barrier(work, barr)
> > > {
> > > ...insert barr after work, like flush_work() does...
> > > }
> > >
> > > queue_work_contex(struct workqueue_struct *wq,
> > > struct work_struct *work,
> > > struct flush_context *fc)
> > > {
> > > int ret = queue_work(wq, work);
> > > if (ret)
> > > insert_context_barrier(work, barr);
> > > return ret;
> > > }
> > >
> > > this way we shouldn't change run_workqueue() and introduce a "parallel"
> > > larger work_struct which needs its own INIT_()/etc.
> >
> > Where does do the context_barrier instances come from, are they
> > allocated in insert_context_barrier() ?
>
> Ah, indeed. We have to kmalloc() or use a larger work_struct... And if we
> use a larger work_struct, we can make
>
> struct work_struct_context
> {
> struct work_struct work;
> work_func_t real_func;
> struct flush_context *fc;
> };
>
> static void work_context_func(struct work_struct *work)
> {
> struct work_struct_context *wc = container_of();
> struct flush_context *fc = wc->context;
>
> wc->real_func(work);
>
> if (atomic_dec_and_test(fc->count))
> ...
> }
>
> to avoid changing run_workqueue(), but this is nasty. Perhaps flush_context
> can be just array or list of queued work_structs, then we can do flush_work()
> for each work_struct...
list would grow work_struct with a list_head, and sizeof(list_head) >
sizeof(conetxt *), and an array would require krealloc().
Neither sounds really appealing..
Anyway, I think before we go further down this road, we'd better see if
anybody actually needs this. Not that theorizing about this problem
isn't fun,... but... :-)
> > > > making all this PI savvy for -rt is going to be fun though.. I guess we
> > > > can just queue a normal barrier of the flusher's priority, and cancel it
> > > > once we complete.. hey - that doesn't sound hard at all :-)
> > >
> > > Yes!!! I think this is much better (because _much_ simple) than re-ordering
> > > the pending work_struct's, we can just boost the whole ->worklist. We can
> > > implement flush_work_pi() in the same manner as queue_work_contex() above.
> > > That is why I said previously that flush_() should govern the priority,
> > > not queue.
> > >
> > > But we can also implement queue_work_pi(struct work_struct_pi *work).
> >
> > in -rt all the workqueue stuff is PI already, we replaced the list_head
> > in work_struct with a plist_node and queue_work() enqueues worklets at
> > the ->normal_prio of the calling task.
>
> Oh well. IIRC, these patches really uglified^Wcompicated the code. Don't
> get me wrong, if I knew how to make this better, I'd sent the patch ;)
>
> > Hmm, the required barrier might still spoil - or at least complicate the
> > matter.
> >
> > In order to boost the pending work, we'd need to enqueue a barrer before
> > the high prio worklet - otherwise the high prio worklet will complete
> > before the work we're waiting on.
>
> flush_work_pi() can boost the priority, and then barrier restores it.
> Can't this work?
>
> > And we can cancel the high prio worklet once all our worklets of
> > interrest are done, but we cannot cancel the barrier.
>
> Yes, if we cancel the high prio worklet, this doesn't restore the prio
> immediately.
/me reads back that -rt barrier code,... *groan* head explodes.. I'm
sure we can make it drop the prio on cancel, but I'd have to take a
proper look at it.
But getting rid of the barrier will be very tricky. So it will always
have some side-effects..
next prev parent reply other threads:[~2008-06-13 15:32 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-12 16:51 [PATCH] workqueues: insert_work: use "list_head *" instead of "int tail" Oleg Nesterov
2008-06-12 16:55 ` Oleg Nesterov
2008-06-12 17:01 ` Peter Zijlstra
2008-06-12 17:44 ` Oleg Nesterov
2008-06-12 18:38 ` Peter Zijlstra
2008-06-13 14:26 ` Oleg Nesterov
2008-06-13 14:43 ` Peter Zijlstra
2008-06-13 15:17 ` Oleg Nesterov
2008-06-13 15:32 ` Peter Zijlstra [this message]
2008-06-24 5:41 ` Max Krasnyansky
2008-06-12 22:24 ` Jarek Poplawski
2008-06-13 10:13 ` Jarek Poplawski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1213371132.16944.49.camel@twins \
--to=peterz@infradead.org \
--cc=akpm@linux-foundation.org \
--cc=jarkao2@o2.pl \
--cc=linux-kernel@vger.kernel.org \
--cc=maxk@qualcomm.com \
--cc=oleg@tv-sign.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.