From: Tejun Heo <tj@kernel.org>
To: Roland Dreier <rdreier@cisco.com>
Cc: linux-kernel@vger.kernel.org, Roland Dreier <rolandd@cisco.com>,
Sean Hefty <sean.hefty@intel.com>,
Hal Rosenstock <hal.rosenstock@gmail.com>
Subject: Re: [PATCH 01/30] infiniband: update workqueue usage
Date: Thu, 16 Dec 2010 17:50:58 +0100
Message-ID: <4D0A4372.2010503@kernel.org>
In-Reply-To: <adad3p2hony.fsf@cisco.com>

Hello, Roland. Sorry about the delay.

On 12/15/2010 07:33 PM, Roland Dreier wrote:
> Thanks Tejun. A couple questions:
>
> > * ib_wq is added, which is used as the common workqueue for infiniband
> > instead of the system workqueue. All system workqueue usages
> > including flush_scheduled_work() callers are converted to use and
> > flush ib_wq. This is to prepare for deprecation of
> > flush_scheduled_work().
>
> Why do we want to move to a subsystem-specific workqueue? Can we just
> replace flush_scheduled_work() by cancel_delayed_work_sync() as
> appropriate and not create yet another work queue?

Because there are places where work is used to free the containing
structure. Before a module is unloaded, all works which use
functions in the module should be flushed; however, if a work is used
to free the containing structure, such work can't be flushed
explicitly, so the workqueue which processes such works should be
flushed instead.

So, in this case, ib_wq is added primarily to serve as a flush domain.
For driver midlayers, this often seems necessary. Also, the workqueue
doesn't have any dedicated worker and is quite cheap.
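
To illustrate the flush domain point, here is a minimal userspace
sketch in plain C. It is not kernel code: struct work, struct wq,
wq_queue(), wq_flush() and struct conn are made-up stand-ins for
work_struct, a workqueue, queue_work(), flush_workqueue() and some
driver object, and everything runs synchronously at flush time. It
only shows why a self-freeing work item can't be flushed through its
own pointer while flushing the whole queue still works.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's work_struct. */
struct work {
	void (*func)(struct work *w);
	struct work *next;
};

/* Hypothetical stand-in for a workqueue: a FIFO of pending work. */
struct wq {
	struct work *head;
	struct work **tail;
};

static void wq_init(struct wq *q)
{
	q->head = NULL;
	q->tail = &q->head;
}

static void wq_queue(struct wq *q, struct work *w)
{
	assert(w->func != NULL);
	w->next = NULL;
	*q->tail = w;
	q->tail = &w->next;
}

/* Run everything pending; returns the number of items executed. */
static int wq_flush(struct wq *q)
{
	int ran = 0;

	while (q->head) {
		struct work *w = q->head;

		/* Unlink first: func() may free the memory w lives in. */
		q->head = w->next;
		if (!q->head)
			q->tail = &q->head;
		w->func(w);
		ran++;
	}
	return ran;
}

/* An object which embeds the work item used to free it. */
struct conn {
	int id;
	struct work free_work;
};

static int nr_freed;

static void conn_free_fn(struct work *w)
{
	/* Open-coded container_of(): recover the enclosing object. */
	struct conn *c = (struct conn *)((char *)w -
					 offsetof(struct conn, free_work));

	nr_freed++;
	free(c);
}

/*
 * Queue @c for release. The caller must not touch @c afterwards, so
 * the only safe way to wait for the release is to flush @q.
 */
static void conn_put(struct wq *q, struct conn *c)
{
	c->free_work.func = conn_free_fn;
	wq_queue(q, &c->free_work);
}
```

After conn_put(), nobody holds a pointer which is safe to pass to a
per-work flush, so module exit has to do the equivalent of wq_flush()
on the queue itself. That is the role ib_wq plays here.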
>
> > * qib_wq is removed and ib_wq is used instead.
>
> You obviously looked at the comment
>
> - /*
> - * We create our own workqueue mainly because we want to be
> - * able to flush it when devices are being removed. We can't
> - * use schedule_work()/flush_scheduled_work() because both
> - * unregister_netdev() and linkwatch_event take the rtnl lock,
> - * so flush_scheduled_work() can deadlock during device
> - * removal.
> - */
> - qib_wq = create_workqueue("qib");
>
> and know that with the new workqueue stuff, this issue no longer
> exists. But for both my education and also the clarity of the changelog
> for this patch, perhaps you could expand on why ib_wq is safe here.

I think I got confused. I thought the comment was indicating the
separation between qib_wq and qib_cq_wq. It's between system_wq and
qib_wq, right? I'll drop this part from the series, but then again
what's the difference from ib_srp, which flushes the common workqueue?
Why doesn't ib_srp have the same problem?

> > * create[_singlethread]_workqueue() usages are replaced with the new
> > alloc[_ordered]_workqueue(). This removes rescuers from all
> > infiniband workqueues.
>
> What are rescuers?

Normally, all workqueues share the global per-cpu worker pools, but
certain workqueues need a forward progress guarantee under memory
pressure (the ones which are used to free memory). In this case, the
workqueues are created with WQ_MEM_RECLAIM and each has a single
rescuer worker reserved. So, any workqueue which is in the memory
reclaim path needs to have the flag set to avoid the unlikely but
still possible deadlock under memory pressure.
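
As a rough picture of what the rescuer buys, here is a toy userspace
model in plain C. Everything in it is an assumption made up for
illustration: struct pool, struct wq, has_rescuer (standing in for
WQ_MEM_RECLAIM) and wq_run_one() are not kernel interfaces, and the
real worker management is considerably more involved. It only shows
that when the shared pool has no usable worker and can't create one,
a queue with a reserved rescuer still makes progress while a plain
one stalls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the shared worker pool. */
struct pool {
	int idle_workers;	/* workers ready to run something */
	bool can_spawn;		/* false while memory is too tight */
};

/* Hypothetical model of a workqueue. */
struct wq {
	int pending;		/* queued work items */
	int done;		/* executed work items */
	bool has_rescuer;	/* models WQ_MEM_RECLAIM */
};

enum runner { RUN_NONE, RUN_POOL, RUN_RESCUER };

/* Try to execute one pending item and report who ran it. */
static enum runner wq_run_one(struct wq *q, struct pool *p)
{
	enum runner who;

	assert(q->pending >= 0);
	if (!q->pending)
		return RUN_NONE;

	if (p->idle_workers > 0) {
		who = RUN_POOL;
	} else if (p->can_spawn) {
		/* Create a worker; it goes back to idle after the item. */
		p->idle_workers++;
		who = RUN_POOL;
	} else if (q->has_rescuer) {
		/* The reserved worker needs no allocation to get going. */
		who = RUN_RESCUER;
	} else {
		/* Stuck until somebody else frees memory. */
		return RUN_NONE;
	}

	q->pending--;
	q->done++;
	return who;
}
```

If the stalled queue is the one which is supposed to free memory,
nothing ever unblocks it, which is the deadlock the flag avoids.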
> Can we replace some of these driver-specific work queues by the ib_wq?
>
> Are all these things just possibilities for future cleanup?

Hmm... Yeah, sure, they can be. With the new implementation, separate
workqueues are used for the following purposes.

* As a forward progress guarantee domain as described above.

* As a flushing domain.

* As a property domain. Different workqueues have different execution
  and queueing properties set.

Unless one of the above is necessary, work items can be queued
together into the same workqueue. Concurrency-wise, it wouldn't make
any difference as they all use the same set of workers anyway.
However, I don't know the code well enough to make the changes myself.
If you're interested in doing it, I'll be happy to help.

Thanks.
--
tejun