From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Morton
Subject: Re: workqueues and percpu (was: [PATCH] dm: remake of the verity target)
Date: Thu, 8 Mar 2012 15:30:48 -0800
Message-ID: <20120308153048.4a80de34.akpm@linux-foundation.org>
References: <1330648393-20692-1-git-send-email-msb@chromium.org>
	<20120306215947.GB27051@google.com>
	<20120308143909.bfc4cb4d.akpm@linux-foundation.org>
	<20120308231521.GA2968@htj.dyndns.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20120308231521.GA2968@htj.dyndns.org>
Sender: linux-kernel-owner@vger.kernel.org
To: Tejun Heo
Cc: Mikulas Patocka, Mandeep Singh Baines, linux-kernel@vger.kernel.org,
	dm-devel@redhat.com, Alasdair G Kergon, Will Drewry, Elly Jones,
	Milan Broz, Olof Johansson, Steffen Klassert, Rusty Russell
List-Id: dm-devel.ids

On Thu, 8 Mar 2012 15:15:21 -0800 Tejun Heo wrote:

> > I'm not sure what we can do about it really, apart from blocking unplug
> > until all the target CPU's workqueues have been cleared.  And/or refusing
> > to unplug a CPU until all pinned-to-that-cpu kernel threads have been
> > shut down or pinned elsewhere (which is the same thing, only more
> > general).
> >
> > Tejun, is this new behaviour?  I do recall that a long time ago we
> > wrestled with unplug-vs-worker-threads and I ended up OK with the
> > result, but I forget what it was.  IIRC Rusty was involved.
>
> Unfortunately, yes, this is a new behavior.  Before, work items could
> cause unbounded delays during unplug.  Now, we have CPU affinity
> assumption breakage.

Ow, didn't know that.

> The behavior change was primarily to allow long-running work items to
> use regular workqueues without worrying about inducing delay across
> cpu hotplug operations, which is important as hotplug is also used for
> suspend/hibernation, especially on mobile platforms.

Well... why did we want to support these long-running work items?
They're abusive, aren't they?  Where are they?

> During the cmwq conversion, I ended up auditing a lot of the workqueue
> users (I think I went through most of them) and IIRC there weren't too
> many which required stable affinity.
>
> > That being said, I don't think it's worth compromising the DM code
> > because of this workqueue wart: lots of other code has the same wart,
> > and we should find a centralised fix for it.
>
> Probably the best way to solve this is introducing a pinned attribute
> to workqueues and having them drained automatically on cpu hotplug
> events.  It'll require auditing workqueue users, but I guess we'll just
> have to do it, given that we need to actually distinguish the ones that
> need to be pinned.

That will make future use of workqueues more complex and people will
screw it up.

> Or maybe we can use explicit queue_work_on() to distinguish the ones
> which require pinning.
>
> Another approach would be requiring all workqueues to be drained on
> cpu offlining and requiring any work item which may stall to use an
> unbound wq.  IMHO, picking out the ones which may stall would be much
> less obvious than the ones which require cpu pinning.

I'd be surprised if it's *that* hard to find and fix the long-running
work items.  Hopefully most of them are already using
create_freezable_workqueue() or create_singlethread_workqueue().

I wonder if there's some debug code we can put in workqueue.c to detect
when a pinned work item takes "too long".
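
Something along these lines, perhaps.  Untested sketch only: the helper
name, the threshold, and the hook point (wrapping the callback
invocation inside process_one_work()) are all invented for illustration,
though the primitives it uses (jiffies, time_after(), jiffies_to_msecs(),
pr_warn()) are the real ones:

/*
 * Hypothetical debug helper for kernel/workqueue.c.  Wrap the work
 * function invocation in process_one_work() with this and complain
 * when a work item hogs its worker for more than a second.  A real
 * version would also check that the worker is actually cpu-bound
 * before warning, so unbound workqueues don't trigger it.
 */
#define WORK_TOO_LONG_JIFFIES	HZ	/* arbitrary threshold: one second */

static void debug_timed_work_fn(work_func_t f, struct work_struct *work)
{
	unsigned long start = jiffies;

	f(work);	/* the actual work item */

	if (time_after(jiffies, start + WORK_TOO_LONG_JIFFIES))
		pr_warn("workqueue: work %pf ran for %ums on CPU %d; "
			"long-running items should use an unbound wq\n",
			f, jiffies_to_msecs(jiffies - start),
			raw_smp_processor_id());
}

It could presumably sit behind an existing debug option such as
CONFIG_DEBUG_OBJECTS_WORK (or a new Kconfig knob) so production builds
don't pay for the timestamping.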