From mboxrd@z Thu Jan 1 00:00:00 1970 From: Steven Whitehouse Date: Thu, 09 Sep 2010 15:06:17 +0100 Subject: [Cluster-devel] GFS2: Use new workqueue scheme In-Reply-To: <4C88E5BD.6070400@kernel.org> References: <1284035767.2468.15.camel@localhost> <4C88DEB9.90600@kernel.org> <1284039908.2468.39.camel@localhost> <4C88E5BD.6070400@kernel.org> Message-ID: <1284041177.2468.45.camel@localhost> List-Id: To: cluster-devel.redhat.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Hi, On Thu, 2010-09-09 at 15:48 +0200, Tejun Heo wrote: > Hello, > > On 09/09/2010 03:45 PM, Steven Whitehouse wrote: > > On Thu, 2010-09-09 at 15:18 +0200, Tejun Heo wrote: > >> Hello, Steven. > >> > >> Thanks for working on this. > >> > > I think it will be a big win for GFS2, particularly as the number of cpu > > cores increases > > Awesome. :-) > > >>> - glock_workqueue = create_workqueue("glock_workqueue"); > >>> + glock_workqueue = alloc_workqueue("glock_workqueue", WQ_RESCUER | > >>> + WQ_HIGHPRI | WQ_CPU_INTENSIVE | > >>> + WQ_FREEZEABLE, 0); > >> > >> Does this really need WQ_HIGHRPI and WQ_CPU_INTENSIVE? > >> > > This would be a tasklet were it not for the fact that it needs to be > > able to submit block I/O from time to time. It does need to be as fast > > as possible since it directly affects the latency of operations using > > large numbers of inodes. > > > > I read your latest set of docs before assigning the flags, so I hope > > I've understood it correctly. > > > > The glock workqueue is involved in sending requests to the DLM and > > processing the results of those requests, waking up waiting processes as > > quickly as possible. > > I see but then wouldn't WQ_CPU_INTENSIVE be unnecessary? It's high > priority but doesn't sound like they're gonna hog huge amount of CPU > cycles. Also, please note that the high priority is global across all > workqueues and thus _must_ be used judiciously. Well, if you were > gonna use tasklets for it, it probably is a good candidate tho. > Ah, I see. Maybe I misunderstood. I read the bit about using both WQ_HIGHPRO and WQ_CPU_INTENSIVE which says: "Work items queued on a highpri CPU-intensive wq start execution as soon as resources are available and don't affect execution of other work items." and assumed that was what I needed, but maybe I don't need to WQ_CPU_INTENSIVE as you suggest. > >>> gfs_recovery_wq = alloc_workqueue("gfs_recovery", > >>> - WQ_NON_REENTRANT | WQ_RESCUER, 0); > >>> + WQ_NON_REENTRANT | WQ_RESCUER | > >>> + WQ_UNBOUND | WQ_FREEZEABLE, 0); > >> > >> And do these need to be WQ_UNBOUND? Unless the flags are specifically > >> needed, I think it would be better to stick with the default. I'm > >> currently working on the documentation. It's still not complete but > >> please take a look for more information the behaviors of each flag. > > > > I wouldn't say that it was 100% a requirement, but they are long running > > (potentially a few seconds, or even as far as a minute or two in extreme > > cases). The recovery workqueue seems to meet this criteria: > > Long running doesn't matter. Normal workqueues can handle them > perfectly fine. The only cases you would want to use unbound > workqueues are long running CPU hogs and (very) high fluctuation in > the number of concurrent work items. > It sounds like maybe the delete workqueue needs that, but that the recovery one certainly doesn't in that case. > >> * Long running CPU intensive workloads which can be better > >> managed by the system scheduler. > > > > and the delete_workqueue seems to meet this criteria: > > > >> * Wide fluctuation in the concurrency level requirement is > >> expected and using bound wq may end up creating large number > >> of mostly unused workers across different CPUs as the issuer > >> hops through different CPUs. > > > > It may be that I didn't understand the docs correctly, but I think I've > > found the right flags. The delete_workqueue is usually unused during > > normal fs operation, but occasionally it might have a lot to do. It was > > made a separate workqueue because it needs to be able to manipulate > > glocks and thus must never block the glock workqueue. > > Heh, these being one of the first conversions, I just wanna make sure. > Long running CPU-intensive tasks would be things like works running > RAID checksums, crypto stuff, IOW, stuff which are actually gonna > perform a long calculation. If a work is just gonna be blocking on > locks for long period of time, there's no need to use the unbound > ones. So, unless I'm misunderstanding, I don't really think > WQ_UNBOUND is necessary for the latter two. > > Thanks. > Yes, I'll try it without and see if that is ok. I am also trying to be a bit cautious about the flags in case I accidentally introduce some dependency which was not there before. I'll follow up with an updated patch shortly, Steve.