* [PATCH] add unschedule_delayed_work to the workqueue API
@ 2004-10-18 16:31 James Bottomley
2004-10-18 21:25 ` Andrew Morton
0 siblings, 1 reply; 10+ messages in thread
From: James Bottomley @ 2004-10-18 16:31 UTC (permalink / raw)
To: Linux Kernel; +Cc: mingo
I'm in the process of moving some of our scsi timers which do more work
than just a few lines of code into schedule_work() instead. The problem
is that the workqueue API lacks the equivalent of del_timer_sync().
This patch adds it as unschedule_delayed_work() (and also adds the
unschedule_work() API---it's probably much less useful, but
unschedule_delayed_work() is best constructed on top of it).
The idea is that unschedule_delayed_work() will return 0 if the work has
already executed and 1 if it hasn't. It is guaranteed that either the
work was removed or has been executed by the time
unschedule_delayed_work() returns (and thus it needs process context to
wait for the work function if necessary).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
James
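For illustration, here is a minimal sketch of how a driver might call the proposed API. The structure and handler names (my_dev, my_recovery_handler) are hypothetical, not part of the patch; this only shows the intended call pattern and return-value semantics described above:

```c
/* Hypothetical caller of the proposed unschedule_delayed_work().
 * In the 2.6.9-era API a work_struct embeds its own timer and the
 * handler takes a void *data argument. */
struct my_dev {
	struct work_struct recovery_work;
};

static void my_recovery_handler(void *data)
{
	/* ... possibly long-running recovery work ... */
}

static void my_dev_shutdown(struct my_dev *dev)
{
	/*
	 * Must be called from process context: unschedule_delayed_work()
	 * may sleep if the handler is executing right now.
	 */
	if (unschedule_delayed_work(&dev->recovery_work)) {
		/* Returned 1: work was removed before it ever ran. */
	} else {
		/* Returned 0: the handler has run (we waited if needed). */
	}
	/* Either way, the handler is guaranteed not to be running here. */
}
```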
===== include/linux/workqueue.h 1.9 vs edited =====
--- 1.9/include/linux/workqueue.h 2004-08-23 03:14:58 -05:00
+++ edited/include/linux/workqueue.h 2004-10-18 10:01:54 -05:00
@@ -69,6 +69,11 @@
extern int current_is_keventd(void);
extern int keventd_up(void);
+extern int remove_work(struct workqueue_struct *wq, struct work_struct *work);
+extern int remove_delayed_work(struct workqueue_struct *wq, struct work_struct *work);
+extern int unschedule_work(struct work_struct *work);
+extern int unschedule_delayed_work(struct work_struct *work);
+
extern void init_workqueues(void);
/*
===== kernel/workqueue.c 1.28 vs edited =====
--- 1.28/kernel/workqueue.c 2004-09-08 01:33:10 -05:00
+++ edited/kernel/workqueue.c 2004-10-18 09:55:09 -05:00
@@ -81,6 +81,7 @@
spin_lock_irqsave(&cwq->lock, flags);
work->wq_data = cwq;
+ clear_bit(1, &work->pending);
list_add_tail(&work->entry, &cwq->worklist);
cwq->insert_sequence++;
wake_up(&cwq->more_work);
@@ -170,6 +171,7 @@
BUG_ON(work->wq_data != cwq);
clear_bit(0, &work->pending);
f(data);
+ set_bit(1, &work->pending);
spin_lock_irqsave(&cwq->lock, flags);
cwq->remove_sequence++;
@@ -517,13 +519,94 @@
BUG_ON(!keventd_wq);
}
+/**
+ * remove_work - try to remove a piece of queued work from a workqueue
+ * @wq: the workqueue to remove it from
+ * @work: the item of work queued on the workqueue
+ *
+ * Tries to remove the work before it is executed. Guarantees that
+ * when it returns either the work is removed or it has been executed.
+ * Requires process context since it may sleep if the work function is
+ * executing.
+ *
+ * Returns 1 if work was successfully removed without executing and 0
+ * if the work was executed.
+ */
+int remove_work(struct workqueue_struct *wq, struct work_struct *work)
+{
+ int ret = 1, cpu = get_cpu();
+ struct cpu_workqueue_struct *cwq;
+ unsigned long flags;
+
+ if (unlikely(is_single_threaded(wq)))
+ cpu = 0;
+
+ cwq = &wq->cpu_wq[cpu];
+
+ spin_lock_irqsave(&cwq->lock, flags);
+ list_del_init(&work->entry);
+ if (test_and_clear_bit(0, &work->pending))
+ goto out_unlock;
+ /* work is not pending. It has either executed or is in the
+ * process of executing */
+ ret = 0;
+ while (!test_bit(1, &work->pending)) {
+ DEFINE_WAIT(wait);
+ prepare_to_wait(&cwq->work_done, &wait,
+ TASK_UNINTERRUPTIBLE);
+ spin_unlock_irq(&cwq->lock);
+ schedule();
+ spin_lock_irq(&cwq->lock);
+ finish_wait(&cwq->work_done, &wait);
+ }
+
+ out_unlock:
+ spin_unlock_irqrestore(&cwq->lock, flags);
+
+ put_cpu();
+ return ret;
+}
+
+/**
+ * remove_delayed_work - try to remove a piece of queued work from a workqueue
+ * @wq: the workqueue to remove it from
+ * @work: the item of work queued on the workqueue
+ *
+ * Tries to remove the work before it is executed. Guarantees that
+ * when it returns either the work is removed or it has been executed.
+ * Requires process context since it may sleep if the work function is
+ * executing.
+ *
+ * Returns 1 if work was successfully removed without executing and 0
+ * if the work was executed.
+ */
+int remove_delayed_work(struct workqueue_struct *wq, struct work_struct *work)
+{
+ if (del_timer_sync(&work->timer))
+ return 1;
+ return remove_work(wq, work);
+}
+
+int unschedule_work(struct work_struct *work)
+{
+ return remove_work(keventd_wq, work);
+}
+
+int unschedule_delayed_work(struct work_struct *work)
+{
+ return remove_delayed_work(keventd_wq, work);
+}
+
EXPORT_SYMBOL_GPL(__create_workqueue);
EXPORT_SYMBOL_GPL(queue_work);
EXPORT_SYMBOL_GPL(queue_delayed_work);
EXPORT_SYMBOL_GPL(flush_workqueue);
EXPORT_SYMBOL_GPL(destroy_workqueue);
+EXPORT_SYMBOL_GPL(remove_work);
EXPORT_SYMBOL(schedule_work);
EXPORT_SYMBOL(schedule_delayed_work);
EXPORT_SYMBOL(schedule_delayed_work_on);
EXPORT_SYMBOL(flush_scheduled_work);
+EXPORT_SYMBOL(unschedule_work);
+EXPORT_SYMBOL(unschedule_delayed_work);
* Re: [PATCH] add unschedule_delayed_work to the workqueue API
2004-10-18 16:31 [PATCH] add unschedule_delayed_work to the workqueue API James Bottomley
@ 2004-10-18 21:25 ` Andrew Morton
2004-10-18 21:26 ` James Bottomley
From: Andrew Morton @ 2004-10-18 21:25 UTC (permalink / raw)
To: James Bottomley; +Cc: linux-kernel, mingo
James Bottomley <James.Bottomley@SteelEye.com> wrote:
>
> I'm in the process of moving some of our scsi timers which do more work
> than just a few lines of code into schedule_work() instead. The problem
> is that the workqueue API lacks the equivalent of del_timer_sync().
The usual way of doing this is:
cancel_delayed_work(...);
flush_workqueue(...);
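In code, that conventional pattern looks roughly like the following sketch (dwork and wq are illustrative names; for work queued on keventd via schedule_delayed_work(), flush_scheduled_work() plays the role of flush_workqueue()):

```c
/* The conventional synchronous cancel, as of 2.6.9: kill the timer
 * if it hasn't fired yet, then drain the queue so any handler that
 * already made it onto the workqueue has finished.  Needs process
 * context. */
cancel_delayed_work(&dwork);	/* stop the pending timer, if any */
flush_workqueue(wq);		/* wait for everything queued, incl. dwork */
```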
* Re: [PATCH] add unschedule_delayed_work to the workqueue API
2004-10-18 21:25 ` Andrew Morton
@ 2004-10-18 21:26 ` James Bottomley
2004-10-18 21:29 ` James Bottomley
From: James Bottomley @ 2004-10-18 21:26 UTC (permalink / raw)
To: Andrew Morton; +Cc: Linux Kernel, mingo
On Mon, 2004-10-18 at 16:25, Andrew Morton wrote:
> James Bottomley <James.Bottomley@SteelEye.com> wrote:
> >
> > I'm in the process of moving some of our scsi timers which do more work
> > than just a few lines of code into schedule_work() instead. The problem
> > is that the workqueue API lacks the equivalent of del_timer_sync().
>
> The usual way of doing this is:
>
> cancel_delayed_work(...);
That API doesn't seem to be in the vanilla kernel ... is it mm only?
> flush_workqueue(...);
Yes, but the flush_workqueue() has you waiting until every piece of work
on the workqueue has completed ... that creates an unacceptable delay in
the routine that wants it either cancelled or run.
James
* Re: [PATCH] add unschedule_delayed_work to the workqueue API
2004-10-18 21:26 ` James Bottomley
@ 2004-10-18 21:29 ` James Bottomley
2004-10-18 21:43 ` Andrew Morton
From: James Bottomley @ 2004-10-18 21:29 UTC (permalink / raw)
To: James Bottomley; +Cc: Andrew Morton, Linux Kernel, mingo
On Mon, 2004-10-18 at 16:26, James Bottomley wrote:
> On Mon, 2004-10-18 at 16:25, Andrew Morton wrote:
> > The usual way of doing this is:
> >
> > cancel_delayed_work(...);
OK, found it in the headers, sorry .. it's not synchronous, so it can't
really be used in most of the cases where we use del_timer_sync().
James
* Re: [PATCH] add unschedule_delayed_work to the workqueue API
2004-10-18 21:29 ` James Bottomley
@ 2004-10-18 21:43 ` Andrew Morton
2004-10-18 21:47 ` James Bottomley
From: Andrew Morton @ 2004-10-18 21:43 UTC (permalink / raw)
To: James Bottomley; +Cc: James.Bottomley, linux-kernel, mingo
James Bottomley <James.Bottomley@SteelEye.com> wrote:
>
> On Mon, 2004-10-18 at 16:26, James Bottomley wrote:
> > On Mon, 2004-10-18 at 16:25, Andrew Morton wrote:
> > > The usual way of doing this is:
> > >
> > > cancel_delayed_work(...);
>
> OK, found it in the headers, sorry .. it's not synchronous, so it can't
> really be used in most of the cases where we use del_timer_sync().
cancel_delayed_work() will tell you whether it successfully cancelled the
timer. If it didn't, you should run flush_workqueue() to wait on the final
handler. The combination of the two is synchronous.
The missing link is cancellation of a delayed work which re-adds itself and
where the calling code has no way of telling the handler to not re-arm
itself. There's a patch in -mm to add that function, but I don't like it.
That's cancel_rearming_delayed_work.patch.
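The return value makes the combination conditional, so the flush is only paid on the slow path. A hedged sketch (dwork is an illustrative name; assumes work scheduled on keventd):

```c
/* cancel_delayed_work() returns nonzero when it stopped the timer
 * before it fired.  Only when it fails -- i.e. the timer already
 * fired and the work is queued or running -- is the flush needed
 * to make the cancellation synchronous. */
if (!cancel_delayed_work(&dwork))
	flush_scheduled_work();	/* wait out the queued/running handler */
```

Note this does not cover the re-arming case described above: if the handler re-queues itself, the work can be pending again by the time the flush returns.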
* Re: [PATCH] add unschedule_delayed_work to the workqueue API
2004-10-18 21:43 ` Andrew Morton
@ 2004-10-18 21:47 ` James Bottomley
2004-10-18 22:02 ` Andrew Morton
From: James Bottomley @ 2004-10-18 21:47 UTC (permalink / raw)
To: Andrew Morton; +Cc: Linux Kernel, mingo
On Mon, 2004-10-18 at 16:43, Andrew Morton wrote:
> James Bottomley <James.Bottomley@SteelEye.com> wrote:
> >
> > On Mon, 2004-10-18 at 16:26, James Bottomley wrote:
> > > On Mon, 2004-10-18 at 16:25, Andrew Morton wrote:
> > > > The usual way of doing this is:
> > > >
> > > > cancel_delayed_work(...);
> >
> > OK, found it in the headers, sorry .. it's not synchronous, so it can't
> > really be used in most of the cases where we use del_timer_sync().
>
> cancel_delayed_work() will tell you whether it successfully cancelled the
> timer. If it didn't, you should run flush_workqueue() to wait on the final
> handler. The combination of the two is synchronous.
Right, but it potentially does too much work for my purposes. I want to
cancel the work if it's cancellable or wait for it if it's already
executing. I don't want to have to wait for all the work in the queue
just because the timer fired and it got added to the workqueue schedule.
> The missing link is cancellation of a delayed work which re-adds itself and
> where the calling code has no way of telling the handler to not re-arm
> itself. There's a patch in -mm to add that function, but I don't like it.
> That's cancel_rearming_delayed_work.patch.
James
* Re: [PATCH] add unschedule_delayed_work to the workqueue API
2004-10-18 21:47 ` James Bottomley
@ 2004-10-18 22:02 ` Andrew Morton
2004-10-18 22:15 ` James Bottomley
From: Andrew Morton @ 2004-10-18 22:02 UTC (permalink / raw)
To: James Bottomley; +Cc: linux-kernel, mingo
James Bottomley <James.Bottomley@SteelEye.com> wrote:
>
> > > OK, found it in the headers, sorry .. it's not synchronous, so it can't
> > > really be used in most of the cases where we use del_timer_sync().
> >
> > cancel_delayed_work() will tell you whether it successfully cancelled the
> > timer. If it didn't, you should run flush_workqueue() to wait on the final
> > handler. The combination of the two is synchronous.
>
> Right, but it potentially does too much work for my purposes.
Are you sure?
> I want to
> cancel the work if it's cancellable or wait for it if it's already
> executing. I don't want to have to wait for all the work in the queue
> just because the timer fired and it got added to the workqueue schedule.
The probability that the handler is running when you call
cancel_delayed_work() is surely very low. And the probability that there
is more than one thing pending in the queue at that time is also low.
Multiplying them both together, then multiplying that by the relative
expense of the handler makes me say "show me" ;)
* Re: [PATCH] add unschedule_delayed_work to the workqueue API
2004-10-18 22:02 ` Andrew Morton
@ 2004-10-18 22:15 ` James Bottomley
2004-10-18 22:57 ` Andrew Morton
From: James Bottomley @ 2004-10-18 22:15 UTC (permalink / raw)
To: Andrew Morton; +Cc: Linux Kernel, mingo
On Mon, 2004-10-18 at 17:02, Andrew Morton wrote:
> James Bottomley <James.Bottomley@SteelEye.com> wrote:
> >
> > > > OK, found it in the headers, sorry .. it's not synchronous, so it can't
> > > > really be used in most of the cases where we use del_timer_sync().
> > >
> > > cancel_delayed_work() will tell you whether it successfully cancelled the
> > > timer. If it didn't, you should run flush_workqueue() to wait on the final
> > > handler. The combination of the two is synchronous.
> >
> > Right, but it potentially does too much work for my purposes.
>
> Are you sure?
I'm positive the potential is there.
> > I want to
> > cancel the work if it's cancellable or wait for it if it's already
> > executing. I don't want to have to wait for all the work in the queue
> > just because the timer fired and it got added to the workqueue schedule.
>
> The probability that the handler is running when you call
> cancel_delayed_work() is surely very low. And the probability that there
> is more than one thing pending in the queue at that time is also low.
> Multiplying them both together, then multiplying that by the relative
> expense of the handler makes me say "show me" ;)
OK. In the current code, domain validation is done from the workqueue
interface. This can take several seconds per target to complete. Why
should I have to wait this extra time? As I move other SCSI daemon
threads to being workqueue items, these times rise.
However, now there's a worse problem. If I want to cancel a piece of
work synchronously, flush_scheduled_work() makes me dependent on the
execution of all the prior pieces of work. If some of them are related,
like Domain Validation and device unlocking say, I have to now be
extremely careful that the place I cancel and flush from isn't likely to
deadlock with any other work scheduled on the device. This makes it a
hard to use interface. By contrast, the proposed patch will *only* wait
if the item of work is currently executing. This is a (OK reasonably
given the aic del_timer_sync() issues) well understood deadlock
problem---the main point being I now don't have to consider any of the
other work that might be queued.
James
* Re: [PATCH] add unschedule_delayed_work to the workqueue API
2004-10-18 22:15 ` James Bottomley
@ 2004-10-18 22:57 ` Andrew Morton
2004-10-18 23:24 ` James Bottomley
From: Andrew Morton @ 2004-10-18 22:57 UTC (permalink / raw)
To: James Bottomley; +Cc: linux-kernel, mingo
James Bottomley <James.Bottomley@SteelEye.com> wrote:
>
> >
> > The probability that the handler is running when you call
> > cancel_delayed_work() is surely very low. And the probability that there
> > is more than one thing pending in the queue at that time is also low.
> > Multiplying them both together, then multiplying that by the relative
> > expense of the handler makes me say "show me" ;)
>
> OK. In the current code, domain validation is done from the workqueue
> interface. This can take several seconds per target to complete. Why
> should I have to wait this extra time. As I move other SCSI daemon
> threads to being work queue items, these times rise.
You mean the scsi code wants to schedule a work some time in the future and
that when it executes, the handler will then run for several seconds?
If so, it sounds like this is the wrong mechanism to use. We certainly
don't want keventd gone to lunch for that long, and even if the kernel
threads were privately created by the scsi code, you'll have to create many
such threads to get any sort of parallelism across many disks. I suspect
I'm missing something.
> However, now there's a worse problem. If I want to cancel a piece of
> work synchronously, flush_scheduled_work() makes me dependent on the
> execution of all the prior pieces of work. If some of them are related,
> like Domain Validation and device unlocking say, I have to now be
> extremely careful that the place I cancel and flush from isn't likely to
> deadlock with any other work scheduled on the device. This makes it a
> hard to use interface.
Well yes, the caller wouldn't want to be holding any locks which cause
currently-queued works to block.
> By contrast, the proposed patch will *only* wait
> if the item of work is currently executing. This is a (OK reasonably
> given the aic del_timer_sync() issues) well understood deadlock
> problem---the main point being I now don't have to consider any of the
> other work that might be queued.
OK.
* Re: [PATCH] add unschedule_delayed_work to the workqueue API
2004-10-18 22:57 ` Andrew Morton
@ 2004-10-18 23:24 ` James Bottomley
From: James Bottomley @ 2004-10-18 23:24 UTC (permalink / raw)
To: Andrew Morton; +Cc: Linux Kernel, mingo
On Mon, 2004-10-18 at 17:57, Andrew Morton wrote:
> James Bottomley <James.Bottomley@SteelEye.com> wrote:
> You mean the scsi code wants to schedule a work some time in the future and
> that when it executes, the handler will then run for several seconds?
>
> If so, it sounds like this is the wrong mechanism to use. We certainly
> don't want keventd gone to lunch for that long, and even if the kernel
> threads were privately created by the scsi code, you'll have to create many
> such threads to get any sort of parallelism across many disks. I suspect
> I'm missing something.
There is a case where this can happen, yes, but I admit it's not the
common one ... most DVs are kicked off without a thread from user context
(either in the initial scan or by user request). The common case where
we use a workqueue is if an asynchronous event that alters the transport
agreement comes in (mainly a bus reset) via an interrupt.
> > However, now there's a worse problem. If I want to cancel a piece of
> > work synchronously, flush_scheduled_work() makes me dependent on the
> > execution of all the prior pieces of work. If some of them are related,
> > like Domain Validation and device unlocking say, I have to now be
> > extremely careful that the place I cancel and flush from isn't likely to
> > deadlock with any other work scheduled on the device. This makes it a
> > hard to use interface.
>
> Well yes, the caller wouldn't want to be holding any locks which cause
> currently-queued works to block.
It's not just locks (hopefully there shouldn't be any of these). It's
also device states.
For instance, DV will attempt to quiesce the device (i.e. wait for its
currently outstanding commands to complete). Thus, I can't use the
cancel/flush method in any code path (such as may be called from error
handling) which could itself be part of the completion path.
This dependence is caused by having to wait for everything that could
potentially be on the workqueue to complete. I'd really prefer an API
that didn't have this potentially nasty entanglement, which is why I
coded mine to only do the wait if the piece of work being removed was in
the process of executing, but not otherwise.
> > By contrast, the proposed patch will *only* wait
> > if the item of work is currently executing. This is a (OK reasonably
> > given the aic del_timer_sync() issues) well understood deadlock
> > problem---the main point being I now don't have to consider any of the
> > other work that might be queued.
>
> OK.
James