* [PATCH] workqueue: remove the del_timer_sync()s in maybe_create_worker()
@ 2014-07-13 15:31 Lai Jiangshan
2014-07-14 8:13 ` [PATCH 0/1 V2] workqueue: a tiny fix for the mayday timer Lai Jiangshan
0 siblings, 1 reply; 9+ messages in thread
From: Lai Jiangshan @ 2014-07-13 15:31 UTC (permalink / raw)
To: linux-kernel; +Cc: Tejun Heo, Lai Jiangshan
It is said in the document that the timer which is being
deleted by del_timer_sync() should not be restarted:
Synchronization rules: Callers must prevent restarting of
the timer, otherwise this function is meaningless.
Repeating timer may cause the del_timer_sync() spin longer,
or even spin forever in very very very very extreme condition.
It is not considered a bug for me, but it disobeys the document.
To fix it, we need:
1) don't requeue the mayday timer when !need_to_create_worker()
it is a preparation/cleaup step for the fix. Otherwise the timer
will repeat forever if we only do the 2).
2) remove the del_timer_sync()s in maybe_create_worker()
Fix the problem. It is not enough if we only do the 1),
need_to_create_worker() would be probably true when the time
del_timer_sync() is called, the timer is still repeating.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
kernel/workqueue.c | 6 ++----
1 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 35974ac..593b9bc 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1866,12 +1866,12 @@ static void pool_mayday_timeout(unsigned long __pool)
*/
list_for_each_entry(work, &pool->worklist, entry)
send_mayday(work);
+
+ mod_timer(&pool->mayday_timer, jiffies + MAYDAY_INTERVAL);
}
spin_unlock(&pool->lock);
spin_unlock_irq(&wq_mayday_lock);
-
- mod_timer(&pool->mayday_timer, jiffies + MAYDAY_INTERVAL);
}
/**
@@ -1913,7 +1913,6 @@ restart:
worker = create_worker(pool);
if (worker) {
- del_timer_sync(&pool->mayday_timer);
spin_lock_irq(&pool->lock);
start_worker(worker);
if (WARN_ON_ONCE(need_to_create_worker(pool)))
@@ -1931,7 +1930,6 @@ restart:
break;
}
- del_timer_sync(&pool->mayday_timer);
spin_lock_irq(&pool->lock);
if (need_to_create_worker(pool))
goto restart;
--
1.7.7.6
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 0/1 V2] workqueue: a tiny fix for the mayday timer
2014-07-13 15:31 [PATCH] workqueue: remove the del_timer_sync()s in maybe_create_worker() Lai Jiangshan
@ 2014-07-14 8:13 ` Lai Jiangshan
2014-07-14 8:13 ` [PATCH 1/1 V2] workqueue: remove the del_timer_sync()s in maybe_create_worker() Lai Jiangshan
0 siblings, 1 reply; 9+ messages in thread
From: Lai Jiangshan @ 2014-07-14 8:13 UTC (permalink / raw)
To: linux-kernel; +Cc: Tejun Heo, Lai Jiangshan
Hi, TJ
The del_timer_sync()s for the mayday timer in maybe_create_worker()
do not meet the requirement of the document of the del_timer_sync().
V1 patch incorrectly stops the timer when !need_to_create_worker(),
but no one can restart it when needed. In V2, we stops the
timer only when manager is not working.
Sorry for the incorrect V1 due to a general excuse: World-Cup-2014.
Another patch "[PATCH 1/1 V2] workqueue: unfold start_worker() into create_worker()"
depends on this patch. That patch still coincides with this patch even
this patch is updated.
Thanks,
Lai
Lai Jiangshan (1):
workqueue: remove the del_timer_sync()s in maybe_create_worker()
kernel/workqueue.c | 7 +++----
1 files changed, 3 insertions(+), 4 deletions(-)
--
1.7.4.4
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/1 V2] workqueue: remove the del_timer_sync()s in maybe_create_worker()
2014-07-14 8:13 ` [PATCH 0/1 V2] workqueue: a tiny fix for the mayday timer Lai Jiangshan
@ 2014-07-14 8:13 ` Lai Jiangshan
2014-07-14 14:27 ` Tejun Heo
0 siblings, 1 reply; 9+ messages in thread
From: Lai Jiangshan @ 2014-07-14 8:13 UTC (permalink / raw)
To: linux-kernel; +Cc: Tejun Heo, Lai Jiangshan
It is said in the document that the timer which is being
deleted by del_timer_sync() should not be restarted:
Synchronization rules: Callers must prevent restarting of
the timer, otherwise this function is meaningless.
Repeating timer may cause the del_timer_sync() spin longer,
or even spin forever in very very very very extreme condition.
It is not considered a bug for me, but it disobeys the document.
To fix it, we need:
1) don't requeue the mayday timer unless the manager is working
it is a preparation/cleanup step for the fix. Otherwise the timer
will repeat forever if we only do the 2).
2) remove the del_timer_sync()s in maybe_create_worker()
the timer is still repeating, we should not use del_timer_sync(),
let the timer dismiss itself.
Change from V1:
V1 incorrectly stops the timer when !need_to_create_worker(),
but no one can restart it when needed. In V2, we stops the
timer only when manager is not working.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
kernel/workqueue.c | 7 +++----
1 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 338d418..9c86a64 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1864,10 +1864,11 @@ static void pool_mayday_timeout(unsigned long __pool)
send_mayday(work);
}
+ if (mutex_is_locked(&pool->manager_arb))
+ mod_timer(&pool->mayday_timer, jiffies + MAYDAY_INTERVAL);
+
spin_unlock(&pool->lock);
spin_unlock_irq(&wq_mayday_lock);
-
- mod_timer(&pool->mayday_timer, jiffies + MAYDAY_INTERVAL);
}
/**
@@ -1909,7 +1910,6 @@ restart:
worker = create_worker(pool);
if (worker) {
- del_timer_sync(&pool->mayday_timer);
spin_lock_irq(&pool->lock);
start_worker(worker);
if (WARN_ON_ONCE(need_to_create_worker(pool)))
@@ -1926,7 +1926,6 @@ restart:
break;
}
- del_timer_sync(&pool->mayday_timer);
spin_lock_irq(&pool->lock);
if (need_to_create_worker(pool))
goto restart;
--
1.7.4.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/1 V2] workqueue: remove the del_timer_sync()s in maybe_create_worker()
2014-07-14 8:13 ` [PATCH 1/1 V2] workqueue: remove the del_timer_sync()s in maybe_create_worker() Lai Jiangshan
@ 2014-07-14 14:27 ` Tejun Heo
2014-07-14 15:33 ` Thomas Gleixner
0 siblings, 1 reply; 9+ messages in thread
From: Tejun Heo @ 2014-07-14 14:27 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: linux-kernel, Thomas Gleixner
Hello,
On Mon, Jul 14, 2014 at 04:13:21PM +0800, Lai Jiangshan wrote:
> It is said in the document that the timer which is being
> deleted by del_timer_sync() should not be restarted:
> Synchronization rules: Callers must prevent restarting of
> the timer, otherwise this function is meaningless.
>
> Repeating timer may cause the del_timer_sync() spin longer,
> or even spin forever in very very very very extreme condition.
I'm fairly sure del_timer_sync() can delete self-requeueing timers.
The implementation busy-waits if the queued timer is the currently
executing one and dequeues only while the timer isn't running which
should be able to handle self-requeueing ones just fine. Thomas,
del_timer_sync() can reliably delete self-requeueing ones, right?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/1 V2] workqueue: remove the del_timer_sync()s in maybe_create_worker()
2014-07-14 14:27 ` Tejun Heo
@ 2014-07-14 15:33 ` Thomas Gleixner
2014-07-14 21:18 ` Lai Jiangshan
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2014-07-14 15:33 UTC (permalink / raw)
To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel
On Mon, 14 Jul 2014, Tejun Heo wrote:
> Hello,
>
> On Mon, Jul 14, 2014 at 04:13:21PM +0800, Lai Jiangshan wrote:
> > It is said in the document that the timer which is being
> > deleted by del_timer_sync() should not be restarted:
> > Synchronization rules: Callers must prevent restarting of
> > the timer, otherwise this function is meaningless.
> >
> > Repeating timer may cause the del_timer_sync() spin longer,
> > or even spin forever in very very very very extreme condition.
>
> I'm fairly sure del_timer_sync() can delete self-requeueing timers.
> The implementation busy-waits if the queued timer is the currently
> executing one and dequeues only while the timer isn't running which
> should be able to handle self-requeueing ones just fine. Thomas,
> del_timer_sync() can reliably delete self-requeueing ones, right?
Yes. If the timer callback is running on the other cpu, then it waits
for the callback to finish before checking whether the timer is
enqueued or not.
Thanks,
tglx
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/1 V2] workqueue: remove the del_timer_sync()s in maybe_create_worker()
2014-07-14 15:33 ` Thomas Gleixner
@ 2014-07-14 21:18 ` Lai Jiangshan
2014-07-14 21:42 ` Thomas Gleixner
0 siblings, 1 reply; 9+ messages in thread
From: Lai Jiangshan @ 2014-07-14 21:18 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Tejun Heo, linux-kernel
On 07/14/2014 11:33 PM, Thomas Gleixner wrote:
> On Mon, 14 Jul 2014, Tejun Heo wrote:
>
>> Hello,
>>
>> On Mon, Jul 14, 2014 at 04:13:21PM +0800, Lai Jiangshan wrote:
>>> It is said in the document that the timer which is being
>>> deleted by del_timer_sync() should not be restarted:
>>> Synchronization rules: Callers must prevent restarting of
>>> the timer, otherwise this function is meaningless.
>>>
>>> Repeating timer may cause the del_timer_sync() spin longer,
>>> or even spin forever in very very very very extreme condition.
>>
>> I'm fairly sure del_timer_sync() can delete self-requeueing timers.
>> The implementation busy-waits if the queued timer is the currently
>> executing one and dequeues only while the timer isn't running which
>> should be able to handle self-requeueing ones just fine. Thomas,
>> del_timer_sync() can reliably delete self-requeueing ones, right?
>
> Yes.
The comments of the del_timer_sync() needs to be updated
if I did not misunderstood?
> If the timer callback is running on the other cpu, then it waits
> for the callback to finish before checking whether the timer is
> enqueued or not.
The syncer may be interrupted here, after it comes back, the timer
may be running again (and maybe again and again).
Current code is an acceptable ostrich algorithm, but the documents
needs to be updated. Or re-implement it as the way as cancel_work_sync()
which has similar but stronger semantics, it disables requeueing when
syncing.
>
> Thanks,
>
> tglx
>
>
> .
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/1 V2] workqueue: remove the del_timer_sync()s in maybe_create_worker()
2014-07-14 21:18 ` Lai Jiangshan
@ 2014-07-14 21:42 ` Thomas Gleixner
2014-07-15 0:22 ` Lai Jiangshan
2014-07-16 1:24 ` Lai Jiangshan
0 siblings, 2 replies; 9+ messages in thread
From: Thomas Gleixner @ 2014-07-14 21:42 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: Tejun Heo, linux-kernel
On Tue, 15 Jul 2014, Lai Jiangshan wrote:
> On 07/14/2014 11:33 PM, Thomas Gleixner wrote:
> > On Mon, 14 Jul 2014, Tejun Heo wrote:
> >
> >> Hello,
> >>
> >> On Mon, Jul 14, 2014 at 04:13:21PM +0800, Lai Jiangshan wrote:
> >>> It is said in the document that the timer which is being
> >>> deleted by del_timer_sync() should not be restarted:
> >>> Synchronization rules: Callers must prevent restarting of
> >>> the timer, otherwise this function is meaningless.
> >>>
> >>> Repeating timer may cause the del_timer_sync() spin longer,
> >>> or even spin forever in very very very very extreme condition.
> >>
> >> I'm fairly sure del_timer_sync() can delete self-requeueing timers.
> >> The implementation busy-waits if the queued timer is the currently
> >> executing one and dequeues only while the timer isn't running which
> >> should be able to handle self-requeueing ones just fine. Thomas,
> >> del_timer_sync() can reliably delete self-requeueing ones, right?
> >
> > Yes.
>
> The comments of the del_timer_sync() needs to be updated
> if I did not misunderstood?
>
> > If the timer callback is running on the other cpu, then it waits
> > for the callback to finish before checking whether the timer is
> > enqueued or not.
>
> The syncer may be interrupted here, after it comes back, the timer
> may be running again (and maybe again and again).
No. The del_timer_sync() code holds the base lock with interrupts
disabled. So it can't be interrupted.
Thanks,
tglx
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/1 V2] workqueue: remove the del_timer_sync()s in maybe_create_worker()
2014-07-14 21:42 ` Thomas Gleixner
@ 2014-07-15 0:22 ` Lai Jiangshan
2014-07-16 1:24 ` Lai Jiangshan
1 sibling, 0 replies; 9+ messages in thread
From: Lai Jiangshan @ 2014-07-15 0:22 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Tejun Heo, linux-kernel
On 07/15/2014 05:42 AM, Thomas Gleixner wrote:
> On Tue, 15 Jul 2014, Lai Jiangshan wrote:
>> On 07/14/2014 11:33 PM, Thomas Gleixner wrote:
>>> On Mon, 14 Jul 2014, Tejun Heo wrote:
>>>
>>>> Hello,
>>>>
>>>> On Mon, Jul 14, 2014 at 04:13:21PM +0800, Lai Jiangshan wrote:
>>>>> It is said in the document that the timer which is being
>>>>> deleted by del_timer_sync() should not be restarted:
>>>>> Synchronization rules: Callers must prevent restarting of
>>>>> the timer, otherwise this function is meaningless.
>>>>>
>>>>> Repeating timer may cause the del_timer_sync() spin longer,
>>>>> or even spin forever in very very very very extreme condition.
>>>>
>>>> I'm fairly sure del_timer_sync() can delete self-requeueing timers.
>>>> The implementation busy-waits if the queued timer is the currently
>>>> executing one and dequeues only while the timer isn't running which
>>>> should be able to handle self-requeueing ones just fine. Thomas,
>>>> del_timer_sync() can reliably delete self-requeueing ones, right?
>>>
>>> Yes.
>>
>> The comments of the del_timer_sync() needs to be updated
>> if I did not misunderstood?
>>
>>> If the timer callback is running on the other cpu, then it waits
>>> for the callback to finish before checking whether the timer is
>>> enqueued or not.
>>
>> The syncer may be interrupted here, after it comes back, the timer
>> may be running again (and maybe again and again).
>
> No. The del_timer_sync() code holds the base lock with interrupts
> disabled. So it can't be interrupted.
>
WARN_ON(in_irq() && !tbase_get_irqsafe(timer->base));
for (;;) {
int ret = try_to_del_timer_sync(timer);
if (ret >= 0)
return ret;
cpu_relax();
How about when it is interrupted here?
}
> Thanks,
>
> tglx
>
>
> .
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/1 V2] workqueue: remove the del_timer_sync()s in maybe_create_worker()
2014-07-14 21:42 ` Thomas Gleixner
2014-07-15 0:22 ` Lai Jiangshan
@ 2014-07-16 1:24 ` Lai Jiangshan
1 sibling, 0 replies; 9+ messages in thread
From: Lai Jiangshan @ 2014-07-16 1:24 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Tejun Heo, linux-kernel
On 07/15/2014 05:42 AM, Thomas Gleixner wrote:
> On Tue, 15 Jul 2014, Lai Jiangshan wrote:
>> On 07/14/2014 11:33 PM, Thomas Gleixner wrote:
>>> On Mon, 14 Jul 2014, Tejun Heo wrote:
>>>
>>>> Hello,
>>>>
>>>> On Mon, Jul 14, 2014 at 04:13:21PM +0800, Lai Jiangshan wrote:
>>>>> It is said in the document that the timer which is being
>>>>> deleted by del_timer_sync() should not be restarted:
>>>>> Synchronization rules: Callers must prevent restarting of
>>>>> the timer, otherwise this function is meaningless.
>>>>>
>>>>> Repeating timer may cause the del_timer_sync() spin longer,
>>>>> or even spin forever in very very very very extreme condition.
>>>>
>>>> I'm fairly sure del_timer_sync() can delete self-requeueing timers.
>>>> The implementation busy-waits if the queued timer is the currently
>>>> executing one and dequeues only while the timer isn't running which
>>>> should be able to handle self-requeueing ones just fine. Thomas,
>>>> del_timer_sync() can reliably delete self-requeueing ones, right?
>>>
>>> Yes.
>>
>> The comments of the del_timer_sync() needs to be updated
>> if I did not misunderstood?
>>
>>> If the timer callback is running on the other cpu, then it waits
>>> for the callback to finish before checking whether the timer is
>>> enqueued or not.
>>
>> The syncer may be interrupted here, after it comes back, the timer
>> may be running again (and maybe again and again).
>
> No. The del_timer_sync() code holds the base lock with interrupts
> disabled. So it can't be interrupted.
>
The del_timer_sync() waits via cpu_relax() without interrupts
disabled. Could you wipe out my concern?
>
> .
>
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-07-16 1:23 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-07-13 15:31 [PATCH] workqueue: remove the del_timer_sync()s in maybe_create_worker() Lai Jiangshan
2014-07-14 8:13 ` [PATCH 0/1 V2] workqueue: a tiny fix for the mayday timer Lai Jiangshan
2014-07-14 8:13 ` [PATCH 1/1 V2] workqueue: remove the del_timer_sync()s in maybe_create_worker() Lai Jiangshan
2014-07-14 14:27 ` Tejun Heo
2014-07-14 15:33 ` Thomas Gleixner
2014-07-14 21:18 ` Lai Jiangshan
2014-07-14 21:42 ` Thomas Gleixner
2014-07-15 0:22 ` Lai Jiangshan
2014-07-16 1:24 ` Lai Jiangshan
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).