* workqueue question.
From: Ben Greear @ 2011-06-28 18:56 UTC
To: Linux Kernel Mailing List
Hello!
Is it OK to call INIT_WORK(&foo, bar)
if we are currently being called by the work-queue
using foo?
Also, is it valid to free the memory containing foo
in a workqueue callback?
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com

* Re: workqueue question.
From: Tejun Heo @ 2011-06-29 8:43 UTC
To: Ben Greear; +Cc: Linux Kernel Mailing List
Hello,
On Tue, Jun 28, 2011 at 11:56:39AM -0700, Ben Greear wrote:
> Is it OK to call INIT_WORK(&foo, bar)
> if we are currently being called by the work-queue
> using foo?
Yes, but if flush_work*() races with it, flushing can finish before
execution is complete.
> Also, is it valid to free the memory containing foo
> in a workqueue callback?
Yeap.
--
tejun

* Re: workqueue question.
From: Ben Greear @ 2011-06-29 16:02 UTC
To: Tejun Heo; +Cc: Linux Kernel Mailing List
On 06/29/2011 01:43 AM, Tejun Heo wrote:
> Hello,
>
> On Tue, Jun 28, 2011 at 11:56:39AM -0700, Ben Greear wrote:
>> Is it OK to call INIT_WORK(&foo, bar)
>> if we are currently being called by the work-queue
>> using foo?
>
> Yes, but if flush_work*() races with it, flushing can finish before
> execution is complete.
It appears that the code just wants to (re)add itself to the
work queue with a different callback method:
static void rpc_final_put_task(struct rpc_task *task,
		struct workqueue_struct *q)
{
	if (q != NULL) {
		INIT_WORK(&task->u.tk_work, rpc_async_release);
		queue_work(q, &task->u.tk_work);
	} else
		rpc_free_task(task);
}
My debugging leads me to believe that rpc_async_release()
is (very rarely) called on a task object that has already been logically
freed.
Is there a better way to queue this up that might have less chance
of some strange race?
>
>> Also, is it valid to free the memory containing foo
>> in a workqueue callback?
>
> Yeap.
Is there a method that can be called from a workqueue callback
to verify that the item has not been re-added to the work-queue?
I tried doing a cancel, but that caused recursive locking issues.
I'd like to call this right before freeing the object and BUG_ON()
if the object is actually still on a work-queue.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com

* Re: workqueue question.
From: Tejun Heo @ 2011-06-30 10:00 UTC
To: Ben Greear; +Cc: Linux Kernel Mailing List
Hello,
On Wed, Jun 29, 2011 at 09:02:29AM -0700, Ben Greear wrote:
> On 06/29/2011 01:43 AM, Tejun Heo wrote:
> It appears that the code just wants to (re)add itself to the
> work queue with a different callback method:
>
> static void rpc_final_put_task(struct rpc_task *task,
> 		struct workqueue_struct *q)
> {
> 	if (q != NULL) {
> 		INIT_WORK(&task->u.tk_work, rpc_async_release);
> 		queue_work(q, &task->u.tk_work);
> 	} else
> 		rpc_free_task(task);
> }
Ummm... so, at the time of INIT_WORK(), the tk_work could be already
pending or running?
> My debugging leads me to believe that the rpc_async_release
> is (very rarely) called on a task object that has already been logically
> freed.
What do you mean "logically freed"? Do you mean the rpc_task struct
is freed twice?
> Is there a better way to queue this up that might have less chance
> of some strange race?
Why not just use a separate work item?
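
A minimal userspace sketch of the "separate work item" idea, assuming a toy single-threaded FIFO in place of a real workqueue. None of the names below (work_item, demo_queue, demo_task and so on) are kernel API; they are hypothetical stand-ins that only model the shape of the suggestion. Each embedded item keeps one fixed callback for its whole life, so nothing has to be re-initialised while it might still be in flight.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins; every name here is hypothetical, not kernel API. */
struct work_item {
	void (*func)(struct work_item *work);
	struct work_item *next;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Minimal single-threaded FIFO standing in for a workqueue. */
struct demo_queue {
	struct work_item *head, *tail;
};

static void demo_queue_work(struct demo_queue *q, struct work_item *w)
{
	assert(w->func != NULL);
	w->next = NULL;
	if (q->tail)
		q->tail->next = w;
	else
		q->head = w;
	q->tail = w;
}

/* Run queued items until the queue is empty; returns how many ran. */
static int demo_drain(struct demo_queue *q)
{
	int ran = 0;

	while (q->head) {
		struct work_item *w = q->head;
		void (*func)(struct work_item *) = w->func;

		q->head = w->next;
		if (!q->head)
			q->tail = NULL;
		func(w);	/* w may be freed by func; it is not touched again */
		ran++;
	}
	return ran;
}

struct demo_task {
	struct demo_queue *q;
	struct work_item run_work;	/* only ever runs demo_task_run */
	struct work_item release_work;	/* only ever runs demo_task_release */
};

static int demo_released;	/* counts releases so a caller can spot a double free */

static void demo_task_release(struct work_item *w)
{
	struct demo_task *task = container_of(w, struct demo_task, release_work);

	free(task);
	demo_released++;
}

static void demo_task_run(struct work_item *w)
{
	struct demo_task *task = container_of(w, struct demo_task, run_work);

	/* Hand off to the second item instead of re-initialising run_work. */
	demo_queue_work(task->q, &task->release_work);
}

static struct demo_task *demo_task_create(struct demo_queue *q)
{
	struct demo_task *task = malloc(sizeof(*task));

	if (!task)
		return NULL;
	task->q = q;
	task->run_work.func = demo_task_run;
	task->run_work.next = NULL;
	task->release_work.func = demo_task_release;
	task->release_work.next = NULL;
	return task;
}
```

Queueing run_work once and draining the queue runs two callbacks in order (run, then release) and frees the task exactly once. Whether this fits the sunrpc code is a separate question; the sketch only shows the pattern.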
> >>Also, is it valid to free the memory containing foo
> >>in a workqueue callback?
> >
> >Yeap.
>
> Is there a method that can be called from a workqueue callback
> to verify that the item has not been re-added to the work-queue?
Can you be a bit more specific? Are you saying that queue_work() and
INIT_WORK() may race?
> I tried doing a cancel, but that caused recursive locking issues.
>
> I'd like to call this right before freeing the object and BUG_ON()
> if the object is actually still on a work-queue.
That may be useful as a debugging feature but is inherently racy.
Nothing guarantees the work item won't be queued after BUG_ON() but
before actual freeing. The guarantee that the work item is no longer
in use should come from the wq user. There are a good number of use
cases where a work item frees itself or the containing data structure
and they all work fine.
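
A small userspace model of a work item that frees its own container, for readers who have not seen the pattern. All names (work_item, demo_task, demo_run_work and so on) are hypothetical stand-ins, not kernel API. The model assumes, as I understand the kernel's worker to do, that the runner copies the callback pointer first and never dereferences the item after the callback returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for a work item; hypothetical, not the kernel type. */
struct work_item {
	void (*func)(struct work_item *work);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_task {
	int id;
	struct work_item work;	/* embedded, like tk_work inside rpc_task */
};

static int demo_freed_count;	/* lets a caller observe that the free happened */

/* Callback that frees the object containing its own work item. */
static void demo_task_release(struct work_item *work)
{
	struct demo_task *task = container_of(work, struct demo_task, work);

	free(task);
	demo_freed_count++;
	/* 'work' and 'task' dangle from here on and must not be used. */
}

/* Model of the worker side: fetch the callback, call it, then let go. */
static void demo_run_work(struct work_item *work)
{
	void (*func)(struct work_item *) = work->func;

	assert(func != NULL);
	func(work);
	/* 'work' may already be freed, so it is not dereferenced again. */
}

static struct demo_task *demo_task_alloc(int id)
{
	struct demo_task *task = malloc(sizeof(*task));

	if (!task)
		return NULL;
	task->id = id;
	task->work.func = demo_task_release;
	return task;
}
```

The model is safe only because exactly one party (the callback) owns the final free and nobody queues the item again afterwards. That ownership rule is the guarantee that has to come from the user of the queue.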
Thanks.
--
tejun

* Re: workqueue question.
From: Ben Greear @ 2011-06-30 17:18 UTC
To: Tejun Heo; +Cc: Linux Kernel Mailing List
On 06/30/2011 03:00 AM, Tejun Heo wrote:
> Hello,
>
> On Wed, Jun 29, 2011 at 09:02:29AM -0700, Ben Greear wrote:
>> On 06/29/2011 01:43 AM, Tejun Heo wrote:
>> It appears that the code just wants to (re)add itself to the
>> work queue with a different callback method:
>>
>> static void rpc_final_put_task(struct rpc_task *task,
>> 		struct workqueue_struct *q)
>> {
>> 	if (q != NULL) {
>> 		INIT_WORK(&task->u.tk_work, rpc_async_release);
>> 		queue_work(q, &task->u.tk_work);
>> 	} else
>> 		rpc_free_task(task);
>> }
>
> Ummm... so, at the time of INIT_WORK(), the tk_work could be already
> pending or running?
This method is indirectly called by the worker-thread. The
trace below shows it taking the else branch, but I'm not
sure it always does so.
    __slab_free+0x57/0x150
    kfree+0x107/0x13a
    rpcb_map_release+0x3f/0x44 [sunrpc]
    rpc_release_calldata+0x12/0x14 [sunrpc]
    rpc_free_task+0x59/0x61 [sunrpc]
    rpc_final_put_task+0x82/0x8a [sunrpc]
    __rpc_execute+0x23c/0x24b [sunrpc]
    rpc_async_schedule+0x10/0x12 [sunrpc]
    process_one_work+0x230/0x41d
    worker_thread+0x133/0x217
    kthread+0x7d/0x85
    kernel_thread_helper+0x4/0x10
>> My debugging leads me to believe that the rpc_async_release
>> is (very rarely) called on a task object that has already been logically
>> freed.
>
> What do you mean "logically freed"? Do you mean the rpc_task struct
> is freed twice?
Yes, it seems so, though it's really just put back into a mempool
instead of being kfree()d.
>
>> Is there a better way to queue this up that might have less chance
>> of some strange race?
>
> Why not just use a separate work item?
No idea; this is from existing net/sunrpc/* code. If
you think that is a more proper way to do this logic, I can try that.
>>>> Also, is it valid to free the memory containing foo
>>>> in a workqueue callback?
>>>
>>> Yeap.
>>
>> Is there a method that can be called from a workqueue callback
>> to verify that the item has not been re-added to the work-queue?
>
> Can you be a bit more specific? Are you saying that queue_work() and
> INIT_WORK() may race?
No, I don't think that is racing. Basically, when I'm about
to logically free (put back into mempool) the task struct, I
would like to add a sanity check to make sure it's not currently
scheduled on a work queue. If it were, that would explain the
backtraces I was seeing from slub memory debugging logic and
I'd be closer to understanding the problem.
>> I tried doing a cancel, but that caused recursive locking issues.
>>
>> I'd like to call this right before freeing the object and BUG_ON()
>> if the object is actually still on a work-queue.
>
> That may be useful as a debugging feature but is inherently racy.
> Nothing guarantees the work item won't be queued after BUG_ON() but
> before actual freeing. The guarantee that the work item is no longer
> in use should come from the wq user. There are a good number of use
> cases where a work item frees itself or the containing data structure
> and they all work fine.
At this point I have no reason to believe the work-queues are buggy,
but due to state machines and callbacks and method pointers, it is
quite difficult to know the method flow in the rpc code. So an
extra sanity check might be quite useful. I'll try to code something
up for the work-queue logic when I get a chance.
Thanks,
Ben
>
> Thanks.
>
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com

* Re: workqueue question.
From: Tejun Heo @ 2011-07-01 16:37 UTC
To: Ben Greear; +Cc: Linux Kernel Mailing List
Hello,
On Thu, Jun 30, 2011 at 10:18:52AM -0700, Ben Greear wrote:
> >>Is there a method that can be called from a workqueue callback
> >>to verify that the item has not been re-added to the work-queue?
> >
> >Can you be a bit more specific? Are you saying that queue_work() and
> >INIT_WORK() may race?
>
> No, I don't think that is racing. Basically, when I'm about
> to logically free (put back into mempool) the task struct, I
> would like to add a sanity check to make sure it's not currently
> scheduled on a work queue. If it were, that would explain the
> backtraces I was seeing from slub memory debugging logic and
> I'd be closer to understanding the problem.
Ah, okay. While a work item is pending, work_pending() is always
true; however, whether a work item is currently being executed is a
bit more complicated. You'll need to implement a function which looks
similar to the following.
bool is_work_executing(work)
{
	for_each_gcwq_cpu(cpu) {
		gcwq = get_gcwq(cpu);
		lock gcwq;
		if (find_worker_executing_work(gcwq, work)) {
			unlock gcwq;
			return true;
		}
		unlock gcwq;
	}
	return false;
}
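
To make the pending-versus-executing distinction concrete, here is a single-threaded userspace model. It assumes the pending flag is cleared just before the callback starts, which is my reading of how the kernel behaves; every name (model_work, model_worker, model_run_one and so on) is hypothetical, and there is no locking because there is only one thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_worker;

/* Userspace model; names are hypothetical and only mirror the concepts. */
struct model_work {
	bool pending;			/* stands in for the kernel's pending bit */
	void (*func)(struct model_work *work);
	struct model_worker *worker;	/* set when queued, used by the callback */
	bool pending_in_callback;	/* recorded by the callback for inspection */
	bool executing_in_callback;
};

/* The single "worker": remembers what it is executing right now. */
struct model_worker {
	struct model_work *queued;	/* one-slot queue, enough for the model */
	struct model_work *current_work;	/* non-NULL only while a callback runs */
};

/* Returns false if the item was already pending, in the spirit of queue_work(). */
static bool model_queue_work(struct model_worker *wk, struct model_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	w->worker = wk;
	wk->queued = w;
	return true;
}

static bool model_work_pending(const struct model_work *w)
{
	return w->pending;
}

/* The extra lookup a "is it executing?" check needs: ask the worker. */
static bool model_work_executing(const struct model_worker *wk,
				 const struct model_work *w)
{
	return wk->current_work == w;
}

static void model_run_one(struct model_worker *wk)
{
	struct model_work *w = wk->queued;

	if (!w)
		return;
	assert(w->func != NULL);
	wk->queued = NULL;
	w->pending = false;	/* cleared before the callback starts */
	wk->current_work = w;
	w->func(w);
	wk->current_work = NULL;	/* only the worker is touched afterwards */
}

/* Callback that records what each check reports while it is running. */
static void model_callback(struct model_work *w)
{
	w->pending_in_callback = model_work_pending(w);
	w->executing_in_callback = model_work_executing(w->worker, w);
}
```

Inside the callback the pending check reports false while the executing check reports true, so a sanity check built on the pending flag alone would miss an item that is mid-execution.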
But I would recommend watching the workqueue tracepoints first.
$ grep workqueue /sys/kernel/debug/tracing/available_events
workqueue:workqueue_queue_work
workqueue:workqueue_activate_work
workqueue:workqueue_execute_start
workqueue:workqueue_execute_end
You should be able to tell which work is doing what on which CPU.
Thanks.
--
tejun