* workqueue question.
From: Ben Greear @ 2011-06-28 18:56 UTC
To: Linux Kernel Mailing List

Hello!

Is it OK to call INIT_WORK(&foo, bar)
if we are currently being called by the work-queue
using foo?

Also, is it valid to free the memory containing foo
in a workqueue callback?

Thanks,
Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: workqueue question.
From: Tejun Heo @ 2011-06-29 8:43 UTC
To: Ben Greear; +Cc: Linux Kernel Mailing List

Hello,

On Tue, Jun 28, 2011 at 11:56:39AM -0700, Ben Greear wrote:
> Is it OK to call INIT_WORK(&foo, bar)
> if we are currently being called by the work-queue
> using foo?

Yes, but if flush_work*() races with it, flushing can finish before
execution is complete.

> Also, is it valid to free the memory containing foo
> in a workqueue callback?

Yeap.

--
tejun

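The answer above, that a callback may free the memory containing its own work item, can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct work_item`, `struct work_queue`, `queue_item()`, `run_queue()` and `struct task` are hypothetical stand-ins for `struct work_struct`, a workqueue, `queue_work()`, the worker loop and an object such as `struct rpc_task`. The model assumes the worker unlinks the item and copies the callback pointer before the call and never dereferences the item afterwards; under that assumption, freeing inside the callback is safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel types; names are illustrative only. */
struct work_item;
typedef void (*work_fn)(struct work_item *);

struct work_item {
	work_fn fn;
	struct work_item *next;
};

struct work_queue {
	struct work_item *head, *tail;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void init_work(struct work_item *w, work_fn fn)
{
	w->fn = fn;
	w->next = NULL;
}

static void queue_item(struct work_queue *q, struct work_item *w)
{
	w->next = NULL;
	if (q->tail)
		q->tail->next = w;
	else
		q->head = w;
	q->tail = w;
}

/* Run everything queued; returns the number of callbacks executed. */
static int run_queue(struct work_queue *q)
{
	int ran = 0;

	while (q->head) {
		struct work_item *w = q->head;
		work_fn fn = w->fn;	/* copy before the call */

		q->head = w->next;	/* unlink before the call */
		if (!q->head)
			q->tail = NULL;
		fn(w);			/* w may be freed in here ... */
		ran++;			/* ... so it is not touched again */
	}
	return ran;
}

/* Hypothetical object embedding a work item. */
struct task {
	int id;
	struct work_item work;
};

static int tasks_freed;

static void task_release(struct work_item *w)
{
	struct task *t = container_of(w, struct task, work);

	free(t);	/* frees the memory that contains w */
	tasks_freed++;
}

/* Queue n self-freeing tasks, drain the queue, report how many were freed. */
int demo_self_free(int n)
{
	struct work_queue q = { NULL, NULL };
	int i;

	tasks_freed = 0;
	for (i = 0; i < n; i++) {
		struct task *t = malloc(sizeof(*t));

		if (!t)
			break;
		t->id = i;
		init_work(&t->work, task_release);
		queue_item(&q, &t->work);
	}
	run_queue(&q);
	return tasks_freed;
}
```

Calling `demo_self_free(3)` queues three tasks and lets each one free itself from its own callback. The model is single-threaded, so it says nothing about the flush_work*() race mentioned above.
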
* Re: workqueue question.
From: Ben Greear @ 2011-06-29 16:02 UTC
To: Tejun Heo; +Cc: Linux Kernel Mailing List

On 06/29/2011 01:43 AM, Tejun Heo wrote:
> Hello,
>
> On Tue, Jun 28, 2011 at 11:56:39AM -0700, Ben Greear wrote:
>> Is it OK to call INIT_WORK(&foo, bar)
>> if we are currently being called by the work-queue
>> using foo?
>
> Yes, but if flush_work*() races with it, flushing can finish before
> execution is complete.

It appears that the code just wants to (re)add itself to the
work queue with a different callback method:

static void rpc_final_put_task(struct rpc_task *task,
		struct workqueue_struct *q)
{
	if (q != NULL) {
		INIT_WORK(&task->u.tk_work, rpc_async_release);
		queue_work(q, &task->u.tk_work);
	} else
		rpc_free_task(task);
}

My debugging leads me to believe that the rpc_async_release
is (very rarely) called on a task object that has already been logically
freed.

Is there a better way to queue this up that might have less chance
of some strange race?

>
>> Also, is it valid to free the memory containing foo
>> in a workqueue callback?
>
> Yeap.

Is there a method that can be called from a workqueue callback
to verify that the item has not been re-added to the work-queue?

I tried doing a cancel, but that caused recursive locking issues.

I'd like to call this right before freeing the object and BUG_ON()
if the object is actually still on a work-queue.

Thanks,
Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: workqueue question.
From: Tejun Heo @ 2011-06-30 10:00 UTC
To: Ben Greear; +Cc: Linux Kernel Mailing List

Hello,

On Wed, Jun 29, 2011 at 09:02:29AM -0700, Ben Greear wrote:
> On 06/29/2011 01:43 AM, Tejun Heo wrote:
> It appears that the code just wants to (re)add itself to the
> work queue with a different callback method:
>
> static void rpc_final_put_task(struct rpc_task *task,
> 		struct workqueue_struct *q)
> {
> 	if (q != NULL) {
> 		INIT_WORK(&task->u.tk_work, rpc_async_release);
> 		queue_work(q, &task->u.tk_work);
> 	} else
> 		rpc_free_task(task);
> }

Ummm... so, at the time of INIT_WORK(), the tk_work could be already
pending or running?

> My debugging leads me to believe that the rpc_async_release
> is (very rarely) called on a task object that has already been logically
> freed.

What do you mean "logically freed"?  Do you mean the rpc_task struct
is freed twice?

> Is there a better way to queue this up that might have less chance
> of some strange race?

Why not just use a separate work item?

> >>Also, is it valid to free the memory containing foo
> >>in a workqueue callback?
> >
> >Yeap.
>
> Is there a method that can be called from a workqueue callback
> to verify that the item has not been re-added to the work-queue?

Can you be a bit more specific?  Are you saying that queue_work() and
INIT_WORK() may race?

> I tried doing a cancel, but that caused recursive locking issues.
>
> I'd like to call this right before freeing the object and BUG_ON()
> if the object is actually still on a work-queue.

That may be useful as a debugging feature but is inherently racy.
Nothing guarantees the work item won't be queued after BUG_ON() but
before actual freeing.  The guarantee that the work item is no longer
in use should come from the wq user.  There are a good number of use
cases where a work item frees itself or the containing data structure
and they all work fine.

Thanks.

--
tejun

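The "separate work item" suggestion above could look roughly like the following. It is a userspace sketch with hypothetical names, not a patch against net/sunrpc: the task embeds two items, `run_work` and `release_work`, each set up once at creation, so nothing re-initializes an item while its own callback is executing. How this would map onto `struct rpc_task` and `tk_work` is not worked out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins; none of these names are kernel API. */
struct work_item;
typedef void (*work_fn)(struct work_item *);

struct work_item {
	work_fn fn;
	struct work_item *next;
};

struct work_queue {
	struct work_item *head, *tail;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void queue_item(struct work_queue *q, struct work_item *w)
{
	w->next = NULL;
	if (q->tail)
		q->tail->next = w;
	else
		q->head = w;
	q->tail = w;
}

static void run_queue(struct work_queue *q)
{
	while (q->head) {
		struct work_item *w = q->head;
		work_fn fn = w->fn;

		q->head = w->next;
		if (!q->head)
			q->tail = NULL;
		fn(w);	/* w is not touched after this call */
	}
}

struct task {
	struct work_queue *q;
	int ran;
	struct work_item run_work;	/* executes the task */
	struct work_item release_work;	/* frees the task */
};

static int tasks_released;

static void task_release(struct work_item *w)
{
	struct task *t = container_of(w, struct task, release_work);

	assert(t->ran);		/* release only ever follows run */
	free(t);
	tasks_released++;
}

static void task_run(struct work_item *w)
{
	struct task *t = container_of(w, struct task, run_work);

	t->ran = 1;
	/* Final put: queue the second item; run_work is left untouched. */
	queue_item(t->q, &t->release_work);
}

/* Create n tasks, run them, and report how many were released. */
int demo_separate_items(int n)
{
	struct work_queue q = { NULL, NULL };
	int i;

	tasks_released = 0;
	for (i = 0; i < n; i++) {
		struct task *t = malloc(sizeof(*t));

		if (!t)
			break;
		t->q = &q;
		t->ran = 0;
		t->run_work.fn = task_run;
		t->release_work.fn = task_release;
		queue_item(&q, &t->run_work);
	}
	run_queue(&q);
	return tasks_released;
}
```

The cost of this layout is one extra work item per task. In exchange, each item has a single, fixed callback for its whole lifetime.
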
* Re: workqueue question.
From: Ben Greear @ 2011-06-30 17:18 UTC
To: Tejun Heo; +Cc: Linux Kernel Mailing List

On 06/30/2011 03:00 AM, Tejun Heo wrote:
> Hello,
>
> On Wed, Jun 29, 2011 at 09:02:29AM -0700, Ben Greear wrote:
>> On 06/29/2011 01:43 AM, Tejun Heo wrote:
>> It appears that the code just wants to (re)add itself to the
>> work queue with a different callback method:
>>
>> static void rpc_final_put_task(struct rpc_task *task,
>> 		struct workqueue_struct *q)
>> {
>> 	if (q != NULL) {
>> 		INIT_WORK(&task->u.tk_work, rpc_async_release);
>> 		queue_work(q, &task->u.tk_work);
>> 	} else
>> 		rpc_free_task(task);
>> }
>
> Ummm... so, at the time of INIT_WORK(), the tk_work could be already
> pending or running?

This method is indirectly called by the worker-thread.  The trace below
shows it taking the else branch, but I'm not sure it always does so.

  __slab_free+0x57/0x150
  kfree+0x107/0x13a
  rpcb_map_release+0x3f/0x44 [sunrpc]
  rpc_release_calldata+0x12/0x14 [sunrpc]
  rpc_free_task+0x59/0x61 [sunrpc]
  rpc_final_put_task+0x82/0x8a [sunrpc]
  __rpc_execute+0x23c/0x24b [sunrpc]
  rpc_async_schedule+0x10/0x12 [sunrpc]
  process_one_work+0x230/0x41d
  worker_thread+0x133/0x217
  kthread+0x7d/0x85
  kernel_thread_helper+0x4/0x10

>> My debugging leads me to believe that the rpc_async_release
>> is (very rarely) called on a task object that has already been logically
>> freed.
>
> What do you mean "logically freed"?  Do you mean the rpc_task struct
> is freed twice?

Yes, it seems so... though it's really just poked back into a mempool
instead of kfreed.

>
>> Is there a better way to queue this up that might have less chance
>> of some strange race?
>
> Why not just use a separate work item?

No idea, this is from existing net/sunrpc/* code.  If you think
that is a more proper way to do this logic, I can try that.

>>>> Also, is it valid to free the memory containing foo
>>>> in a workqueue callback?
>>>
>>> Yeap.
>>
>> Is there a method that can be called from a workqueue callback
>> to verify that the item has not been re-added to the work-queue?
>
> Can you be a bit more specific?  Are you saying that queue_work() and
> INIT_WORK() may race?

No, I don't think that is racing.  Basically, when I'm about
to logically free (put back into mempool) the task struct, I
would like to add a sanity check to make sure it's not currently
scheduled on a work queue.  If it were, that would explain the
backtraces I was seeing from slub memory debugging logic and
I'd be closer to understanding the problem.

>> I tried doing a cancel, but that caused recursive locking issues.
>>
>> I'd like to call this right before freeing the object and BUG_ON()
>> if the object is actually still on a work-queue.
>
> That may be useful as a debugging feature but is inherently racy.
> Nothing guarantees the work item won't be queued after BUG_ON() but
> before actual freeing.  The guarantee that the work item is no longer
> in use should come from the wq user.  There are a good number of use
> cases where a work item frees itself or the containing data structure
> and they all work fine.

At this point I have no reason to believe the work-queues are buggy,
but due to state machines and callbacks and method pointers, it
is quite difficult to know the method flow in the rpc code.  So
an extra sanity check might be quite useful.  I'll try to code
something up for the work-queue logic when I get a chance.

Thanks,
Ben

>
> Thanks.
>

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: workqueue question.
From: Tejun Heo @ 2011-07-01 16:37 UTC
To: Ben Greear; +Cc: Linux Kernel Mailing List

Hello,

On Thu, Jun 30, 2011 at 10:18:52AM -0700, Ben Greear wrote:
> >>Is there a method that can be called from a workqueue callback
> >>to verify that the item has not been re-added to the work-queue?
> >
> >Can you be a bit more specific?  Are you saying that queue_work() and
> >INIT_WORK() may race?
>
> No, I don't think that is racing.  Basically, when I'm about
> to logically free (put back into mempool) the task struct, I
> would like to add a sanity check to make sure it's not currently
> scheduled on a work queue.  If it were, that would explain the
> backtraces I was seeing from slub memory debugging logic and
> I'd be closer to understanding the problem.

Ah, okay.  When a work is in pending state, work_pending() is always
true; however, whether a work item is currently being executed is a
bit more complicated.  You'll need to implement a function which looks
similar to the following.

  bool is_work_executing(work)
  {
	for_each_gcwq_cpu(cpu) {
		gcwq = get_gcwq(cpu);

		lock gcwq;
		if (find_worker_executing_work(gcwq, work)) {
			unlock gcwq;
			return true;
		}
		unlock gcwq;
	}
	return false;
  }

But I would recommend watching workqueue tracing points first.

  $ grep workqueue /sys/kernel/debug/tracing/available_events
  workqueue:workqueue_queue_work
  workqueue:workqueue_activate_work
  workqueue:workqueue_execute_start
  workqueue:workqueue_execute_end

You should be able to tell which work is doing what on which CPU.

Thanks.

--
tejun

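The is_work_executing() pseudocode above can be modelled in userspace to show its shape: take each per-CPU lock in turn and check whether that CPU's worker is running the item. Everything here is an assumption made for illustration: `struct pool`, `NR_POOLS`, `worker_begin()` and `worker_end()` are hypothetical, each pool tracks a single running item, and a C11 `atomic_flag` spinlock stands in for the gcwq lock. It does not use or reproduce the real workqueue internals.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_POOLS 4	/* stand-in for the number of per-CPU gcwqs */

struct work_item {
	int id;
};

/* Hypothetical per-CPU pool: a lock plus the item its worker is running. */
struct pool {
	atomic_flag lock;
	struct work_item *current_work;
};

static struct pool pools[NR_POOLS];

static void pool_lock(struct pool *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void pool_unlock(struct pool *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

void pools_init(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_POOLS; cpu++) {
		atomic_flag_clear(&pools[cpu].lock);
		pools[cpu].current_work = NULL;
	}
}

/* A worker would call these around the callback invocation. */
void worker_begin(int cpu, struct work_item *w)
{
	pool_lock(&pools[cpu]);
	pools[cpu].current_work = w;
	pool_unlock(&pools[cpu]);
}

void worker_end(int cpu)
{
	pool_lock(&pools[cpu]);
	pools[cpu].current_work = NULL;
	pool_unlock(&pools[cpu]);
}

/* Same loop as the pseudocode: lock each pool, look, unlock. */
bool is_work_executing(struct work_item *work)
{
	int cpu;

	for (cpu = 0; cpu < NR_POOLS; cpu++) {
		struct pool *p = &pools[cpu];
		bool found;

		pool_lock(p);
		found = (p->current_work == work);
		pool_unlock(p);
		if (found)
			return true;
	}
	return false;
}
```

As noted earlier in the thread, the answer is only a snapshot: once the lock is dropped the item may start or stop executing, so a check like this suits debugging rather than correctness.
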