linux-kernel.vger.kernel.org archive mirror
* workqueue panic in 3.4 kernel
@ 2013-03-05  7:31 Lei Wen
  2013-03-05 16:32 ` Tejun Heo
  0 siblings, 1 reply; 16+ messages in thread
From: Lei Wen @ 2013-03-05  7:31 UTC
  To: linux-kernel, Tejun Heo, leiwen

Hi Tejun,

We hit a workqueue-related panic on a kernel based on Linux 3.4.5.

The panic log is:
[153587.035369] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[153587.043731] pgd = e1e74000
[153587.046691] [00000004] *pgd=00000000
[153587.050567] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[153587.056152] Modules linked in: hwmap(O) cidatattydev(O) gs_diag(O) diag(O) gs_modem(O) ccinetdev(O) cci_datastub(O) citty(O) msocketk(O) smsmdtv seh(O) cploaddev(O) blcr(O) blcr_imports(O) geu(O) galcore(O)
[153587.076416] CPU: 0    Tainted: G           O  (3.4.5+ #1)
[153587.082092] PC is at delayed_work_timer_fn+0x1c/0x28
[153587.087249] LR is at delayed_work_timer_fn+0x18/0x28
[153587.092468] pc : [<c014c7bc>]    lr : [<c014c7b8>]    psr: 20000113
[153587.092468] sp : e1e3bf00  ip : 00000001  fp : 0000000a
[153587.104400] r10: 00000001  r9 : 578914dc  r8 : c014c7a0
[153587.109832] r7 : 00000101  r6 : bf03d554  r5 : 00000000  r4 : bf03d544
[153587.116638] r3 : 00000101  r2 : bf03d544  r1 : c1a0b27c  r0 : 00000000
[153587.123352] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[153587.130737] Control: 10c53c7d  Table: 21e7404a  DAC: 00000015
[153587.611328] [<c014c7bc>] (delayed_work_timer_fn+0x1c/0x28) from [<c014185c>] (run_timer_softirq+0x260/0x384)
[153587.621368] [<c014185c>] (run_timer_softirq+0x260/0x384) from [<c013abfc>] (__do_softirq+0x11c/0x244)
[153587.630828] [<c013abfc>] (__do_softirq+0x11c/0x244) from [<c013b144>] (irq_exit+0x44/0x98)
[153587.639373] [<c013b144>] (irq_exit+0x44/0x98) from [<c0113ca0>] (handle_IRQ+0x7c/0xb8)
[153587.647583] [<c0113ca0>] (handle_IRQ+0x7c/0xb8) from [<c01084ac>] (gic_handle_irq+0x34/0x58)
[153587.656188] [<c01084ac>] (gic_handle_irq+0x34/0x58) from [<c0112b3c>] (__irq_usr+0x3c/0x60)

After inspecting memory, we found that work->data had become 0x300 by
the time delayed_work_timer_fn called get_work_cwq. As a result, cwq was
NULL before the call to __queue_work, so it is expected that the kernel
panics when it tries to read wq through cwq->wq.
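
To illustrate why a work->data value of 0x300 yields a NULL cwq, here is a
small user-space sketch of the decoding step. The bit layout (CWQ flag in
bit 2, 8 flag bits, CPU number stored above the flag bits when the work is
off-queue) is an assumption modelled on the 3.4-era
include/linux/workqueue.h without CONFIG_DEBUG_OBJECTS_WORK; all sketch_*
names are hypothetical and are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMED layout of work->data (3.4-era, no CONFIG_DEBUG_OBJECTS_WORK). */
#define SKETCH_WORK_PENDING    (1UL << 0)  /* work is pending execution      */
#define SKETCH_WORK_CWQ        (1UL << 2)  /* data holds a cwq pointer       */
#define SKETCH_WORK_FLAG_BITS  8           /* low bits reserved for flags    */
#define SKETCH_WORK_FLAG_MASK  ((1UL << SKETCH_WORK_FLAG_BITS) - 1)
#define SKETCH_WORK_DATA_MASK  (~SKETCH_WORK_FLAG_MASK)

struct sketch_cwq;  /* opaque stand-in for struct cpu_workqueue_struct */

/* Mirrors the idea of get_work_cwq(): a pointer only if the CWQ flag is set. */
static inline struct sketch_cwq *sketch_get_work_cwq(unsigned long data)
{
	if (data & SKETCH_WORK_CWQ)
		return (struct sketch_cwq *)(uintptr_t)(data & SKETCH_WORK_DATA_MASK);
	return NULL;
}

/* True if the PENDING flag is set in the data word. */
static inline bool sketch_work_pending(unsigned long data)
{
	return (data & SKETCH_WORK_PENDING) != 0;
}

/* When the work is off-queue, the bits above the flags hold a CPU number. */
static inline unsigned long sketch_off_queue_cpu(unsigned long data)
{
	return data >> SKETCH_WORK_FLAG_BITS;
}

/* Self-check for the value seen in the dump. */
static inline void sketch_check_dump_value(void)
{
	unsigned long data = 0x300UL;

	assert(sketch_get_work_cwq(data) == NULL);  /* CWQ flag clear -> NULL */
	assert(!sketch_work_pending(data));         /* PENDING flag clear     */
	assert(sketch_off_queue_cpu(data) == 3);    /* 0x300 >> 8             */
}
```

Under these assumptions, 0x300 reads as an off-queue work item with no
PENDING bit, so the timer callback gets NULL back and then dereferences
cwq->wq, which is consistent with the fault at virtual address 00000004.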

To fix it, we tried backporting the patches below:
commit 60c057bca22285efefbba033624763a778f243bf
Author: Lai Jiangshan <laijs@cn.fujitsu.com>
Date:   Wed Feb 6 18:04:53 2013 -0800

    workqueue: add delayed_work->wq to simplify reentrancy handling

commit 1265057fa02c7bed3b6d9ddc8a2048065a370364
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Aug 8 09:38:42 2012 -0700

    workqueue: fix CPU binding of flush_delayed_work[_sync]()

We also added the change below, to make sure __cancel_work_timer cannot
preempt between run_timer_softirq and delayed_work_timer_fn:
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index bf4888c..0e9f77c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2627,7 +2627,7 @@ static bool __cancel_work_timer(struct work_struct *work,
                ret = (timer && likely(del_timer(timer)));
                if (!ret)
                        ret = try_to_grab_pending(work);
-               wait_on_work(work);
+               flush_work(work);
        } while (unlikely(ret < 0));

        clear_work_data(work);

Do you think this fix is sufficient? And is it OK to call flush_work
directly in __cancel_work_timer as the fix?

Thanks,
Lei

end of thread, other threads:[~2013-03-12  6:41 UTC | newest]

Thread overview: 16+ messages
2013-03-05  7:31 workqueue panic in 3.4 kernel Lei Wen
2013-03-05 16:32 ` Tejun Heo
2013-03-06 14:39   ` Lei Wen
2013-03-06 19:14     ` Tejun Heo
2013-03-07  1:15       ` Lei Wen
2013-03-07 15:22         ` Lei Wen
2013-03-07 15:49           ` Tejun Heo
2013-03-07 16:07             ` Thomas Gleixner
     [not found]               ` <CALZhoSQBH3RxSoaVDCYoCyRVRndht-Rk0rh_w5Dbp6+5T_auSw@mail.gmail.com>
2013-03-12  5:12                 ` Tejun Heo
2013-03-12  5:18                   ` Lei Wen
2013-03-12  5:24                     ` Tejun Heo
2013-03-12  5:34                       ` Lei Wen
2013-03-12  5:40                         ` Tejun Heo
2013-03-12  6:01                           ` Lei Wen
2013-03-12  6:13                             ` Tejun Heo
2013-03-12  6:41                               ` Lei Wen