From mboxrd@z Thu Jan 1 00:00:00 1970 From: Yijing Wang Subject: [BUGREPORT] Tasklet scheduled issue in Linux 3.4.x-rt Date: Mon, 3 Mar 2014 17:24:39 +0800 Message-ID: <53144A57.4040204@huawei.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Li Zefan , Zhangwei , Yijing Wang To: Ingo Molnar , Peter Zijlstra , Sebastian Andrzej Siewior , LKML , Return-path: Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-rt-users.vger.kernel.org Hi list, I found a tasklet related issue in linux-stable-rt 3.4. And after I revert following commit, the test result seems ok(test last= ed 40hours). commit 0d9f73fc1e7270a3f8709c59c913408153d9d9f8 Author: Ingo Molnar Date: Tue Nov 29 20:18:22 2011 -0500 tasklet: Prevent tasklets from going into infinite spin in RT I test FC driver IO in this kernel, and after a few hours test, FC IO w= ill abort, I found a lot of tasklet WARNING Call Trace in kernel messag= e,like: [2012-03-26 18:55:43][ 929.252289] ------------[ cut here ]-----------= - [2012-03-26 18:55:43][ 929.252312] WARNING: at kernel/softirq.c:773 __= tasklet_action+0x51/0x1a0() [2012-03-26 18:55:43][ 929.252314] Hardware name: Romley [2012-03-26 18:55:43][ 929.252316] Modules linked in: isd_fid(O) ivs_e= dft(O) ivs_emp(O) ivs_xnet(O) isd_rds(O) isd_idm(O) isd_dft(O) isd_base= (O) sdm(O) isd_cmm(O) isd_ibc(O) isd_lib(O) xve_hab(PO) xve_net(PO) xve= _cls_msg_filter(PO) xve_dscp(PO) pagepool(PO) iod(O) cmm(PO) util(PO) i= ntel_t10(PO) itest_nid(PO) dmi(PO) bsp_adapter(PO) mpa(O) ipmi_si ipmi_= devintf ipmi_msghandler iscsi_sw(PO) iscsi_prot(O) iscsi_seg(PO) iscsi_= comm(PO) iscsi_initiator(PO) 8192cu(O) pciehp(PO) pcieaer(PO) pciecore(= PO) drvinstallthird(PO) quark(O) sal(O) pmsas(O) foe(O) lfcoe(O) libfc(= O) ib_uverbs(O) ibtgt(O) ib_srpt(O) ib_cm(O) ib_sa(O) mlx4_ib(O) ib_uma= d(O) ib_mad(O) mlx4_core(O) ib_core(O) drvtom(O) cxgb4(O) drvtoecore(O)= fcdrv(PO) unflowlevel(PO) unfcommon(O) drvmml(PO) scsi_transport_fc sc= si_tgt memtest(PO) drv_iosubsys_ini(O) iocount(O) bsp_mml(PO) agetty_qu= ery(PO) cpufreq_powersave af_packet nf_conntrack_ipv6 nf_defrag_ipv6 ip= 6table_filter ip6_tables xt_limit xt_tcpudp xt_multiport nf_conntr! ack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack usr_cache(O) acpi_cpufreq mperf p= rocessor thermal_sys sg hwmon iptable_filter ip_tables x_tables ixgbe(O= ) igb(O) bonding(O) tg(O) netmgmt(O) drvinstall(PO) dal(PO) dca usb_sto= rage(O) uhci_hcd ehci_hcd usbcore(O) usb_common sata_mml(O) ahci(O) lib= ata(O) satahp(O) drvframe(PO) sd_mod crc_t10dif scsi_mod agetty_interfa= ce(PO) ext3 jbd mbcache nvram_printk(PO) os_feeddog(PO) os_debug(O) osp= _proc(PO) os_rnvramdev(PO) vos(O) bsp(PO) os_die_handler(O) os_oom_hand= ler(O) os_panic_handler(O) biosnvramdriver(O) kbox(O) [2012-03-26 18:55:43][ 929.252460] Pid: 17495, comm: 3th SioT Tainted:= P O 3.4.24.15-0.11-default #1 [2012-03-26 18:55:43][ 929.252463] Call Trace: [2012-03-26 18:55:43][ 929.252465] [] ? __ta= sklet_action+0x51/0x1a0 [2012-03-26 18:55:43][ 929.252481] [] warn_slowpath= _common+0x7a/0xb0 [2012-03-26 18:55:43][ 929.252486] [] warn_slowpath= _null+0x15/0x20 [2012-03-26 18:55:43][ 929.252490] [] __tasklet_act= ion+0x51/0x1a0 [2012-03-26 18:55:43][ 929.252494] [] tasklet_actio= n+0x59/0x60 [2012-03-26 18:55:43][ 929.252498] [] handle_pendin= g_softirqs+0xb0/0x170 [2012-03-26 18:55:43][ 929.252502] [] __do_softirq+= 0x49/0xa0 [2012-03-26 18:55:43][ 929.252513] [] call_softirq+= 0x1c/0x30 [2012-03-26 18:55:43][ 929.252519] [] do_softirq+0x= 65/0xa0 [2012-03-26 18:55:43][ 929.252523] [] irq_exit+0xc5= /0xe0 [2012-03-26 18:55:43][ 929.252526] [] do_IRQ+0x64/0= xe0 [2012-03-26 18:55:43][ 929.252534] [] common_interr= upt+0x6a/0x6a [2012-03-26 18:55:43][ 929.252536] [] ? _raw= _spin_unlock_irqrestore+0x16/0x30 [2012-03-26 18:55:43][ 929.252565] [] SDM_SDGetDisk= +0x64/0x100 [sdm] [2012-03-26 18:55:43][ 929.252575] [] SDM_FRAMESdGe= tDisk+0x17/0x80 [sdm] [2012-03-26 18:55:43][ 929.252585] [] SDM_ERRAddTim= er+0x33/0x370 [sdm] [2012-03-26 18:55:43][ 929.252594] [] SDM_FRAMEErrA= ddTimer+0x17/0x80 [sdm] [2012-03-26 18:55:43][ 929.252604] [] SDM_SIOQueueR= eqProcess+0x67/0x7d0 [sdm] [2012-03-26 18:55:43][ 929.252612] [] SDM_SIOQueueT= hread+0x142/0x310 [sdm] [2012-03-26 18:55:43][ 929.252618] [] kernel_thread= _helper+0x4/0x10 [2012-03-26 18:55:43][ 929.252627] [] ? SDM_SIOQueu= eReqProcess+0x7d0/0x7d0 [sdm] [2012-03-26 18:55:43][ 929.252632] [] ? gs_change+0= x13/0x13 [2012-03-26 18:55:43][ 929.252635] ---[ end trace a82addcbe6cbf131 ]--= - =2E......[snip]......... [2012-03-27 03:41:06][ 3647.885973] ------------[ cut here ]-----------= - [2012-03-27 03:41:06][ 3647.886005] WARNING: at kernel/softirq.c:773 __= tasklet_action+0x51/0x1a0() [2012-03-27 03:41:06][ 3647.886010] Hardware name: Romley [2012-03-27 03:41:06][ 3647.886012] Modules linked in: isd_fid(O) ivs_e= dft(O) ivs_emp(O) ivs_xnet(O) isd_rds(O) isd_idm(O) isd_dft(O) isd_base= (O) sdm(O) isd_cmm(O) isd_ibc(O) isd_lib(O) xve_hab(PO) xve_net(PO) xve= _cls_msg_filter(PO) xve_dscp(PO) pagepool(PO) iod(O) cmm(PO) util(PO) i= ntel_t10(PO) itest_nid(PO) dmi(PO) bsp_adapter(PO) mpa(O) ipmi_si ipmi_= devintf ipmi_msghandler iscsi_sw(PO) iscsi_prot(O) iscsi_seg(PO) iscsi_= comm(PO) iscsi_initiator(PO) 8192cu(O) pciehp(PO) pcieaer(PO) pciecore(= PO) drvinstallthird(PO) quark(O) sal(O) pmsas(O) foe(O) lfcoe(O) libfc(= O) ib_uverbs(O) ibtgt(O) ib_srpt(O) ib_cm(O) ib_sa(O) mlx4_ib(O) ib_uma= d(O) ib_mad(O) mlx4_core(O) ib_core(O) drvtom(O) cxgb4(O) drvtoecore(O)= fcdrv(PO) unflowlevel(PO) unfcommon(O) drvmml(PO) scsi_transport_fc sc= si_tgt memtest(PO) drv_iosubsys_ini(O) iocount(O) bsp_mml(PO) agetty_qu= ery(PO) cpufreq_powersave af_packet nf_conntrack_ipv6 nf_defrag_ipv6 ip= 6table_filter ip6_tables xt_limit xt_tcpudp xt_multiport nf_conntr! ack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack usr_cache(O) acpi_cpufreq mperf p= rocessor thermal_sys sg hwmon iptable_filter ip_tables x_tables ixgbe(O= ) igb(O) bonding(O) tg(O) netmgmt(O) drvinstall(PO) dal(PO) dca usb_sto= rage(O) uhci_hcd ehci_hcd usbcore(O) usb_common sata_mml(O) ahci(O) lib= ata(O) satahp(O) drvframe(PO) sd_mod crc_t10dif scsi_mod agetty_interfa= ce(PO) ext3 jbd mbcache nvram_printk(PO) os_feeddog(PO) os_debug(O) osp= _proc(PO) os_rnvramdev(PO) vos(O) bsp(PO) os_die_handler(O) os_oom_hand= ler(O) os_panic_handler(O) biosnvramdriver(O) kbox(O) [2012-03-27 03:41:06][ 3647.886175] Pid: 21635, comm: sdtester Tainted:= P W O 3.4.24.15-0.11-default #1 [2012-03-27 03:41:06][ 3647.886179] Call Trace: [2012-03-27 03:41:06][ 3647.886182] [] ? __ta= sklet_action+0x51/0x1a0 [2012-03-27 03:41:06][ 3647.886197] [] warn_slowpath= _common+0x7a/0xb0 [2012-03-27 03:41:06][ 3647.886201] [] warn_slowpath= _null+0x15/0x20 [2012-03-27 03:41:06][ 3647.886205] [] __tasklet_act= ion+0x51/0x1a0 [2012-03-27 03:41:06][ 3647.886209] [] tasklet_actio= n+0x59/0x60 [2012-03-27 03:41:06][ 3647.886213] [] handle_pendin= g_softirqs+0xb0/0x170 [2012-03-27 03:41:06][ 3647.886217] [] __do_softirq+= 0x49/0xa0 [2012-03-27 03:41:06][ 3647.886228] [] call_softirq+= 0x1c/0x30 [2012-03-27 03:41:06][ 3647.886234] [] do_softirq+0x= 65/0xa0 [2012-03-27 03:41:06][ 3647.886238] [] irq_exit+0xc5= /0xe0 [2012-03-27 03:41:06][ 3647.886243] [] do_IRQ+0x64/0= xe0 [2012-03-27 03:41:06][ 3647.886252] [] common_interr= upt+0x6a/0x6a [2012-03-27 03:41:06][ 3647.886254] [] ? _raw= _spin_unlock_irq+0x10/0x20 [2012-03-27 03:41:06][ 3647.886264] [] ? _raw_spin_u= nlock_irq+0x9/0x20 [2012-03-27 03:41:06][ 3647.886275] [] kiocb_batch_r= efill+0x2c2/0x2e0 [2012-03-27 03:41:06][ 3647.886279] [] io_submit_one= +0x21c/0x2b0 [2012-03-27 03:41:06][ 3647.886284] [] do_io_submit+= 0x16a/0x360 [2012-03-27 03:41:06][ 3647.886288] [] sys_io_submit= +0xb/0x10 [2012-03-27 03:41:06][ 3647.886293] [] system_call_f= astpath+0x16/0x1b [2012-03-27 03:41:06][ 3647.886297] ---[ end trace a82addcbe6cbf137 ]--= - [2012-03-27 03:42:04][ 3705.434405] ------------[ cut here ]-----------= - [2012-03-27 03:42:04][ 3705.434418] WARNING: at kernel/softirq.c:799 __= tasklet_action+0xae/0x1a0() [2012-03-27 03:42:04][ 3705.434421] Hardware name: Romley [2012-03-27 03:42:04][ 3705.434423] Modules linked in: isd_fid(O) ivs_e= dft(O) ivs_emp(O) ivs_xnet(O) isd_rds(O) isd_idm(O) isd_dft(O) isd_base= (O) sdm(O) isd_cmm(O) isd_ibc(O) isd_lib(O) xve_hab(PO) xve_net(PO) xve= _cls_msg_filter(PO) xve_dscp(PO) pagepool(PO) iod(O) cmm(PO) util(PO) i= ntel_t10(PO) itest_nid(PO) dmi(PO) bsp_adapter(PO) mpa(O) ipmi_si ipmi_= devintf ipmi_msghandler iscsi_sw(PO) iscsi_prot(O) iscsi_seg(PO) iscsi_= comm(PO) iscsi_initiator(PO) 8192cu(O) pciehp(PO) pcieaer(PO) pciecore(= PO) drvinstallthird(PO) quark(O) sal(O) pmsas(O) foe(O) lfcoe(O) libfc(= O) ib_uverbs(O) ibtgt(O) ib_srpt(O) ib_cm(O) ib_sa(O) mlx4_ib(O) ib_uma= d(O) ib_mad(O) mlx4_core(O) ib_core(O) drvtom(O) cxgb4(O) drvtoecore(O)= fcdrv(PO) unflowlevel(PO) unfcommon(O) drvmml(PO) scsi_transport_fc sc= si_tgt memtest(PO) drv_iosubsys_ini(O) iocount(O) bsp_mml(PO) agetty_qu= ery(PO) cpufreq_powersave af_packet nf_conntrack_ipv6 nf_defrag_ipv6 ip= 6table_filter ip6_tables xt_limit xt_tcpudp xt_multiport nf_conntr! ack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack usr_cache(O) acpi_cpufreq mperf p= rocessor thermal_sys sg hwmon iptable_filter ip_tables x_tables ixgbe(O= ) igb(O) bonding(O) tg(O) netmgmt(O) drvinstall(PO) dal(PO) dca usb_sto= rage(O) uhci_hcd ehci_hcd usbcore(O) usb_common sata_mml(O) ahci(O) lib= ata(O) satahp(O) drvframe(PO) sd_mod crc_t10dif scsi_mod agetty_interfa= ce(PO) ext3 jbd mbcache nvram_printk(PO) os_feeddog(PO) os_debug(O) osp= _proc(PO) os_rnvramdev(PO) vos(O) bsp(PO) os_die_handler(O) os_oom_hand= ler(O) os_panic_handler(O) biosnvramdriver(O) kbox(O) [2012-03-27 03:42:04][ 3705.434574] Pid: 11, comm: ksoftirqd/1 Tainted:= P W O 3.4.24.15-0.11-default #1 [2012-03-27 03:42:04][ 3705.434577] Call Trace: [2012-03-27 03:42:04][ 3705.434586] [] ? __tasklet_a= ction+0xae/0x1a0 [2012-03-27 03:42:04][ 3705.434592] [] warn_slowpath= _common+0x7a/0xb0 [2012-03-27 03:42:04][ 3705.434596] [] warn_slowpath= _null+0x15/0x20 [2012-03-27 03:42:04][ 3705.434600] [] __tasklet_act= ion+0xae/0x1a0 [2012-03-27 03:42:04][ 3705.434604] [] tasklet_actio= n+0x59/0x60 [2012-03-27 03:42:04][ 3705.434608] [] handle_pendin= g_softirqs+0xb0/0x170 [2012-03-27 03:42:04][ 3705.434613] [] __do_softirq+= 0x49/0xa0 [2012-03-27 03:42:04][ 3705.434616] [] run_ksoftirqd= +0x145/0x160 [2012-03-27 03:42:04][ 3705.434620] [] ? __do_softir= q+0xa0/0xa0 [2012-03-27 03:42:04][ 3705.434623] [] ? __do_softir= q+0xa0/0xa0 [2012-03-27 03:42:04][ 3705.434629] [] kthread+0x9e/= 0xb0 [2012-03-27 03:42:04][ 3705.434637] [] kernel_thread= _helper+0x4/0x10 [2012-03-27 03:42:04][ 3705.434641] [] ? __init_kthr= ead_worker+0x40/0x40 [2012-03-27 03:42:04][ 3705.434646] [] ? gs_change+0= x13/0x13 [2012-03-27 03:42:04][ 3705.434648] ---[ end trace a82addcbe6cbf138 ]--= - I tested my FC IO in my platform(x86 smp machine, which have 8 cpus), a= fter FC IO running a few hours(sometimes 1-2h), I found FC IO abort. We found FC d= river tasklet can not run in current CPU(which received interrupts). In our FC driver, the process flow: =46C card hardware -------> FC driver interrupt handler --------->tas= klet_schedule(fc driver tasklet) ------->tasklet running, call function= process FC IO data. here will disable FC card interrupt = here will enable FC card inter= rupt again After the issue occur, OS can not receive the interrupt again(we check = the FC hardware and the interrupt is disabled). At this time, for debug, we force FC hardware reset to send a interrupt= to OS, and we track this interrupt in FC driver interrupt handler. We found the tasklet state is 0x1(mean state is TASKLET_STATE_SCHED),co= unt is 0, before we call tasklet_schedule(). So the new tasklet can not add to CPU list. =2E........... if (TRUE =3D=3D fcinterrupthandler(&(pstHba->hpRoot))) ---------= ---> disable FC card interrupt here. { set_bit(TADRV_DPC_INT, (void *)&(pstHba->ulDelayProcessFlag)); /* use schedule_task because some of the OS callback events * allocate memory which we can not do in the immediate queue. */ tasklet_schedule(&pstHba->stDelayProcessTasklet); spin_unlock(pstHba->pstPortLock);//=C3=88=C2=A5=C2=B5=C3=B4irq = save return IRQ_HANDLED; } =2E........... And I also add some dynamic debug in __tasklet_action(); after the issu= e occur, I open the dynamic debug. After we force the hardware reset to interrupt OS, we never found the F= C driver tasklet running in dmesg(I identify the tasklet by its data). I guess the FC tasklet is not in CPU global tasklet list. So I suspect the FC driver tasklet maybe lost with the patch 0d9f73fc1e= =2E =46urther more, I tested the kernel without this patch(0d9f73fc1e), I t= ested the FC IO exceed 40 hours, and the result is ok. I hope somebody can help to look at it. If I missing something, let me = know. Thanks! Yijing. --=20 Thanks! Yijing