* Re: [PATCH] sc16is7xx: switch to threaded irq for fixing RT issues [not found] ` <56C41FA5.4010605@prevas.dk> @ 2016-02-17 21:54 ` Jakub Kicinski 2016-02-17 23:25 ` Josh Cartwright 0 siblings, 1 reply; 9+ messages in thread From: Jakub Kicinski @ 2016-02-17 21:54 UTC (permalink / raw) To: Sean Nyekjær; +Cc: linux-serial, kubakici, linux-rt-users [CC: linux-rt-users, can you guys help?] On Wed, 17 Feb 2016 08:22:13 +0100, Sean Nyekjær wrote: > On 2016-02-16 22:02, Jakub Kiciński wrote: > > On Tue, 16 Feb 2016 08:37:11 +0100, Sean Nyekjær wrote: > >> On 2016-02-15 23:08, Jakub Kicinski wrote: > >>> On Mon, 15 Feb 2016 08:58:53 +0100, Sean Nyekjaer wrote: > >>>> Working with RT patches reveals this driver have some spin_lock issues. > >>>> > >>>> spin_lock moved from sc16is7xx_reg_proc to protect to call to > >>>> uart_handle_cts_change as it's the only call that isn't already > >>>> protected in serial-core. > >>> Sorry but this does not look fine. You are removing all async items > >>> from the driver. This works for you because in RT code can sleep with > >>> spin locks taken but for upstream it's not acceptable. Did you test > >>> this patch with upstream kernel and lockdep enabled? > >> I have a 4.1.y with cherry-pick of the commits from tty-testing > >> regarding sc16is7xx. (exept the new gpio api) > >> On our custom board we have 3 sc16is750 on a single SPI channel. I'm > >> using an FTDI chip(connected to a test PC) TX wired to the 3x RX pins on > >> the sc16is750. > >> The problem is easy reproducible just by creating data from the PC at > >> 115200 baud, causes the kernel oops almost immediately. > >> > >> I have not seen any issues with a vanilla 4.1.y, only when i'm applying > >> the RT patch and loading the system. > > Thanks for providing the details, I think there is a misunderstanding > > between us. Allow me to explain again what I meant ;) > > > > I do acknowledge that there may be a problem with the driver on RT > > kernels and I do trust you that you tested your patch on RT kernel > > with latest version of the driver cherry-picked from mainline and it > > works there. > > > > However, IIUC you are proposing your patch to be included in mainline. > > > > What I'm trying to say is that in mainline we need the async works > > which you remove in your patch because we cannot perform SPI/I2C > > blocking I/O under a spinlock... If you compile mainline with your > > patch you will see a slew of "sleeping in atomic context" warnings. > > > > So basically the questions are do you want this patch in mainline and > > have you tested it on non-RT kernels? (Please don't be discouraged - > > there definitely must be a bug somewhere if you're seeing crashes, but > > I'm afraid the proposed solutions is not good enough...) > > > > Thanks > Yeah the first priority must be for the driver to work on the mainline > kernel :-) > > I just wanted to flag a problem with the driver with the RT patches > applied. I thought > RT patch (maybe) revaled that the driver have underlying problem. > > So maybe this patch should be included in the RT patch? > No i have not tested this with a non-RT kernel, maybe i should try :-) I've looked at the stack dump you posted in previous email. It looks like we deadlock on the on spinlock_irqsave() in queue_kthread_work() because our interrupt is not threaded!? I thought all interrupts are threaded in RT by default. ================================= [92/1901] [ INFO: inconsistent lock state ] 4.1.15-rt17-00120-g6a753b6-dirty #10 Not tainted --------------------------------- inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. mdev/309 [HC1[1]:SC0[0]:HE0:SE1] takes: ((&s->kworker)->lock){?.+...}, at: [<80046c50>] queue_kthread_work+0x18/0x54 {HARDIRQ-ON-W} state was registered at: [<80500028>] rt_spin_lock+0x74/0x88 [<80046930>] kthread_worker_fn+0x84/0x198 [<80046820>] kthread+0xd8/0xec [<8000f658>] ret_from_fork+0x14/0x3c irq event stamp: 1362 hardirqs last enabled at (1361): [<800bc364>] filemap_map_pages+0x2b0/0x424 hardirqs last disabled at (1362): [<800139b4>] __irq_svc+0x34/0x90 softirqs last enabled at (0): [<80026108>] copy_process.part.2+0x370/0x1650 softirqs last disabled at (0): [< (null)>] (null) other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock((&s->kworker)->lock); <Interrupt> lock((&s->kworker)->lock); *** DEADLOCK *** 3 locks held by mdev/309: #0: (&mm->mmap_sem){++++++}, at: [<8001d420>] do_page_fault+0x80/0x368 #1: (&mm->page_table_lock){+.+...}, at: [<800e19fc>] handle_mm_fault+0x630/0xcd0 #2: (rcu_read_lock){......}, at: [<800bc0b4>] filemap_map_pages+0x0/0x424 stack backtrace: CPU: 0 PID: 309 Comm: mdev Not tainted 4.1.15-rt17-00120-g6a753b6-dirty #10 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [<80016e58>] (unwind_backtrace) from [<80012e48>] (show_stack+0x10/0x14) [<80012e48>] (show_stack) from [<804fb00c>] (dump_stack+0x7c/0x90) [<804fb00c>] (dump_stack) from [<804f8c6c>] (print_usage_bug+0x2d8/0x2e8) [<804f8c6c>] (print_usage_bug) from [<800665ec>] (mark_lock+0x214/0x7a0) [<800665ec>] (mark_lock) from [<800678dc>] (__lock_acquire+0x778/0x1dd8) [<800678dc>] (__lock_acquire) from [<800697d4>] (lock_acquire+0x6c/0x8c) [<800697d4>] (lock_acquire) from [<80500028>] (rt_spin_lock+0x74/0x88) [<80500028>] (rt_spin_lock) from [<80046c50>] (queue_kthread_work+0x18/0x54) [<80046c50>] (queue_kthread_work) from [<7f00b260>] (sc16is7xx_irq+0x10/0x18 [sc16is7xx]) [<7f00b260>] (sc16is7xx_irq [sc16is7xx]) from [<800734c0>] (handle_irq_event_percpu+0x70/0x15c) [<800734c0>] (handle_irq_event_percpu) from [<800735e8>] (handle_irq_event+0x3c/0x5c) [<800735e8>] (handle_irq_event) from [<80076890>] (handle_level_irq+0xc4/0x13c) [<80076890>] (handle_level_irq) from [<80072b2c>] (generic_handle_irq+0x2c/0x3c) [<80072b2c>] (generic_handle_irq) from [<8025f8e0>] (mxc_gpio_irq_handler+0x3c/0x108) [<8025f8e0>] (mxc_gpio_irq_handler) from [<8025fa2c>] (mx3_gpio_irq_handler+0x80/0xcc) [<8025fa2c>] (mx3_gpio_irq_handler) from [<80072b2c>] (generic_handle_irq+0x2c/0x3c) [<80072b2c>] (generic_handle_irq) from [<80072e1c>] (__handle_domain_irq+0x7c/0xe8) [<80072e1c>] (__handle_domain_irq) from [<800094dc>] (gic_handle_irq+0x24/0x5c) [<800094dc>] (gic_handle_irq) from [<800139c4>] (__irq_svc+0x44/0x90) Exception stack(0xa9ecdda0 to 0xa9ecdde8) dda0: 00000001 00000110 00000000 a9f40b00 20070113 00000001 aa811eb8 80782edf ddc0: a9ecde84 aaa231a0 aaa232ec aae4b800 00000003 a9ecdde8 80066be0 800bc368 dde0: 20070113 ffffffff [<800139c4>] (__irq_svc) from [<800bc368>] (filemap_map_pages+0x2b4/0x424) [<800bc368>] (filemap_map_pages) from [<800e1b0c>] (handle_mm_fault+0x740/0xcd0) [<800e1b0c>] (handle_mm_fault) from [<8001d604>] (do_page_fault+0x264/0x368) [<8001d604>] (do_page_fault) from [<800093b4>] (do_PrefetchAbort+0x34/0x9c) [<800093b4>] (do_PrefetchAbort) from [<80013e9c>] (ret_from_exception+0x0/0x24) Exception stack(0xa9ecdfb0 to 0xa9ecdff8) dfa0: 003f8008 00001000 00079c84 7ed0bf99 dfc0: 000808f0 ffffffff 7ed0bfc3 003f8008 00000000 00000000 000808f0 00000001 dfe0: 76e18598 7ed0bc28 00040d1c 76e18598 60070010 ffffffff random: nonblocking pool is initialized ------------[ cut here ]------------ Kernel BUG at 804fe6d8 [verbose debug info unavailable] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM Modules linked in: gpio_sn65hvs885 ti_ads8688 ad5755 sc16is7xx regmap_spi gpio_pca953x CPU: 0 PID: 215 Comm: kworker/0:2 Not tainted 4.1.15-rt17-00120-g6a753b6-dirty #10 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [17/1901] Workqueue: events flush_to_ldisc task: aa414d00 ti: a9c0c000 task.ti: a9c0c000 PC is at rt_spin_lock_slowlock+0x2c8/0x2e4 LR is at rt_spin_lock_slowlock+0x54/0x2e4 pc : [<804fe6d8>] lr : [<804fe464>] psr: 60070193 sp : a9c0dc20 ip : aa414d01 fp : a9edc0c4 r10: 00000000 r9 : 80046be0 r8 : 80782e40 r7 : aa414d00 r6 : 00000000 r5 : 00000001 r4 : a9c0dc38 r3 : aa414d00 r2 : 00000000 r1 : aa414d00 r0 : 00000000 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c5387d Table: 39fe804a DAC: 00000015 Process kworker/0:2 (pid: 215, stack limit = 0xa9c0c210) Stack: (0xa9c0dc20 to 0xa9c0e000) dc20: a9c0dce0 aac0c000 a9c0dd64 800697d4 00000001 00000080 a9c0dc38 800735f4 dc40: 00000000 a9c0dc44 00000000 aad6b868 00000000 00000002 00000001 804ff9ac dc60: aa414d00 a9edc0c4 80046c50 00000067 aad96400 80782e40 80046be0 00000000 dc80: c0ac9018 80500038 00000001 00000001 80f8474c 80748984 7f00b250 a9edc0c4 dca0: a9edc11c 80046c50 7f00b250 a9efb2c0 00000000 7f00b260 7f00b250 800734c0 dcc0: 800777d4 00000004 00060002 aad96400 aad96468 a9efb2c0 aad41610 00000001 dce0: 80f8474c 80748984 c0ac9018 800735e8 aad96400 aad96468 00000002 80076890 dd00: 800767cc 00000067 00000004 80072b2c 00000004 8025f8e0 d17b30c0 00000001 dd20: 8073e500 a9fdc200 00000000 aad41610 80751108 aad95a00 00000000 00000001 dd40: a9c0dda0 aac0c000 c0ac9018 8025fa2c 00000063 00000000 00000063 80072b2c dd60: 8073c6d0 80072e1c f400010c 00000056 80748ca0 a9c0dda0 f4000100 00000008 dd80: a9fe2000 800094dc 80046be0 60070013 ffffffff a9c0ddd4 ffffffff 800139c4 dda0: a9edc0c4 a9edc268 a9edc108 a9edc108 a9edc268 a9edc108 a9edc0c4 a9fe38a9 ddc0: ffffffff 00000008 a9fe2000 c0ac9018 a9c0dda8 a9c0dde8 80046c88 80046be0 dde0: 60070013 ffffffff a9edc0c4 00000001 00000000 80046c88 aa4f8800 a9edc12c de00: a9fe2000 802a6914 aa414d00 802a75f8 802a7608 c0ac9000 c0ac9000 80292ce8 de20: 60070013 a9fe39a1 a9fe38a1 00000008 c0ac9000 00000000 00000008 80526d20 de40: 55555556 c0ac9000 00000000 a9fe2124 00000000 a9fe2000 aa5722a0 a9ea5cc0 de60: a9fe3800 aa572280 aa57227c 00000000 00000000 80293120 00000001 aa572280 de80: a9fe2000 80296374 00000000 80040ee0 aa678d00 aa572280 ab70ce80 a9c0dec8 dea0: ab710700 00000001 00000000 80040f60 00000001 00000000 80040ee0 800412cc dec0: 00000000 00000000 80f87484 00000000 00000000 80663790 80782dca ab70ce80 dee0: ab70ced4 a9c0c000 80782dca aa678d18 aa678d00 00000008 ab70ce80 80041230 df00: 80742a40 aa414d00 aa678d00 aa655c00 00000000 aa678d00 800411e4 00000000 df20: 00000000 00000000 00000000 80046820 a9c0df9c 00000000 aa414d00 aa678d00 df40: 00000000 00000000 dead4ead ffffffff ffffffff 80785368 00000000 00000000 df60: 8066b7cc a9c0df64 a9c0df64 00000000 00000000 dead4ead ffffffff ffffffff df80: 80785368 00000000 00000000 8066b7cc a9c0df90 a9c0df90 a9c0dfac aa655c00 dfa0: 80046748 00000000 00000000 8000f658 00000000 00000000 00000000 00000000 dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 3bf5a811 3bf5ac11 [<804fe6d8>] (rt_spin_lock_slowlock) from [<80500038>] (rt_spin_lock+0x84/0x88) [<80500038>] (rt_spin_lock) from [<80046c50>] (queue_kthread_work+0x18/0x54) [<80046c50>] (queue_kthread_work) from [<7f00b260>] (sc16is7xx_irq+0x10/0x18 [sc16is7xx]) [<7f00b260>] (sc16is7xx_irq [sc16is7xx]) from [<800734c0>] (handle_irq_event_percpu+0x70/0x15c) [<800734c0>] (handle_irq_event_percpu) from [<800735e8>] (handle_irq_event+0x3c/0x5c) [<800735e8>] (handle_irq_event) from [<80076890>] (handle_level_irq+0xc4/0x13c) [<80076890>] (handle_level_irq) from [<80072b2c>] (generic_handle_irq+0x2c/0x3c) [<80072b2c>] (generic_handle_irq) from [<8025f8e0>] (mxc_gpio_irq_handler+0x3c/0x108) [<8025f8e0>] (mxc_gpio_irq_handler) from [<8025fa2c>] (mx3_gpio_irq_handler+0x80/0xcc) [<8025fa2c>] (mx3_gpio_irq_handler) from [<80072b2c>] (generic_handle_irq+0x2c/0x3c) [<80072b2c>] (generic_handle_irq) from [<80072e1c>] (__handle_domain_irq+0x7c/0xe8) [<80072e1c>] (__handle_domain_irq) from [<800094dc>] (gic_handle_irq+0x24/0x5c) [<800094dc>] (gic_handle_irq) from [<800139c4>] (__irq_svc+0x44/0x90) Exception stack(0xa9c0dda0 to 0xa9c0dde8) dda0: a9edc0c4 a9edc268 a9edc108 a9edc108 a9edc268 a9edc108 a9edc0c4 a9fe38a9 ddc0: ffffffff 00000008 a9fe2000 c0ac9018 a9c0dda8 a9c0dde8 80046c88 80046be0 dde0: 60070013 ffffffff [<800139c4>] (__irq_svc) from [<80046be0>] (insert_kthread_work+0x28/0x80) [<80046be0>] (insert_kthread_work) from [<80046c88>] (queue_kthread_work+0x50/0x54) [<80046c88>] (queue_kthread_work) from [<802a6914>] (__uart_start+0x38/0x3c) [<802a6914>] (__uart_start) from [<802a75f8>] (uart_start+0x24/0x34) [<802a75f8>] (uart_start) from [<80292ce8>] (n_tty_receive_buf_common+0x55c/0x980) [<80292ce8>] (n_tty_receive_buf_common) from [<80293120>] (n_tty_receive_buf2+0x14/0x1c) [<80293120>] (n_tty_receive_buf2) from [<80296374>] (flush_to_ldisc+0xd4/0x188) [<80296374>] (flush_to_ldisc) from [<80040f60>] (process_one_work+0x1a0/0x424) [<80040f60>] (process_one_work) from [<80041230>] (worker_thread+0x4c/0x4cc) [<80041230>] (worker_thread) from [<80046820>] (kthread+0xd8/0xec) [<80046820>] (kthread) from [<8000f658>] (ret_from_fork+0x14/0x3c) Code: 0a000002 e59d3018 e1530004 0a000000 (e7f001f2) ---[ end trace 0000000000000002 ]--- Kernel panic - not syncing: Fatal exception in interrupt CPU1: stopping CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D 4.1.15-rt17-00120-g6a753b6-dirty #10 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [<80016e58>] (unwind_backtrace) from [<80012e48>] (show_stack+0x10/0x14) [<80012e48>] (show_stack) from [<804fb00c>] (dump_stack+0x7c/0x90) [<804fb00c>] (dump_stack) from [<80015d4c>] (handle_IPI+0x17c/0x190) [<80015d4c>] (handle_IPI) from [<80009510>] (gic_handle_irq+0x58/0x5c) [<80009510>] (gic_handle_irq) from [<800139c4>] (__irq_svc+0x44/0x90) Exception stack(0xaacadf58 to 0xaacadfa0) df40: 8036afbc aaca2100 df60: 00000001 00000000 ba4c1c99 0000000f d76b24e1 0000000f 00000001 ab7197e8 df80: 8073a894 00000001 aacac000 aacadfa0 8036afbc 8036afc0 60000013 ffffffff [<800139c4>] (__irq_svc) from [<8036afc0>] (cpuidle_enter_state+0xbc/0x1f4) [<8036afc0>] (cpuidle_enter_state) from [<80063198>] (cpu_startup_entry+0x23c/0x394) [<80063198>] (cpu_startup_entry) from [<100095ac>] (0x100095ac) -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] sc16is7xx: switch to threaded irq for fixing RT issues 2016-02-17 21:54 ` [PATCH] sc16is7xx: switch to threaded irq for fixing RT issues Jakub Kicinski @ 2016-02-17 23:25 ` Josh Cartwright 2016-02-18 7:29 ` Sean Nyekjær 0 siblings, 1 reply; 9+ messages in thread From: Josh Cartwright @ 2016-02-17 23:25 UTC (permalink / raw) To: Jakub Kicinski; +Cc: Sean Nyekj?r, linux-serial, kubakici, linux-rt-users On Wed, Feb 17, 2016 at 09:54:27PM +0000, Jakub Kicinski wrote: > [CC: linux-rt-users, can you guys help?] > > On Wed, 17 Feb 2016 08:22:13 +0100, Sean Nyekjær wrote: > > On 2016-02-16 22:02, Jakub Kici??ski wrote: > > > On Tue, 16 Feb 2016 08:37:11 +0100, Sean Nyekjær wrote: > > >> On 2016-02-15 23:08, Jakub Kicinski wrote: > > >>> On Mon, 15 Feb 2016 08:58:53 +0100, Sean Nyekjaer wrote: > > >>>> Working with RT patches reveals this driver have some spin_lock issues. > > >>>> > > >>>> spin_lock moved from sc16is7xx_reg_proc to protect to call to > > >>>> uart_handle_cts_change as it's the only call that isn't already > > >>>> protected in serial-core. > > >>> Sorry but this does not look fine. You are removing all async items > > >>> from the driver. This works for you because in RT code can sleep with > > >>> spin locks taken but for upstream it's not acceptable. Did you test > > >>> this patch with upstream kernel and lockdep enabled? > > >> I have a 4.1.y with cherry-pick of the commits from tty-testing > > >> regarding sc16is7xx. (exept the new gpio api) > > >> On our custom board we have 3 sc16is750 on a single SPI channel. I'm > > >> using an FTDI chip(connected to a test PC) TX wired to the 3x RX pins on > > >> the sc16is750. > > >> The problem is easy reproducible just by creating data from the PC at > > >> 115200 baud, causes the kernel oops almost immediately. > > >> > > >> I have not seen any issues with a vanilla 4.1.y, only when i'm applying > > >> the RT patch and loading the system. > > > Thanks for providing the details, I think there is a misunderstanding > > > between us. Allow me to explain again what I meant ;) > > > > > > I do acknowledge that there may be a problem with the driver on RT > > > kernels and I do trust you that you tested your patch on RT kernel > > > with latest version of the driver cherry-picked from mainline and it > > > works there. > > > > > > However, IIUC you are proposing your patch to be included in mainline. > > > > > > What I'm trying to say is that in mainline we need the async works > > > which you remove in your patch because we cannot perform SPI/I2C > > > blocking I/O under a spinlock... If you compile mainline with your > > > patch you will see a slew of "sleeping in atomic context" warnings. > > > > > > So basically the questions are do you want this patch in mainline and > > > have you tested it on non-RT kernels? (Please don't be discouraged - > > > there definitely must be a bug somewhere if you're seeing crashes, but > > > I'm afraid the proposed solutions is not good enough...) > > > > > > Thanks > > Yeah the first priority must be for the driver to work on the mainline > > kernel :-) > > > > I just wanted to flag a problem with the driver with the RT patches > > applied. I thought > > RT patch (maybe) revaled that the driver have underlying problem. > > > > So maybe this patch should be included in the RT patch? > > No i have not tested this with a non-RT kernel, maybe i should try :-) > > I've looked at the stack dump you posted in previous email. It looks > like we deadlock on the on spinlock_irqsave() in queue_kthread_work() > because our interrupt is not threaded!? I thought all interrupts are > threaded in RT by default. I believe the actual problem is that the handler is being registered w/ IRQF_ONESHOT. I'm not sure that even makes sense in this scenario, but is perhaps an unintended carry over from request_threaded_irq -> request_irq transition in 9e6f4ca3 ("sc16is7xx: use kthread_worker for tx_work and irq"). Sean- Does this solve your problem? Josh diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c index e78fa99..1bb5c0e 100644 --- a/drivers/tty/serial/sc16is7xx.c +++ b/drivers/tty/serial/sc16is7xx.c @@ -1257,7 +1257,7 @@ static int sc16is7xx_probe(struct device *dev, /* Setup interrupt */ ret = devm_request_irq(dev, irq, sc16is7xx_irq, - IRQF_ONESHOT | flags, dev_name(dev), s); + flags, dev_name(dev), s); if (!ret) return 0; -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] sc16is7xx: switch to threaded irq for fixing RT issues 2016-02-17 23:25 ` Josh Cartwright @ 2016-02-18 7:29 ` Sean Nyekjær 2016-02-18 8:46 ` Jakub Kicinski 0 siblings, 1 reply; 9+ messages in thread From: Sean Nyekjær @ 2016-02-18 7:29 UTC (permalink / raw) To: Josh Cartwright, Jakub Kicinski; +Cc: linux-serial, kubakici, linux-rt-users On 2016-02-18 00:25, Josh Cartwright wrote: > On Wed, Feb 17, 2016 at 09:54:27PM +0000, Jakub Kicinski wrote: >> [CC: linux-rt-users, can you guys help?] >> >> On Wed, 17 Feb 2016 08:22:13 +0100, Sean Nyekjær wrote: >>> On 2016-02-16 22:02, Jakub Kici??ski wrote: >>>> On Tue, 16 Feb 2016 08:37:11 +0100, Sean Nyekjær wrote: >>>>> On 2016-02-15 23:08, Jakub Kicinski wrote: >>>>>> On Mon, 15 Feb 2016 08:58:53 +0100, Sean Nyekjaer wrote: >>>>>>> Working with RT patches reveals this driver have some spin_lock issues. >>>>>>> >>>>>>> spin_lock moved from sc16is7xx_reg_proc to protect to call to >>>>>>> uart_handle_cts_change as it's the only call that isn't already >>>>>>> protected in serial-core. >>>>>> Sorry but this does not look fine. You are removing all async items >>>>>> from the driver. This works for you because in RT code can sleep with >>>>>> spin locks taken but for upstream it's not acceptable. Did you test >>>>>> this patch with upstream kernel and lockdep enabled? >>>>> I have a 4.1.y with cherry-pick of the commits from tty-testing >>>>> regarding sc16is7xx. (exept the new gpio api) >>>>> On our custom board we have 3 sc16is750 on a single SPI channel. I'm >>>>> using an FTDI chip(connected to a test PC) TX wired to the 3x RX pins on >>>>> the sc16is750. >>>>> The problem is easy reproducible just by creating data from the PC at >>>>> 115200 baud, causes the kernel oops almost immediately. >>>>> >>>>> I have not seen any issues with a vanilla 4.1.y, only when i'm applying >>>>> the RT patch and loading the system. >>>> Thanks for providing the details, I think there is a misunderstanding >>>> between us. Allow me to explain again what I meant ;) >>>> >>>> I do acknowledge that there may be a problem with the driver on RT >>>> kernels and I do trust you that you tested your patch on RT kernel >>>> with latest version of the driver cherry-picked from mainline and it >>>> works there. >>>> >>>> However, IIUC you are proposing your patch to be included in mainline. >>>> >>>> What I'm trying to say is that in mainline we need the async works >>>> which you remove in your patch because we cannot perform SPI/I2C >>>> blocking I/O under a spinlock... If you compile mainline with your >>>> patch you will see a slew of "sleeping in atomic context" warnings. >>>> >>>> So basically the questions are do you want this patch in mainline and >>>> have you tested it on non-RT kernels? (Please don't be discouraged - >>>> there definitely must be a bug somewhere if you're seeing crashes, but >>>> I'm afraid the proposed solutions is not good enough...) >>>> >>>> Thanks >>> Yeah the first priority must be for the driver to work on the mainline >>> kernel :-) >>> >>> I just wanted to flag a problem with the driver with the RT patches >>> applied. I thought >>> RT patch (maybe) revaled that the driver have underlying problem. >>> >>> So maybe this patch should be included in the RT patch? >>> No i have not tested this with a non-RT kernel, maybe i should try :-) >> I've looked at the stack dump you posted in previous email. It looks >> like we deadlock on the on spinlock_irqsave() in queue_kthread_work() >> because our interrupt is not threaded!? I thought all interrupts are >> threaded in RT by default. > I believe the actual problem is that the handler is being registered w/ > IRQF_ONESHOT. I'm not sure that even makes sense in this scenario, but > is perhaps an unintended carry over from request_threaded_irq -> > request_irq transition in 9e6f4ca3 ("sc16is7xx: use kthread_worker for > tx_work and irq"). > > Sean- > > Does this solve your problem? > > Josh > > diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c > index e78fa99..1bb5c0e 100644 > --- a/drivers/tty/serial/sc16is7xx.c > +++ b/drivers/tty/serial/sc16is7xx.c > @@ -1257,7 +1257,7 @@ static int sc16is7xx_probe(struct device *dev, > > /* Setup interrupt */ > ret = devm_request_irq(dev, irq, sc16is7xx_irq, > - IRQF_ONESHOT | flags, dev_name(dev), s); > + flags, dev_name(dev), s); > if (!ret) > return 0; > Wow, nice that you CC'd the RT guys :-) Josh that was it, it runs without deadlocks now :-D Thank you. Jakub, is this fix ok? Who is making this into a PATCH? -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] sc16is7xx: switch to threaded irq for fixing RT issues 2016-02-18 7:29 ` Sean Nyekjær @ 2016-02-18 8:46 ` Jakub Kicinski 2016-02-18 17:26 ` [PATCH] sc16is7xx: drop bogus use of IRQF_ONESHOT Josh Cartwright 0 siblings, 1 reply; 9+ messages in thread From: Jakub Kicinski @ 2016-02-18 8:46 UTC (permalink / raw) To: Sean Nyekjær; +Cc: Josh Cartwright, linux-serial, kubakici, linux-rt-users On Thu, 18 Feb 2016 08:29:40 +0100, Sean Nyekjær wrote: > > diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c > > index e78fa99..1bb5c0e 100644 > > --- a/drivers/tty/serial/sc16is7xx.c > > +++ b/drivers/tty/serial/sc16is7xx.c > > @@ -1257,7 +1257,7 @@ static int sc16is7xx_probe(struct device *dev, > > > > /* Setup interrupt */ > > ret = devm_request_irq(dev, irq, sc16is7xx_irq, > > - IRQF_ONESHOT | flags, dev_name(dev), s); > > + flags, dev_name(dev), s); > > if (!ret) > > return 0; > > > Wow, nice that you CC'd the RT guys :-) > > Josh that was it, it runs without deadlocks now :-D > > Thank you. > > Jakub, is this fix ok? Who is making this into a PATCH? Yes, in fact similar patch circulated at some point but the author flaked out before it was ready for merging. Josh could you submit officially? Please CC stable for 4.1+. Fixes: 9e6f4ca3e567 ("sc16is7xx: use kthread_worker for tx_work and irq") Thanks a lot! -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] sc16is7xx: drop bogus use of IRQF_ONESHOT 2016-02-18 8:46 ` Jakub Kicinski @ 2016-02-18 17:26 ` Josh Cartwright 2016-03-08 12:43 ` Sean Nyekjær 2016-09-09 13:14 ` Sebastian Andrzej Siewior 0 siblings, 2 replies; 9+ messages in thread From: Josh Cartwright @ 2016-02-18 17:26 UTC (permalink / raw) To: Jakub Kicinski Cc: linux-serial, linux-rt-users, Sean Nyekjaer, linux-kernel, stable The use of IRQF_ONESHOT when registering an interrupt handler with request_irq() is non-sensical. Not only that, it also prevents the handler from being threaded when it otherwise should be w/ IRQ_FORCED_THREADING is enabled. This causes the following deadlock observed by Sean Nyekjaer on -rt: Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM [..] rt_spin_lock_slowlock from (queue_kthread_work+0x18/0x74) queue_kthread_work) from (sc16is7xx_irq+0x10/0x18 [sc16is7xx]) sc16is7xx_irq [sc16is7xx]) from (handle_irq_event_percpu+0x70/0x158) handle_irq_event_percpu) from (handle_irq_event+0x68/0xa8) handle_irq_event) from (handle_level_irq+0x10c/0x184) handle_level_irq) from (generic_handle_irq+0x2c/0x3c) generic_handle_irq) from (mxc_gpio_irq_handler+0x3c/0x108) mxc_gpio_irq_handler) from (mx3_gpio_irq_handler+0x80/0xcc) mx3_gpio_irq_handler) from (generic_handle_irq+0x2c/0x3c) generic_handle_irq) from (__handle_domain_irq+0x7c/0xe8) __handle_domain_irq) from (gic_handle_irq+0x24/0x5c) gic_handle_irq) from (__irq_svc+0x40/0x88) (__irq_svc) from (rt_spin_unlock+0x1c/0x68) (rt_spin_unlock) from (kthread_worker_fn+0x104/0x17c) (kthread_worker_fn) from (kthread+0xd0/0xe8) (kthread) from (ret_from_fork+0x14/0x2c) Reported-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk> Fixes: 9e6f4ca3e567 ("sc16is7xx: use kthread_worker for tx_work and irq") Cc: stable@vger.kernel.org # v4.1+ Signed-off-by: Josh Cartwright <joshc@ni.com> --- drivers/tty/serial/sc16is7xx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c index e78fa99..1bb5c0e 100644 --- a/drivers/tty/serial/sc16is7xx.c +++ b/drivers/tty/serial/sc16is7xx.c @@ -1257,7 +1257,7 @@ static int sc16is7xx_probe(struct device *dev, /* Setup interrupt */ ret = devm_request_irq(dev, irq, sc16is7xx_irq, - IRQF_ONESHOT | flags, dev_name(dev), s); + flags, dev_name(dev), s); if (!ret) return 0; -- 2.7.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] sc16is7xx: drop bogus use of IRQF_ONESHOT 2016-02-18 17:26 ` [PATCH] sc16is7xx: drop bogus use of IRQF_ONESHOT Josh Cartwright @ 2016-03-08 12:43 ` Sean Nyekjær 2016-09-09 13:14 ` Sebastian Andrzej Siewior 1 sibling, 0 replies; 9+ messages in thread From: Sean Nyekjær @ 2016-03-08 12:43 UTC (permalink / raw) To: Josh Cartwright, Jakub Kicinski Cc: linux-serial, linux-rt-users, linux-kernel, stable On 2016-02-18 18:26, Josh Cartwright wrote: > The use of IRQF_ONESHOT when registering an interrupt handler with > request_irq() is non-sensical. > > Not only that, it also prevents the handler from being threaded when it > otherwise should be w/ IRQ_FORCED_THREADING is enabled. This causes the > following deadlock observed by Sean Nyekjaer on -rt: > > Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM > [..] > rt_spin_lock_slowlock from (queue_kthread_work+0x18/0x74) > queue_kthread_work) from (sc16is7xx_irq+0x10/0x18 [sc16is7xx]) > sc16is7xx_irq [sc16is7xx]) from (handle_irq_event_percpu+0x70/0x158) > handle_irq_event_percpu) from (handle_irq_event+0x68/0xa8) > handle_irq_event) from (handle_level_irq+0x10c/0x184) > handle_level_irq) from (generic_handle_irq+0x2c/0x3c) > generic_handle_irq) from (mxc_gpio_irq_handler+0x3c/0x108) > mxc_gpio_irq_handler) from (mx3_gpio_irq_handler+0x80/0xcc) > mx3_gpio_irq_handler) from (generic_handle_irq+0x2c/0x3c) > generic_handle_irq) from (__handle_domain_irq+0x7c/0xe8) > __handle_domain_irq) from (gic_handle_irq+0x24/0x5c) > gic_handle_irq) from (__irq_svc+0x40/0x88) > (__irq_svc) from (rt_spin_unlock+0x1c/0x68) > (rt_spin_unlock) from (kthread_worker_fn+0x104/0x17c) > (kthread_worker_fn) from (kthread+0xd0/0xe8) > (kthread) from (ret_from_fork+0x14/0x2c) > > Reported-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk> > Fixes: 9e6f4ca3e567 ("sc16is7xx: use kthread_worker for tx_work and irq") > Cc: stable@vger.kernel.org # v4.1+ > Signed-off-by: Josh Cartwright <joshc@ni.com> > --- > drivers/tty/serial/sc16is7xx.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c > index e78fa99..1bb5c0e 100644 > --- a/drivers/tty/serial/sc16is7xx.c > +++ b/drivers/tty/serial/sc16is7xx.c > @@ -1257,7 +1257,7 @@ static int sc16is7xx_probe(struct device *dev, > > /* Setup interrupt */ > ret = devm_request_irq(dev, irq, sc16is7xx_irq, > - IRQF_ONESHOT | flags, dev_name(dev), s); > + flags, dev_name(dev), s); > if (!ret) > return 0; > Is PATCH this dropped? It clearly fixes the irq when running with RT patches :-) /Sean ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] sc16is7xx: drop bogus use of IRQF_ONESHOT 2016-02-18 17:26 ` [PATCH] sc16is7xx: drop bogus use of IRQF_ONESHOT Josh Cartwright 2016-03-08 12:43 ` Sean Nyekjær @ 2016-09-09 13:14 ` Sebastian Andrzej Siewior 1 sibling, 0 replies; 9+ messages in thread From: Sebastian Andrzej Siewior @ 2016-09-09 13:14 UTC (permalink / raw) To: Josh Cartwright Cc: Jakub Kicinski, linux-serial, linux-rt-users, Sean Nyekjaer, linux-kernel On 2016-02-18 11:26:12 [-0600], Josh Cartwright wrote: > The use of IRQF_ONESHOT when registering an interrupt handler with > request_irq() is non-sensical. > > Not only that, it also prevents the handler from being threaded when it > otherwise should be w/ IRQ_FORCED_THREADING is enabled. This causes the > following deadlock observed by Sean Nyekjaer on -rt: So we this patch http://lkml.iu.edu/hypermail/linux/kernel/1602.2/02637.html sitting in -RT. Then someone posted http://www.spinics.net/lists/linux-rt-users/msg14975.html which depends on it. Before I was are of the first patch, I posted http://www.spinics.net/lists/linux-rt-users/msg14666.html and nobody seems to care if anything of this gets merged upstream where it belongs. Could some please carry this to Greg upstream? Sebastian ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] sc16is7xx: Drop bogus use of IRQF_ONESHOT @ 2016-10-03 15:14 Julia Cartwright 2016-10-13 0:48 ` Jakub Kicinski 0 siblings, 1 reply; 9+ messages in thread From: Julia Cartwright @ 2016-10-03 15:14 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: linux-kernel, Sean Nyekjaer, Josh Cartwright, linux-rt-users, Jakub Kicinski, stable, linux-serial, Sebastian Andrzej Siewior The use of IRQF_ONESHOT when registering an interrupt handler with request_irq() is non-sensical. Not only that, it also prevents the handler from being threaded when it otherwise should be w/ IRQ_FORCED_THREADING is enabled. This causes the following deadlock observed by Sean Nyekjaer on -rt: Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM [..] rt_spin_lock_slowlock from queue_kthread_work queue_kthread_work from sc16is7xx_irq sc16is7xx_irq [sc16is7xx] from handle_irq_event_percpu handle_irq_event_percpu from handle_irq_event handle_irq_event from handle_level_irq handle_level_irq from generic_handle_irq generic_handle_irq from mxc_gpio_irq_handler mxc_gpio_irq_handler from mx3_gpio_irq_handler mx3_gpio_irq_handler from generic_handle_irq generic_handle_irq from __handle_domain_irq __handle_domain_irq from gic_handle_irq gic_handle_irq from __irq_svc __irq_svc from rt_spin_unlock rt_spin_unlock from kthread_worker_fn kthread_worker_fn from kthread kthread from ret_from_fork Fixes: 9e6f4ca3e567 ("sc16is7xx: use kthread_worker for tx_work and irq") Reported-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk> Signed-off-by: Josh Cartwright <joshc@ni.com> Cc: linux-rt-users@vger.kernel.org Cc: Jakub Kicinski <moorray3@wp.pl> Cc: stable@vger.kernel.org Cc: linux-serial@vger.kernel.org Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Julia Cartwright <julia@ni.com> --- This patch has been in -rt for a little while, but it isn't -rt specific. See relevant bug report thread which starts here: http://www.spinics.net/lists/linux-rt-users/msg14620.html drivers/tty/serial/sc16is7xx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c index a9d94f7..96a374d 100644 --- a/drivers/tty/serial/sc16is7xx.c +++ b/drivers/tty/serial/sc16is7xx.c @@ -1260,7 +1260,7 @@ static int sc16is7xx_probe(struct device *dev, /* Setup interrupt */ ret = devm_request_irq(dev, irq, sc16is7xx_irq, - IRQF_ONESHOT | flags, dev_name(dev), s); + flags, dev_name(dev), s); if (!ret) return 0; -- 2.9.3 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] sc16is7xx: Drop bogus use of IRQF_ONESHOT 2016-10-03 15:14 [PATCH] sc16is7xx: Drop " Julia Cartwright @ 2016-10-13 0:48 ` Jakub Kicinski 0 siblings, 0 replies; 9+ messages in thread From: Jakub Kicinski @ 2016-10-13 0:48 UTC (permalink / raw) To: Julia Cartwright Cc: Greg Kroah-Hartman, linux-kernel, Sean Nyekjaer, Josh Cartwright, linux-rt-users, stable, linux-serial, Sebastian Andrzej Siewior On Mon, 3 Oct 2016 10:14:32 -0500, Julia Cartwright wrote: > The use of IRQF_ONESHOT when registering an interrupt handler with > request_irq() is non-sensical. > ... Disabling the irq when kthread is pending as mentioned here: http://www.spinics.net/lists/linux-rt-users/msg14673.html could be beneficial but this is definitely a step in the right direction! Acked-by: Jakub Kicinski <kubakici@wp.pl> Thanks a lot Julia! ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-10-13 0:48 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <1455523133-29895-1-git-send-email-sean.nyekjaer@prevas.dk> [not found] ` <20160215220810.11529101@laptop> [not found] ` <56C2D1A7.1030908@prevas.dk> [not found] ` <20160216210250.34c2f07f@laptop> [not found] ` <56C41FA5.4010605@prevas.dk> 2016-02-17 21:54 ` [PATCH] sc16is7xx: switch to threaded irq for fixing RT issues Jakub Kicinski 2016-02-17 23:25 ` Josh Cartwright 2016-02-18 7:29 ` Sean Nyekjær 2016-02-18 8:46 ` Jakub Kicinski 2016-02-18 17:26 ` [PATCH] sc16is7xx: drop bogus use of IRQF_ONESHOT Josh Cartwright 2016-03-08 12:43 ` Sean Nyekjær 2016-09-09 13:14 ` Sebastian Andrzej Siewior 2016-10-03 15:14 [PATCH] sc16is7xx: Drop " Julia Cartwright 2016-10-13 0:48 ` Jakub Kicinski
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).