* [PATCH RT 3.18] ring-buffer: Mark irq_work as HARD_IRQ to prevent deadlocks
From: Jan Kiszka @ 2015-04-16 14:06 UTC
To: Sebastian Andrzej Siewior, RT; +Cc: Linux Kernel Mailing List, Steven Rostedt

ftrace may trigger rb_wakeups while holding pi_lock which will also be
requested via trace_...->...->ring_buffer_unlock_commit->...->
irq_work_queue->raise_softirq->try_to_wake_up. This quickly causes
deadlocks when trying to use ftrace under -rt.

Resolve this by marking the ring buffer's irq_work as HARD_IRQ.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---

I'm not yet sure if this doesn't push work into hard-irq context that
is better not done there on -rt.

I'm also not sure if there aren't more such cases, given that -rt turns
the default irq_work wakeup policy around. But maybe we are lucky.

 kernel/trace/ring_buffer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index f4fbbfc..6a1939e 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1252,6 +1252,7 @@ rb_allocate_cpu_buffer(struct ring_buffer *buffer, int nr_pages, int cpu)
 	init_irq_work(&cpu_buffer->irq_work.work, rb_wake_up_waiters);
 	init_waitqueue_head(&cpu_buffer->irq_work.waiters);
 	init_waitqueue_head(&cpu_buffer->irq_work.full_waiters);
+	cpu_buffer->irq_work.work.flags = IRQ_WORK_HARD_IRQ;
 
 	bpage = kzalloc_node(ALIGN(sizeof(*bpage), cache_line_size()),
 			     GFP_KERNEL, cpu_to_node(cpu));
* Re: [PATCH RT 3.18] ring-buffer: Mark irq_work as HARD_IRQ to prevent deadlocks 2015-04-16 14:06 [PATCH RT 3.18] ring-buffer: Mark irq_work as HARD_IRQ to prevent deadlocks Jan Kiszka @ 2015-04-16 14:12 ` Steven Rostedt 2015-04-16 14:26 ` Sebastian Andrzej Siewior 1 sibling, 0 replies; 26+ messages in thread From: Steven Rostedt @ 2015-04-16 14:12 UTC (permalink / raw) To: Jan Kiszka; +Cc: Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On Thu, 16 Apr 2015 16:06:54 +0200 Jan Kiszka <jan.kiszka@siemens.com> wrote: > ftrace may trigger rb_wakeups while holding pi_lock which will also be > requested via trace_...->...->ring_buffer_unlock_commit->...-> > irq_work_queue->raise_softirq->try_to_wake_up. This quickly causes > deadlocks when trying to use ftrace under -rt. > > Resolve this by marking the ring buffer's irq_work as HARD_IRQ. Hmm, I could have sworn I wrote a patch like this not too long ago. I'll have to check that out. Thanks, -- Steve > > Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> > --- > > I'm not yet sure if this doesn't push work into hard-irq context that > is better not done there on -rt. > > I'm also not sure if there aren't more such cases, given that -rt turns > the default irq_work wakeup policy around. But maybe we are lucky. > > kernel/trace/ring_buffer.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c > index f4fbbfc..6a1939e 100644 > --- a/kernel/trace/ring_buffer.c > +++ b/kernel/trace/ring_buffer.c > @@ -1252,6 +1252,7 @@ rb_allocate_cpu_buffer(struct ring_buffer *buffer, int nr_pages, int cpu) > init_irq_work(&cpu_buffer->irq_work.work, rb_wake_up_waiters); > init_waitqueue_head(&cpu_buffer->irq_work.waiters); > init_waitqueue_head(&cpu_buffer->irq_work.full_waiters); > + cpu_buffer->irq_work.work.flags = IRQ_WORK_HARD_IRQ; > > bpage = kzalloc_node(ALIGN(sizeof(*bpage), cache_line_size()), > GFP_KERNEL, cpu_to_node(cpu)); ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] ring-buffer: Mark irq_work as HARD_IRQ to prevent deadlocks 2015-04-16 14:06 [PATCH RT 3.18] ring-buffer: Mark irq_work as HARD_IRQ to prevent deadlocks Jan Kiszka 2015-04-16 14:12 ` Steven Rostedt @ 2015-04-16 14:26 ` Sebastian Andrzej Siewior 2015-04-16 14:28 ` Jan Kiszka 1 sibling, 1 reply; 26+ messages in thread From: Sebastian Andrzej Siewior @ 2015-04-16 14:26 UTC (permalink / raw) To: Jan Kiszka, RT; +Cc: Linux Kernel Mailing List, Steven Rostedt On 04/16/2015 04:06 PM, Jan Kiszka wrote: > ftrace may trigger rb_wakeups while holding pi_lock which will also be > requested via trace_...->...->ring_buffer_unlock_commit->...-> > irq_work_queue->raise_softirq->try_to_wake_up. This quickly causes > deadlocks when trying to use ftrace under -rt. > > Resolve this by marking the ring buffer's irq_work as HARD_IRQ. > > Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> > --- > > I'm not yet sure if this doesn't push work into hard-irq context that > is better not done there on -rt. everything should be done in the soft-irq. > > I'm also not sure if there aren't more such cases, given that -rt turns > the default irq_work wakeup policy around. But maybe we are lucky. The only thing that is getting done in the hardirq is the FULL_NO_HZ thingy. I would be _very_ glad if we could keep it that way. > kernel/trace/ring_buffer.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c > index f4fbbfc..6a1939e 100644 > --- a/kernel/trace/ring_buffer.c > +++ b/kernel/trace/ring_buffer.c > @@ -1252,6 +1252,7 @@ rb_allocate_cpu_buffer(struct ring_buffer *buffer, int nr_pages, int cpu) > init_irq_work(&cpu_buffer->irq_work.work, rb_wake_up_waiters); > init_waitqueue_head(&cpu_buffer->irq_work.waiters); > init_waitqueue_head(&cpu_buffer->irq_work.full_waiters); > + cpu_buffer->irq_work.work.flags = IRQ_WORK_HARD_IRQ; > > bpage = kzalloc_node(ALIGN(sizeof(*bpage), cache_line_size()), > GFP_KERNEL, cpu_to_node(cpu)); > Sebastian ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] ring-buffer: Mark irq_work as HARD_IRQ to prevent deadlocks 2015-04-16 14:26 ` Sebastian Andrzej Siewior @ 2015-04-16 14:28 ` Jan Kiszka 2015-04-16 14:57 ` Sebastian Andrzej Siewior 2015-04-16 15:10 ` Steven Rostedt 0 siblings, 2 replies; 26+ messages in thread From: Jan Kiszka @ 2015-04-16 14:28 UTC (permalink / raw) To: Sebastian Andrzej Siewior, RT; +Cc: Linux Kernel Mailing List, Steven Rostedt On 2015-04-16 16:26, Sebastian Andrzej Siewior wrote: > On 04/16/2015 04:06 PM, Jan Kiszka wrote: >> ftrace may trigger rb_wakeups while holding pi_lock which will also be >> requested via trace_...->...->ring_buffer_unlock_commit->...-> >> irq_work_queue->raise_softirq->try_to_wake_up. This quickly causes >> deadlocks when trying to use ftrace under -rt. >> >> Resolve this by marking the ring buffer's irq_work as HARD_IRQ. >> >> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> >> --- >> >> I'm not yet sure if this doesn't push work into hard-irq context that >> is better not done there on -rt. > > everything should be done in the soft-irq. > >> >> I'm also not sure if there aren't more such cases, given that -rt turns >> the default irq_work wakeup policy around. But maybe we are lucky. > > The only thing that is getting done in the hardirq is the FULL_NO_HZ > thingy. I would be _very_ glad if we could keep it that way. Then - to my current understanding - we need an NMI-safe trigger for soft-irq work. Is there anything like this existing already? Or can we still use the IPI-based kick without actually doing the work in hard-irq context? Jan > >> kernel/trace/ring_buffer.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c >> index f4fbbfc..6a1939e 100644 >> --- a/kernel/trace/ring_buffer.c >> +++ b/kernel/trace/ring_buffer.c >> @@ -1252,6 +1252,7 @@ rb_allocate_cpu_buffer(struct ring_buffer *buffer, int nr_pages, int cpu) >> init_irq_work(&cpu_buffer->irq_work.work, rb_wake_up_waiters); >> init_waitqueue_head(&cpu_buffer->irq_work.waiters); >> init_waitqueue_head(&cpu_buffer->irq_work.full_waiters); >> + cpu_buffer->irq_work.work.flags = IRQ_WORK_HARD_IRQ; >> >> bpage = kzalloc_node(ALIGN(sizeof(*bpage), cache_line_size()), >> GFP_KERNEL, cpu_to_node(cpu)); >> > > Sebastian > -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] ring-buffer: Mark irq_work as HARD_IRQ to prevent deadlocks 2015-04-16 14:28 ` Jan Kiszka @ 2015-04-16 14:57 ` Sebastian Andrzej Siewior 2015-04-16 15:31 ` Jan Kiszka 2015-04-16 15:10 ` Steven Rostedt 1 sibling, 1 reply; 26+ messages in thread From: Sebastian Andrzej Siewior @ 2015-04-16 14:57 UTC (permalink / raw) To: Jan Kiszka, RT; +Cc: Linux Kernel Mailing List, Steven Rostedt On 04/16/2015 04:28 PM, Jan Kiszka wrote: > On 2015-04-16 16:26, Sebastian Andrzej Siewior wrote: >> On 04/16/2015 04:06 PM, Jan Kiszka wrote: >>> ftrace may trigger rb_wakeups while holding pi_lock which will also be >>> requested via trace_...->...->ring_buffer_unlock_commit->...-> >>> irq_work_queue->raise_softirq->try_to_wake_up. This quickly causes >>> deadlocks when trying to use ftrace under -rt. >>> >>> Resolve this by marking the ring buffer's irq_work as HARD_IRQ. >>> >>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> >>> --- >>> >>> I'm not yet sure if this doesn't push work into hard-irq context that >>> is better not done there on -rt. >> >> everything should be done in the soft-irq. >> >>> >>> I'm also not sure if there aren't more such cases, given that -rt turns >>> the default irq_work wakeup policy around. But maybe we are lucky. >> >> The only thing that is getting done in the hardirq is the FULL_NO_HZ >> thingy. I would be _very_ glad if we could keep it that way. > > Then - to my current understanding - we need an NMI-safe trigger for > soft-irq work. Is there anything like this existing already? Or can we > still use the IPI-based kick without actually doing the work in hard-irq > context? But if you trigger it via IPI it will still run in hardirq context, right? Can you describe how run into this and try to think about it in a quiet moment. It it just enabling the function tracer and running it? > Jan Sebastian ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] ring-buffer: Mark irq_work as HARD_IRQ to prevent deadlocks 2015-04-16 14:57 ` Sebastian Andrzej Siewior @ 2015-04-16 15:31 ` Jan Kiszka 0 siblings, 0 replies; 26+ messages in thread From: Jan Kiszka @ 2015-04-16 15:31 UTC (permalink / raw) To: Sebastian Andrzej Siewior, RT; +Cc: Linux Kernel Mailing List, Steven Rostedt On 2015-04-16 16:57, Sebastian Andrzej Siewior wrote: > On 04/16/2015 04:28 PM, Jan Kiszka wrote: >> On 2015-04-16 16:26, Sebastian Andrzej Siewior wrote: >>> On 04/16/2015 04:06 PM, Jan Kiszka wrote: >>>> ftrace may trigger rb_wakeups while holding pi_lock which will also be >>>> requested via trace_...->...->ring_buffer_unlock_commit->...-> >>>> irq_work_queue->raise_softirq->try_to_wake_up. This quickly causes >>>> deadlocks when trying to use ftrace under -rt. >>>> >>>> Resolve this by marking the ring buffer's irq_work as HARD_IRQ. >>>> >>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> >>>> --- >>>> >>>> I'm not yet sure if this doesn't push work into hard-irq context that >>>> is better not done there on -rt. >>> >>> everything should be done in the soft-irq. >>> >>>> >>>> I'm also not sure if there aren't more such cases, given that -rt turns >>>> the default irq_work wakeup policy around. But maybe we are lucky. >>> >>> The only thing that is getting done in the hardirq is the FULL_NO_HZ >>> thingy. I would be _very_ glad if we could keep it that way. >> >> Then - to my current understanding - we need an NMI-safe trigger for >> soft-irq work. Is there anything like this existing already? Or can we >> still use the IPI-based kick without actually doing the work in hard-irq >> context? > > But if you trigger it via IPI it will still run in hardirq context, > right? Can you describe how run into this and try to think about it in > a quiet moment. It it just enabling the function tracer and running it? # trace-cmd record -e sched /sys/kernel/debug/tracing/events/sched/filter /sys/kernel/debug/tracing/events/*/sched/filter Hit Ctrl^C to stop recording ^C and you are dead(locked). Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] ring-buffer: Mark irq_work as HARD_IRQ to prevent deadlocks 2015-04-16 14:28 ` Jan Kiszka 2015-04-16 14:57 ` Sebastian Andrzej Siewior @ 2015-04-16 15:10 ` Steven Rostedt 2015-04-16 15:29 ` Jan Kiszka 1 sibling, 1 reply; 26+ messages in thread From: Steven Rostedt @ 2015-04-16 15:10 UTC (permalink / raw) To: Jan Kiszka; +Cc: Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On Thu, 16 Apr 2015 16:28:58 +0200 Jan Kiszka <jan.kiszka@siemens.com> wrote: > On 2015-04-16 16:26, Sebastian Andrzej Siewior wrote: > > On 04/16/2015 04:06 PM, Jan Kiszka wrote: > >> ftrace may trigger rb_wakeups while holding pi_lock which will also be > >> requested via trace_...->...->ring_buffer_unlock_commit->...-> > >> irq_work_queue->raise_softirq->try_to_wake_up. This quickly causes > >> deadlocks when trying to use ftrace under -rt. > >> > >> Resolve this by marking the ring buffer's irq_work as HARD_IRQ. > >> > >> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> > >> --- > >> > >> I'm not yet sure if this doesn't push work into hard-irq context that > >> is better not done there on -rt. > > > > everything should be done in the soft-irq. > > > >> > >> I'm also not sure if there aren't more such cases, given that -rt turns > >> the default irq_work wakeup policy around. But maybe we are lucky. > > > > The only thing that is getting done in the hardirq is the FULL_NO_HZ > > thingy. I would be _very_ glad if we could keep it that way. tracing is special, even more so than NO_HZ_FULL, as it also traces that as well (and even RCU). Tracing the kernel is like a debugger. Ideally, it would not be part of the kernel, but just an external observer. Without special hardware that is not the case, so we try to be outside the main system as much as possible. > > Then - to my current understanding - we need an NMI-safe trigger for > soft-irq work. Is there anything like this existing already? Or can we > still use the IPI-based kick without actually doing the work in hard-irq > context? > The reason why it uses irq_work() is because a simple wakeup can deadlock the system if called by the tracing infrastructure (as we see raise_softirq() does too). But yeah, there's no real need to have the ring buffer irq work handler run from hardirq context. The only requirement is that you can not do the raise from the irq_work_queue call. If you want to have the hardirq work handle do the raise softirq, that's fine. Perhaps that's the solution? Have all irq_work_queue() always trigger the hard irq, but the hard irq may just raise a softirq or it will call the handler directly if IRQ_WORK_HARD_IRQ is set. -- Steve ^ permalink raw reply [flat|nested] 26+ messages in thread
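A rough sketch of the dispatch Steve suggests above (illustration only, not a patch from this thread; rt_irq_work_interrupt is a hypothetical name, the list declarations are assumed to come from kernel/irq_work.c): on -rt, queuing always raises the hard irq_work interrupt, but the interrupt handler runs only the items marked IRQ_WORK_HARD_IRQ and defers everything else to the timer softirq.

	/* hypothetical -rt interrupt handler body, for illustration */
	static void rt_irq_work_interrupt(void)
	{
		/* safe in hard-irq context: these handlers take raw locks only */
		irq_work_run_list(this_cpu_ptr(&hirq_work_list));

		/* everything else is punted to softirq context */
		if (!llist_empty(this_cpu_ptr(&raised_list)))
			raise_softirq(TIMER_SOFTIRQ);
	}

Jan's follow-up patch below takes essentially this shape.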
* Re: [PATCH RT 3.18] ring-buffer: Mark irq_work as HARD_IRQ to prevent deadlocks 2015-04-16 15:10 ` Steven Rostedt @ 2015-04-16 15:29 ` Jan Kiszka 2015-04-16 15:33 ` Sebastian Andrzej Siewior 2015-04-16 16:28 ` [PATCH RT 3.18] irq_work: Provide a soft-irq based queue Jan Kiszka 0 siblings, 2 replies; 26+ messages in thread From: Jan Kiszka @ 2015-04-16 15:29 UTC (permalink / raw) To: Steven Rostedt; +Cc: Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On 2015-04-16 17:10, Steven Rostedt wrote: > On Thu, 16 Apr 2015 16:28:58 +0200 > Jan Kiszka <jan.kiszka@siemens.com> wrote: > >> On 2015-04-16 16:26, Sebastian Andrzej Siewior wrote: >>> On 04/16/2015 04:06 PM, Jan Kiszka wrote: >>>> ftrace may trigger rb_wakeups while holding pi_lock which will also be >>>> requested via trace_...->...->ring_buffer_unlock_commit->...-> >>>> irq_work_queue->raise_softirq->try_to_wake_up. This quickly causes >>>> deadlocks when trying to use ftrace under -rt. >>>> >>>> Resolve this by marking the ring buffer's irq_work as HARD_IRQ. >>>> >>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> >>>> --- >>>> >>>> I'm not yet sure if this doesn't push work into hard-irq context that >>>> is better not done there on -rt. >>> >>> everything should be done in the soft-irq. >>> >>>> >>>> I'm also not sure if there aren't more such cases, given that -rt turns >>>> the default irq_work wakeup policy around. But maybe we are lucky. >>> >>> The only thing that is getting done in the hardirq is the FULL_NO_HZ >>> thingy. I would be _very_ glad if we could keep it that way. > > tracing is special, even more so than NO_HZ_FULL, as it also traces > that as well (and even RCU). Tracing the kernel is like a debugger. > Ideally, it would not be part of the kernel, but just an external > observer. Without special hardware that is not the case, so we try to > be outside the main system as much as possible. > > >> >> Then - to my current understanding - we need an NMI-safe trigger for >> soft-irq work. Is there anything like this existing already? Or can we >> still use the IPI-based kick without actually doing the work in hard-irq >> context? >> > > The reason why it uses irq_work() is because a simple wakeup can > deadlock the system if called by the tracing infrastructure (as we see > raise_softirq() does too). > > But yeah, there's no real need to have the ring buffer irq work > handler run from hardirq context. The only requirement is that you can > not do the raise from the irq_work_queue call. If you want to have the > hardirq work handle do the raise softirq, that's fine. Perhaps that's > the solution? Have all irq_work_queue() always trigger the hard irq, but > the hard irq may just raise a softirq or it will call the handler > directly if IRQ_WORK_HARD_IRQ is set. I'll play with that. My patch is definitely not OK. It causes [ 380.372579] BUG: scheduling while atomic: trace-cmd/2149/0x00010004 ... 
[ 380.372604] Call Trace:
[ 380.372610] <IRQ> [<ffffffff81607694>] dump_stack+0x50/0x9f
[ 380.372613] [<ffffffff8160413c>] __schedule_bug+0x59/0x69
[ 380.372615] [<ffffffff8160a1d5>] __schedule+0x675/0x800
[ 380.372617] [<ffffffff8160a394>] schedule+0x34/0xa0
[ 380.372619] [<ffffffff8160bf7d>] rt_spin_lock_slowlock+0xcd/0x290
[ 380.372621] [<ffffffff8160d8b5>] rt_spin_lock+0x25/0x30
[ 380.372623] [<ffffffff8108fe39>] __wake_up+0x29/0x60
[ 380.372626] [<ffffffff81106960>] rb_wake_up_waiters+0x40/0x50
[ 380.372628] [<ffffffff8112cdbf>] irq_work_run_list+0x3f/0x60
[ 380.372630] [<ffffffff8112cdf9>] irq_work_run+0x19/0x20
[ 380.372632] [<ffffffff81008409>] smp_trace_irq_work_interrupt+0x39/0x120
[ 380.372633] [<ffffffff8160f8ef>] trace_irq_work_interrupt+0x6f/0x80
[ 380.372636] <EOI> [<ffffffff8103d66d>] ? native_apic_msr_write+0x2d/0x30
[ 380.372637] [<ffffffff8103d53d>] x2apic_send_IPI_self+0x1d/0x20
[ 380.372638] [<ffffffff8100851e>] arch_irq_work_raise+0x2e/0x40
[ 380.372639] [<ffffffff8112d025>] irq_work_queue+0xc5/0xf0
[ 380.372641] [<ffffffff81107d8a>] ring_buffer_unlock_commit+0x14a/0x2e0
[ 380.372643] [<ffffffff8110f894>] trace_buffer_unlock_commit+0x24/0x60
[ 380.372644] [<ffffffff8111f9da>] ftrace_event_buffer_commit+0x8a/0xc0
[ 380.372647] [<ffffffff811c58de>] ftrace_raw_event_writeback_dirty_inode_template+0x8e/0xc0
[ 380.372648] [<ffffffff811c8b21>] __mark_inode_dirty+0x1d1/0x310
[ 380.372650] [<ffffffff811d0ec8>] generic_write_end+0x78/0xb0
[ 380.372658] [<ffffffffa021c42b>] ext4_da_write_end+0x10b/0x2f0 [ext4]
[ 380.372661] [<ffffffff8116335e>] ? pagefault_enable+0x1e/0x20
[ 380.372662] [<ffffffff8113c337>] generic_perform_write+0x107/0x1b0
[ 380.372664] [<ffffffff8113e49f>] __generic_file_write_iter+0x15f/0x350
[ 380.372668] [<ffffffffa0210c91>] ext4_file_write_iter+0x101/0x3d0 [ext4]
[ 380.372670] [<ffffffff8118f59b>] ? __kmalloc+0x16b/0x250
[ 380.372672] [<ffffffff811ca96e>] ? iter_file_splice_write+0x8e/0x430
[ 380.372673] [<ffffffff811ca96e>] ? iter_file_splice_write+0x8e/0x430
[ 380.372674] [<ffffffff811cab35>] iter_file_splice_write+0x255/0x430
[ 380.372676] [<ffffffff811cc474>] SyS_splice+0x214/0x760
[ 380.372677] [<ffffffff81011fe7>] ? syscall_trace_enter_phase2+0xa7/0x1e0
[ 380.372679] [<ffffffff8160e266>] tracesys_phase2+0xd4/0xd9

Jan
* Re: [PATCH RT 3.18] ring-buffer: Mark irq_work as HARD_IRQ to prevent deadlocks 2015-04-16 15:29 ` Jan Kiszka @ 2015-04-16 15:33 ` Sebastian Andrzej Siewior 2015-04-16 16:28 ` [PATCH RT 3.18] irq_work: Provide a soft-irq based queue Jan Kiszka 1 sibling, 0 replies; 26+ messages in thread From: Sebastian Andrzej Siewior @ 2015-04-16 15:33 UTC (permalink / raw) To: Jan Kiszka, Steven Rostedt; +Cc: RT, Linux Kernel Mailing List On 04/16/2015 05:29 PM, Jan Kiszka wrote: > My patch is definitely not OK. It causes > > [ 380.372579] BUG: scheduling while atomic: trace-cmd/2149/0x00010004 > ... > [ 380.372604] Call Trace: > [ 380.372610] <IRQ> [<ffffffff81607694>] dump_stack+0x50/0x9f > [ 380.372613] [<ffffffff8160413c>] __schedule_bug+0x59/0x69 > [ 380.372615] [<ffffffff8160a1d5>] __schedule+0x675/0x800 > [ 380.372617] [<ffffffff8160a394>] schedule+0x34/0xa0 > [ 380.372619] [<ffffffff8160bf7d>] rt_spin_lock_slowlock+0xcd/0x290 > [ 380.372621] [<ffffffff8160d8b5>] rt_spin_lock+0x25/0x30 > [ 380.372623] [<ffffffff8108fe39>] __wake_up+0x29/0x60 right. you must not take any sleeping locks in the hardirq context :) This works for the no-hz-work thingy. It grabs raw locks and may cancel one hrtimer which is marked irqsafe. Sebastian ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-16 15:29 ` Jan Kiszka 2015-04-16 15:33 ` Sebastian Andrzej Siewior @ 2015-04-16 16:28 ` Jan Kiszka 2015-04-20 8:03 ` Mike Galbraith 1 sibling, 1 reply; 26+ messages in thread From: Jan Kiszka @ 2015-04-16 16:28 UTC (permalink / raw) To: Steven Rostedt, Sebastian Andrzej Siewior; +Cc: RT, Linux Kernel Mailing List Instead of turning all irq_work requests into lazy ones on -rt, just move their execution from hard into soft-irq context. This resolves deadlocks of ftrace which will queue work from arbitrary contexts, including those that have locks held that are needed for raising a soft-irq. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> --- Second try, looks much better so far. And it also removes my concerns regarding other potential cases besides ftrace. kernel/irq_work.c | 26 ++++++++++---------------- 1 file changed, 10 insertions(+), 16 deletions(-) diff --git a/kernel/irq_work.c b/kernel/irq_work.c index 9dda38a..3f6ffcd 100644 --- a/kernel/irq_work.c +++ b/kernel/irq_work.c @@ -85,12 +85,9 @@ bool irq_work_queue_on(struct irq_work *work, int cpu) raise_irqwork = llist_add(&work->llnode, &per_cpu(hirq_work_list, cpu)); else - raise_irqwork = llist_add(&work->llnode, - &per_cpu(lazy_list, cpu)); -#else +#endif raise_irqwork = llist_add(&work->llnode, &per_cpu(raised_list, cpu)); -#endif if (raise_irqwork) arch_send_call_function_single_ipi(cpu); @@ -114,21 +111,20 @@ bool irq_work_queue(struct irq_work *work) if (work->flags & IRQ_WORK_HARD_IRQ) { if (llist_add(&work->llnode, this_cpu_ptr(&hirq_work_list))) arch_irq_work_raise(); - } else { + } else +#endif + if (work->flags & IRQ_WORK_LAZY) { if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && tick_nohz_tick_stopped()) +#ifdef CONFIG_PREEMPT_RT_FULL raise_softirq(TIMER_SOFTIRQ); - } #else - if (work->flags & IRQ_WORK_LAZY) { - if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && - tick_nohz_tick_stopped()) arch_irq_work_raise(); +#endif } else { if (llist_add(&work->llnode, this_cpu_ptr(&raised_list))) arch_irq_work_raise(); } -#endif preempt_enable(); @@ -202,6 +198,8 @@ void irq_work_run(void) { #ifdef CONFIG_PREEMPT_RT_FULL irq_work_run_list(this_cpu_ptr(&hirq_work_list)); + if (!llist_empty(this_cpu_ptr(&raised_list))) + raise_softirq(TIMER_SOFTIRQ); #else irq_work_run_list(this_cpu_ptr(&raised_list)); irq_work_run_list(this_cpu_ptr(&lazy_list)); @@ -211,15 +209,11 @@ EXPORT_SYMBOL_GPL(irq_work_run); void irq_work_tick(void) { -#ifdef CONFIG_PREEMPT_RT_FULL - irq_work_run_list(this_cpu_ptr(&lazy_list)); -#else - struct llist_head *raised = &__get_cpu_var(raised_list); + struct llist_head *raised = this_cpu_ptr(&raised_list); if (!llist_empty(raised) && !arch_irq_work_has_interrupt()) irq_work_run_list(raised); - irq_work_run_list(&__get_cpu_var(lazy_list)); -#endif + irq_work_run_list(this_cpu_ptr(&lazy_list)); } /* -- 2.1.4 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-16 16:28 ` [PATCH RT 3.18] irq_work: Provide a soft-irq based queue Jan Kiszka @ 2015-04-20 8:03 ` Mike Galbraith 2015-04-23 6:11 ` Mike Galbraith 0 siblings, 1 reply; 26+ messages in thread From: Mike Galbraith @ 2015-04-20 8:03 UTC (permalink / raw) To: Jan Kiszka Cc: Steven Rostedt, Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On Thu, 2015-04-16 at 18:28 +0200, Jan Kiszka wrote: > Instead of turning all irq_work requests into lazy ones on -rt, just > move their execution from hard into soft-irq context. > > This resolves deadlocks of ftrace which will queue work from > arbitrary > contexts, including those that have locks held that are needed for > raising a soft-irq. Yup, trace-cmd record -> dead-box fully repeatable, and now fixed. > Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> > --- > > Second try, looks much better so far. And it also removes my concerns > regarding other potential cases besides ftrace. > > kernel/irq_work.c | 26 ++++++++++---------------- > 1 file changed, 10 insertions(+), 16 deletions(-) > > diff --git a/kernel/irq_work.c b/kernel/irq_work.c > index 9dda38a..3f6ffcd 100644 > --- a/kernel/irq_work.c > +++ b/kernel/irq_work.c > @@ -85,12 +85,9 @@ bool irq_work_queue_on(struct irq_work *work, int > cpu) > raise_irqwork = llist_add(&work->llnode, > &per_cpu(hirq_work_list, cpu)); > else > - raise_irqwork = llist_add(&work->llnode, > - &per_cpu(lazy_list, cpu)); > -#else > +#endif > raise_irqwork = llist_add(&work->llnode, > &per_cpu(raised_list, cpu)); > -#endif > > if (raise_irqwork) > arch_send_call_function_single_ipi(cpu); > @@ -114,21 +111,20 @@ bool irq_work_queue(struct irq_work *work) > if (work->flags & IRQ_WORK_HARD_IRQ) { > if (llist_add(&work->llnode, this_cpu_ptr(&hirq_work_list))) > arch_irq_work_raise(); > - } else { > + } else > +#endif > + if (work->flags & IRQ_WORK_LAZY) { > if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && > tick_nohz_tick_stopped()) > +#ifdef CONFIG_PREEMPT_RT_FULL > raise_softirq(TIMER_SOFTIRQ); > - } > #else > - if (work->flags & IRQ_WORK_LAZY) { > - if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && > - tick_nohz_tick_stopped()) > arch_irq_work_raise(); > +#endif > } else { > if (llist_add(&work->llnode, this_cpu_ptr(&raised_list))) > arch_irq_work_raise(); > } > -#endif > > preempt_enable(); > > @@ -202,6 +198,8 @@ void irq_work_run(void) > { > #ifdef CONFIG_PREEMPT_RT_FULL > irq_work_run_list(this_cpu_ptr(&hirq_work_list)); > + if (!llist_empty(this_cpu_ptr(&raised_list))) > + raise_softirq(TIMER_SOFTIRQ); > #else > irq_work_run_list(this_cpu_ptr(&raised_list)); > irq_work_run_list(this_cpu_ptr(&lazy_list)); > @@ -211,15 +209,11 @@ EXPORT_SYMBOL_GPL(irq_work_run); > > void irq_work_tick(void) > { > -#ifdef CONFIG_PREEMPT_RT_FULL > - irq_work_run_list(this_cpu_ptr(&lazy_list)); > -#else > - struct llist_head *raised = &__get_cpu_var(raised_list); > + struct llist_head *raised = this_cpu_ptr(&raised_list); > > if (!llist_empty(raised) && !arch_irq_work_has_interrupt()) > irq_work_run_list(raised); > - irq_work_run_list(&__get_cpu_var(lazy_list)); > -#endif > + irq_work_run_list(this_cpu_ptr(&lazy_list)); > } > > /* ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-20 8:03 ` Mike Galbraith @ 2015-04-23 6:11 ` Mike Galbraith 2015-04-23 6:29 ` Jan Kiszka 2015-04-23 6:50 ` Jan Kiszka 0 siblings, 2 replies; 26+ messages in thread From: Mike Galbraith @ 2015-04-23 6:11 UTC (permalink / raw) To: Jan Kiszka Cc: Steven Rostedt, Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On Mon, 2015-04-20 at 10:03 +0200, Mike Galbraith wrote: > On Thu, 2015-04-16 at 18:28 +0200, Jan Kiszka wrote: > > Instead of turning all irq_work requests into lazy ones on -rt, > > just > > move their execution from hard into soft-irq context. > > > > This resolves deadlocks of ftrace which will queue work from > > arbitrary > > contexts, including those that have locks held that are needed for > > raising a soft-irq. > > Yup, trace-cmd record -> dead-box fully repeatable, and now fixed. Except you kinda forgot to run the raised list. The reformatted (which saved two whole lines;) patch below adds that to irq_work_tick(), which fixes the livelock both powertop and perf top otherwise meet. Subject: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue Date: Thu, 16 Apr 2015 18:28:16 +0200 From: Jan Kiszka <jan.kiszka@siemens.com> Instead of turning all irq_work requests into lazy ones on -rt, just move their execution from hard into soft-irq context. This resolves deadlocks of ftrace which will queue work from arbitrary contexts, including those that have locks held that are needed for raising a soft-irq. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> --- Second try, looks much better so far. And it also removes my concerns regarding other potential cases besides ftrace. kernel/irq_work.c | 84 ++++++++++++++++++++++++++---------------------------- 1 file changed, 41 insertions(+), 43 deletions(-) --- a/kernel/irq_work.c +++ b/kernel/irq_work.c @@ -80,17 +80,12 @@ bool irq_work_queue_on(struct irq_work * if (!irq_work_claim(work)) return false; -#ifdef CONFIG_PREEMPT_RT_FULL - if (work->flags & IRQ_WORK_HARD_IRQ) + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL) && (work->flags & IRQ_WORK_HARD_IRQ)) raise_irqwork = llist_add(&work->llnode, &per_cpu(hirq_work_list, cpu)); else raise_irqwork = llist_add(&work->llnode, - &per_cpu(lazy_list, cpu)); -#else - raise_irqwork = llist_add(&work->llnode, &per_cpu(raised_list, cpu)); -#endif if (raise_irqwork) arch_send_call_function_single_ipi(cpu); @@ -103,6 +98,9 @@ EXPORT_SYMBOL_GPL(irq_work_queue_on); /* Enqueue the irq work @work on the current CPU */ bool irq_work_queue(struct irq_work *work) { + bool realtime = IS_ENABLED(CONFIG_PREEMPT_RT_FULL); + bool raise = false; + /* Only queue if not already pending */ if (!irq_work_claim(work)) return false; @@ -110,25 +108,22 @@ bool irq_work_queue(struct irq_work *wor /* Queue the entry and raise the IPI if needed. 
*/ preempt_disable(); -#ifdef CONFIG_PREEMPT_RT_FULL - if (work->flags & IRQ_WORK_HARD_IRQ) { + if (realtime && (work->flags & IRQ_WORK_HARD_IRQ)) { if (llist_add(&work->llnode, this_cpu_ptr(&hirq_work_list))) - arch_irq_work_raise(); - } else { - if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && - tick_nohz_tick_stopped()) - raise_softirq(TIMER_SOFTIRQ); - } -#else - if (work->flags & IRQ_WORK_LAZY) { + raise = 1; + } else if (work->flags & IRQ_WORK_LAZY) { if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && - tick_nohz_tick_stopped()) - arch_irq_work_raise(); - } else { - if (llist_add(&work->llnode, this_cpu_ptr(&raised_list))) - arch_irq_work_raise(); - } -#endif + tick_nohz_tick_stopped()) { + if (realtime) + raise_softirq(TIMER_SOFTIRQ); + else + raise = true; + } + } else if (llist_add(&work->llnode, this_cpu_ptr(&raised_list))) + raise = true; + + if (raise) + arch_irq_work_raise(); preempt_enable(); @@ -143,12 +138,13 @@ bool irq_work_needs_cpu(void) raised = this_cpu_ptr(&raised_list); lazy = this_cpu_ptr(&lazy_list); - if (llist_empty(raised)) - if (llist_empty(lazy)) -#ifdef CONFIG_PREEMPT_RT_FULL + if (llist_empty(raised) && llist_empty(lazy)) { + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL)) { if (llist_empty(this_cpu_ptr(&hirq_work_list))) -#endif return false; + } else + return false; + } /* All work should have been flushed before going offline */ WARN_ON_ONCE(cpu_is_offline(smp_processor_id())); @@ -162,9 +158,7 @@ static void irq_work_run_list(struct lli struct irq_work *work; struct llist_node *llnode; -#ifndef CONFIG_PREEMPT_RT_FULL - BUG_ON(!irqs_disabled()); -#endif + BUG_ON(!IS_ENABLED(CONFIG_PREEMPT_RT_FULL) && !irqs_disabled()); if (llist_empty(list)) return; @@ -200,26 +194,30 @@ static void irq_work_run_list(struct lli */ void irq_work_run(void) { -#ifdef CONFIG_PREEMPT_RT_FULL - irq_work_run_list(this_cpu_ptr(&hirq_work_list)); -#else - irq_work_run_list(this_cpu_ptr(&raised_list)); - irq_work_run_list(this_cpu_ptr(&lazy_list)); -#endif + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL)) { + irq_work_run_list(this_cpu_ptr(&hirq_work_list)); + /* + * NOTE: we raise softirq via IPI for safety, + * and execute in irq_work_tick() to move the + * overhead from hard to soft irq context. + */ + if (!llist_empty(this_cpu_ptr(&raised_list))) + raise_softirq(TIMER_SOFTIRQ); + } else { + irq_work_run_list(this_cpu_ptr(&raised_list)); + irq_work_run_list(this_cpu_ptr(&lazy_list)); + } } EXPORT_SYMBOL_GPL(irq_work_run); void irq_work_tick(void) { -#ifdef CONFIG_PREEMPT_RT_FULL - irq_work_run_list(this_cpu_ptr(&lazy_list)); -#else - struct llist_head *raised = &__get_cpu_var(raised_list); + struct llist_head *raised = this_cpu_ptr(&raised_list); - if (!llist_empty(raised) && !arch_irq_work_has_interrupt()) + if (!llist_empty(raised) && (!arch_irq_work_has_interrupt() || + IS_ENABLED(CONFIG_PREEMPT_RT_FULL))) irq_work_run_list(raised); - irq_work_run_list(&__get_cpu_var(lazy_list)); -#endif + irq_work_run_list(this_cpu_ptr(&lazy_list)); } /* ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-23 6:11 ` Mike Galbraith @ 2015-04-23 6:29 ` Jan Kiszka 2015-04-23 6:58 ` Mike Galbraith 2015-04-23 6:50 ` Jan Kiszka 1 sibling, 1 reply; 26+ messages in thread From: Jan Kiszka @ 2015-04-23 6:29 UTC (permalink / raw) To: Mike Galbraith Cc: Steven Rostedt, Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On 2015-04-23 08:11, Mike Galbraith wrote: > On Mon, 2015-04-20 at 10:03 +0200, Mike Galbraith wrote: >> On Thu, 2015-04-16 at 18:28 +0200, Jan Kiszka wrote: >>> Instead of turning all irq_work requests into lazy ones on -rt, >>> just >>> move their execution from hard into soft-irq context. >>> >>> This resolves deadlocks of ftrace which will queue work from >>> arbitrary >>> contexts, including those that have locks held that are needed for >>> raising a soft-irq. >> >> Yup, trace-cmd record -> dead-box fully repeatable, and now fixed. > > Except you kinda forgot to run the raised list. The reformatted > (which saved two whole lines;) patch below adds that to > irq_work_tick(), which fixes the livelock both powertop and perf top > otherwise meet. > > Subject: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue > Date: Thu, 16 Apr 2015 18:28:16 +0200 > From: Jan Kiszka <jan.kiszka@siemens.com> > > Instead of turning all irq_work requests into lazy ones on -rt, just > move their execution from hard into soft-irq context. > > This resolves deadlocks of ftrace which will queue work from arbitrary > contexts, including those that have locks held that are needed for > raising a soft-irq. > > Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> > --- > > Second try, looks much better so far. And it also removes my concerns > regarding other potential cases besides ftrace. > > kernel/irq_work.c | 84 ++++++++++++++++++++++++++---------------------------- > 1 file changed, 41 insertions(+), 43 deletions(-) > > --- a/kernel/irq_work.c > +++ b/kernel/irq_work.c > @@ -80,17 +80,12 @@ bool irq_work_queue_on(struct irq_work * > if (!irq_work_claim(work)) > return false; > > -#ifdef CONFIG_PREEMPT_RT_FULL > - if (work->flags & IRQ_WORK_HARD_IRQ) > + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL) && (work->flags & IRQ_WORK_HARD_IRQ)) > raise_irqwork = llist_add(&work->llnode, > &per_cpu(hirq_work_list, cpu)); > else > raise_irqwork = llist_add(&work->llnode, > - &per_cpu(lazy_list, cpu)); > -#else > - raise_irqwork = llist_add(&work->llnode, > &per_cpu(raised_list, cpu)); > -#endif > > if (raise_irqwork) > arch_send_call_function_single_ipi(cpu); > @@ -103,6 +98,9 @@ EXPORT_SYMBOL_GPL(irq_work_queue_on); > /* Enqueue the irq work @work on the current CPU */ > bool irq_work_queue(struct irq_work *work) > { > + bool realtime = IS_ENABLED(CONFIG_PREEMPT_RT_FULL); > + bool raise = false; > + > /* Only queue if not already pending */ > if (!irq_work_claim(work)) > return false; > @@ -110,25 +108,22 @@ bool irq_work_queue(struct irq_work *wor > /* Queue the entry and raise the IPI if needed. 
*/ > preempt_disable(); > > -#ifdef CONFIG_PREEMPT_RT_FULL > - if (work->flags & IRQ_WORK_HARD_IRQ) { > + if (realtime && (work->flags & IRQ_WORK_HARD_IRQ)) { > if (llist_add(&work->llnode, this_cpu_ptr(&hirq_work_list))) > - arch_irq_work_raise(); > - } else { > - if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && > - tick_nohz_tick_stopped()) > - raise_softirq(TIMER_SOFTIRQ); > - } > -#else > - if (work->flags & IRQ_WORK_LAZY) { > + raise = 1; > + } else if (work->flags & IRQ_WORK_LAZY) { > if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && > - tick_nohz_tick_stopped()) > - arch_irq_work_raise(); > - } else { > - if (llist_add(&work->llnode, this_cpu_ptr(&raised_list))) > - arch_irq_work_raise(); > - } > -#endif > + tick_nohz_tick_stopped()) { > + if (realtime) > + raise_softirq(TIMER_SOFTIRQ); > + else > + raise = true; > + } > + } else if (llist_add(&work->llnode, this_cpu_ptr(&raised_list))) > + raise = true; > + > + if (raise) > + arch_irq_work_raise(); > > preempt_enable(); > > @@ -143,12 +138,13 @@ bool irq_work_needs_cpu(void) > raised = this_cpu_ptr(&raised_list); > lazy = this_cpu_ptr(&lazy_list); > > - if (llist_empty(raised)) > - if (llist_empty(lazy)) > -#ifdef CONFIG_PREEMPT_RT_FULL > + if (llist_empty(raised) && llist_empty(lazy)) { > + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL)) { > if (llist_empty(this_cpu_ptr(&hirq_work_list))) > -#endif > return false; > + } else > + return false; > + } > > /* All work should have been flushed before going offline */ > WARN_ON_ONCE(cpu_is_offline(smp_processor_id())); > @@ -162,9 +158,7 @@ static void irq_work_run_list(struct lli > struct irq_work *work; > struct llist_node *llnode; > > -#ifndef CONFIG_PREEMPT_RT_FULL > - BUG_ON(!irqs_disabled()); > -#endif > + BUG_ON(!IS_ENABLED(CONFIG_PREEMPT_RT_FULL) && !irqs_disabled()); > > if (llist_empty(list)) > return; > @@ -200,26 +194,30 @@ static void irq_work_run_list(struct lli > */ > void irq_work_run(void) > { > -#ifdef CONFIG_PREEMPT_RT_FULL > - irq_work_run_list(this_cpu_ptr(&hirq_work_list)); > -#else > - irq_work_run_list(this_cpu_ptr(&raised_list)); > - irq_work_run_list(this_cpu_ptr(&lazy_list)); > -#endif > + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL)) { > + irq_work_run_list(this_cpu_ptr(&hirq_work_list)); > + /* > + * NOTE: we raise softirq via IPI for safety, > + * and execute in irq_work_tick() to move the > + * overhead from hard to soft irq context. > + */ > + if (!llist_empty(this_cpu_ptr(&raised_list))) > + raise_softirq(TIMER_SOFTIRQ); > + } else { > + irq_work_run_list(this_cpu_ptr(&raised_list)); > + irq_work_run_list(this_cpu_ptr(&lazy_list)); > + } > } > EXPORT_SYMBOL_GPL(irq_work_run); > > void irq_work_tick(void) > { > -#ifdef CONFIG_PREEMPT_RT_FULL > - irq_work_run_list(this_cpu_ptr(&lazy_list)); > -#else > - struct llist_head *raised = &__get_cpu_var(raised_list); > + struct llist_head *raised = this_cpu_ptr(&raised_list); > > - if (!llist_empty(raised) && !arch_irq_work_has_interrupt()) > + if (!llist_empty(raised) && (!arch_irq_work_has_interrupt() || > + IS_ENABLED(CONFIG_PREEMPT_RT_FULL))) OK, that additional condition is addressing archs that don't have irq_work support and fall back to the timer, right? > irq_work_run_list(raised); > - irq_work_run_list(&__get_cpu_var(lazy_list)); > -#endif > + irq_work_run_list(this_cpu_ptr(&lazy_list)); > } > > /* > The patch is whitespace-damaged - could you resent? I'm trying to visualize the full diff for me. 
Thanks,
Jan
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-23 6:29 ` Jan Kiszka @ 2015-04-23 6:58 ` Mike Galbraith 2015-04-23 7:14 ` Jan Kiszka 0 siblings, 1 reply; 26+ messages in thread From: Mike Galbraith @ 2015-04-23 6:58 UTC (permalink / raw) To: Jan Kiszka Cc: Steven Rostedt, Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List [-- Attachment #1: Type: text/plain, Size: 1092 bytes --] On Thu, 2015-04-23 at 08:29 +0200, Jan Kiszka wrote: > > > void irq_work_tick(void) > > { > > -#ifdef CONFIG_PREEMPT_RT_FULL > > - irq_work_run_list(this_cpu_ptr(&lazy_list)); > > -#else > > - struct llist_head *raised = &__get_cpu_var(raised_list); > > + struct llist_head *raised = this_cpu_ptr(&raised_list); > > > > - if (!llist_empty(raised) && !arch_irq_work_has_interrupt()) > > + if (!llist_empty(raised) && > > (!arch_irq_work_has_interrupt() || > > + IS_ENABLED(CONFIG_PREEMPT_RT_FULL))) > > OK, that additional condition is addressing archs that don't have > irq_work support and fall back to the timer, right? How will ever run if it is not run in either irq_work_run() or irq_work_tick()? There are two choices, we better pick one. Attaching patch since either evolution fscked up again (it does that), or someone has managed to turn it into a completely useless piece of crap... if so, likely the same dipstick who made it save messages such that you need fromdos to wipe away the shite it smears all over it. -Mike [-- Attachment #2: irq_work-Provide-a-soft-irq-based-queue.patch --] [-- Type: text/x-patch, Size: 4954 bytes --] Subject: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue Date: Thu, 16 Apr 2015 18:28:16 +0200 From: Jan Kiszka <jan.kiszka@siemens.com> Instead of turning all irq_work requests into lazy ones on -rt, just move their execution from hard into soft-irq context. This resolves deadlocks of ftrace which will queue work from arbitrary contexts, including those that have locks held that are needed for raising a soft-irq. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> --- Second try, looks much better so far. And it also removes my concerns regarding other potential cases besides ftrace. kernel/irq_work.c | 84 ++++++++++++++++++++++++++---------------------------- 1 file changed, 41 insertions(+), 43 deletions(-) --- a/kernel/irq_work.c +++ b/kernel/irq_work.c @@ -80,17 +80,12 @@ bool irq_work_queue_on(struct irq_work * if (!irq_work_claim(work)) return false; -#ifdef CONFIG_PREEMPT_RT_FULL - if (work->flags & IRQ_WORK_HARD_IRQ) + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL) && (work->flags & IRQ_WORK_HARD_IRQ)) raise_irqwork = llist_add(&work->llnode, &per_cpu(hirq_work_list, cpu)); else raise_irqwork = llist_add(&work->llnode, - &per_cpu(lazy_list, cpu)); -#else - raise_irqwork = llist_add(&work->llnode, &per_cpu(raised_list, cpu)); -#endif if (raise_irqwork) arch_send_call_function_single_ipi(cpu); @@ -103,6 +98,9 @@ EXPORT_SYMBOL_GPL(irq_work_queue_on); /* Enqueue the irq work @work on the current CPU */ bool irq_work_queue(struct irq_work *work) { + bool realtime = IS_ENABLED(CONFIG_PREEMPT_RT_FULL); + bool raise = false; + /* Only queue if not already pending */ if (!irq_work_claim(work)) return false; @@ -110,25 +108,22 @@ bool irq_work_queue(struct irq_work *wor /* Queue the entry and raise the IPI if needed. 
*/ preempt_disable(); -#ifdef CONFIG_PREEMPT_RT_FULL - if (work->flags & IRQ_WORK_HARD_IRQ) { + if (realtime && (work->flags & IRQ_WORK_HARD_IRQ)) { if (llist_add(&work->llnode, this_cpu_ptr(&hirq_work_list))) - arch_irq_work_raise(); - } else { - if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && - tick_nohz_tick_stopped()) - raise_softirq(TIMER_SOFTIRQ); - } -#else - if (work->flags & IRQ_WORK_LAZY) { + raise = 1; + } else if (work->flags & IRQ_WORK_LAZY) { if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && - tick_nohz_tick_stopped()) - arch_irq_work_raise(); - } else { - if (llist_add(&work->llnode, this_cpu_ptr(&raised_list))) - arch_irq_work_raise(); - } -#endif + tick_nohz_tick_stopped()) { + if (realtime) + raise_softirq(TIMER_SOFTIRQ); + else + raise = true; + } + } else if (llist_add(&work->llnode, this_cpu_ptr(&raised_list))) + raise = true; + + if (raise) + arch_irq_work_raise(); preempt_enable(); @@ -143,12 +138,13 @@ bool irq_work_needs_cpu(void) raised = this_cpu_ptr(&raised_list); lazy = this_cpu_ptr(&lazy_list); - if (llist_empty(raised)) - if (llist_empty(lazy)) -#ifdef CONFIG_PREEMPT_RT_FULL + if (llist_empty(raised) && llist_empty(lazy)) { + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL)) { if (llist_empty(this_cpu_ptr(&hirq_work_list))) -#endif return false; + } else + return false; + } /* All work should have been flushed before going offline */ WARN_ON_ONCE(cpu_is_offline(smp_processor_id())); @@ -162,9 +158,7 @@ static void irq_work_run_list(struct lli struct irq_work *work; struct llist_node *llnode; -#ifndef CONFIG_PREEMPT_RT_FULL - BUG_ON(!irqs_disabled()); -#endif + BUG_ON(!IS_ENABLED(CONFIG_PREEMPT_RT_FULL) && !irqs_disabled()); if (llist_empty(list)) return; @@ -200,26 +194,30 @@ static void irq_work_run_list(struct lli */ void irq_work_run(void) { -#ifdef CONFIG_PREEMPT_RT_FULL - irq_work_run_list(this_cpu_ptr(&hirq_work_list)); -#else - irq_work_run_list(this_cpu_ptr(&raised_list)); - irq_work_run_list(this_cpu_ptr(&lazy_list)); -#endif + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL)) { + irq_work_run_list(this_cpu_ptr(&hirq_work_list)); + /* + * NOTE: we raise softirq via IPI for safety, + * and execute in irq_work_tick() to move the + * overhead from hard to soft irq context. + */ + if (!llist_empty(this_cpu_ptr(&raised_list))) + raise_softirq(TIMER_SOFTIRQ); + } else { + irq_work_run_list(this_cpu_ptr(&raised_list)); + irq_work_run_list(this_cpu_ptr(&lazy_list)); + } } EXPORT_SYMBOL_GPL(irq_work_run); void irq_work_tick(void) { -#ifdef CONFIG_PREEMPT_RT_FULL - irq_work_run_list(this_cpu_ptr(&lazy_list)); -#else - struct llist_head *raised = &__get_cpu_var(raised_list); + struct llist_head *raised = this_cpu_ptr(&raised_list); - if (!llist_empty(raised) && !arch_irq_work_has_interrupt()) + if (!llist_empty(raised) && (!arch_irq_work_has_interrupt() || + IS_ENABLED(CONFIG_PREEMPT_RT_FULL))) irq_work_run_list(raised); - irq_work_run_list(&__get_cpu_var(lazy_list)); -#endif + irq_work_run_list(this_cpu_ptr(&lazy_list)); } /* ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-23 6:58 ` Mike Galbraith @ 2015-04-23 7:14 ` Jan Kiszka 0 siblings, 0 replies; 26+ messages in thread From: Jan Kiszka @ 2015-04-23 7:14 UTC (permalink / raw) To: Mike Galbraith Cc: Steven Rostedt, Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On 2015-04-23 08:58, Mike Galbraith wrote: > On Thu, 2015-04-23 at 08:29 +0200, Jan Kiszka wrote: >> >>> void irq_work_tick(void) >>> { >>> -#ifdef CONFIG_PREEMPT_RT_FULL >>> - irq_work_run_list(this_cpu_ptr(&lazy_list)); >>> -#else >>> - struct llist_head *raised = &__get_cpu_var(raised_list); >>> + struct llist_head *raised = this_cpu_ptr(&raised_list); >>> >>> - if (!llist_empty(raised) && !arch_irq_work_has_interrupt()) >>> + if (!llist_empty(raised) && >>> (!arch_irq_work_has_interrupt() || >>> + IS_ENABLED(CONFIG_PREEMPT_RT_FULL))) >> >> OK, that additional condition is addressing archs that don't have >> irq_work support and fall back to the timer, right? > > How will ever run if it is not run in either irq_work_run() or > irq_work_tick()? There are two choices, we better pick one. Ah, now I see it. Indeed. OK, will run through your fix and suggestions and come up with a new version. Thanks, Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-23 6:11 ` Mike Galbraith 2015-04-23 6:29 ` Jan Kiszka @ 2015-04-23 6:50 ` Jan Kiszka 2015-04-23 7:01 ` Mike Galbraith 1 sibling, 1 reply; 26+ messages in thread From: Jan Kiszka @ 2015-04-23 6:50 UTC (permalink / raw) To: Mike Galbraith Cc: Steven Rostedt, Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On 2015-04-23 08:11, Mike Galbraith wrote: > @@ -103,6 +98,9 @@ EXPORT_SYMBOL_GPL(irq_work_queue_on); > /* Enqueue the irq work @work on the current CPU */ > bool irq_work_queue(struct irq_work *work) > { > + bool realtime = IS_ENABLED(CONFIG_PREEMPT_RT_FULL); > + bool raise = false; > + > /* Only queue if not already pending */ > if (!irq_work_claim(work)) > return false; > @@ -110,25 +108,22 @@ bool irq_work_queue(struct irq_work *wor > /* Queue the entry and raise the IPI if needed. */ > preempt_disable(); > > -#ifdef CONFIG_PREEMPT_RT_FULL > - if (work->flags & IRQ_WORK_HARD_IRQ) { > + if (realtime && (work->flags & IRQ_WORK_HARD_IRQ)) { > if (llist_add(&work->llnode, this_cpu_ptr(&hirq_work_list))) This boils down to #ifdef CONFIG_X some_type x; #endif ... if (IS_ENABLED(CONFIG_X) && ...) use(x); And here we even have an indirection for IS_ENABLED via that local bool variable. Is that pattern OK for Linux? Does it compile in all supported optimization levels of all supported compilers? Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-23 6:50 ` Jan Kiszka @ 2015-04-23 7:01 ` Mike Galbraith 2015-04-23 7:12 ` Jan Kiszka 0 siblings, 1 reply; 26+ messages in thread From: Mike Galbraith @ 2015-04-23 7:01 UTC (permalink / raw) To: Jan Kiszka Cc: Steven Rostedt, Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On Thu, 2015-04-23 at 08:50 +0200, Jan Kiszka wrote: > On 2015-04-23 08:11, Mike Galbraith wrote: > > @@ -103,6 +98,9 @@ EXPORT_SYMBOL_GPL(irq_work_queue_on); > > /* Enqueue the irq work @work on the current CPU */ > > bool irq_work_queue(struct irq_work *work) > > { > > + bool realtime = IS_ENABLED(CONFIG_PREEMPT_RT_FULL); > > + bool raise = false; > > + > > /* Only queue if not already pending */ > > if (!irq_work_claim(work)) > > return false; > > @@ -110,25 +108,22 @@ bool irq_work_queue(struct irq_work *wor > > /* Queue the entry and raise the IPI if needed. */ > > preempt_disable(); > > > > -#ifdef CONFIG_PREEMPT_RT_FULL > > - if (work->flags & IRQ_WORK_HARD_IRQ) { > > + if (realtime && (work->flags & IRQ_WORK_HARD_IRQ)) { > > if (llist_add(&work->llnode, > > this_cpu_ptr(&hirq_work_list))) > > This boils down to > > #ifdef CONFIG_X > some_type x; > #endif > ... > > if (IS_ENABLED(CONFIG_X) && ...) > use(x); > > And here we even have an indirection for IS_ENABLED via that local > bool > variable. Is that pattern OK for Linux? Does it compile in all > supported > optimization levels of all supported compilers? I hope it all goes away, that being what IS_ENABLED() is there for. -Mike ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-23 7:01 ` Mike Galbraith @ 2015-04-23 7:12 ` Jan Kiszka 2015-04-23 7:19 ` Mike Galbraith 0 siblings, 1 reply; 26+ messages in thread From: Jan Kiszka @ 2015-04-23 7:12 UTC (permalink / raw) To: Mike Galbraith Cc: Steven Rostedt, Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On 2015-04-23 09:01, Mike Galbraith wrote: > On Thu, 2015-04-23 at 08:50 +0200, Jan Kiszka wrote: >> On 2015-04-23 08:11, Mike Galbraith wrote: >>> @@ -103,6 +98,9 @@ EXPORT_SYMBOL_GPL(irq_work_queue_on); >>> /* Enqueue the irq work @work on the current CPU */ >>> bool irq_work_queue(struct irq_work *work) >>> { >>> + bool realtime = IS_ENABLED(CONFIG_PREEMPT_RT_FULL); >>> + bool raise = false; >>> + >>> /* Only queue if not already pending */ >>> if (!irq_work_claim(work)) >>> return false; >>> @@ -110,25 +108,22 @@ bool irq_work_queue(struct irq_work *wor >>> /* Queue the entry and raise the IPI if needed. */ >>> preempt_disable(); >>> >>> -#ifdef CONFIG_PREEMPT_RT_FULL >>> - if (work->flags & IRQ_WORK_HARD_IRQ) { >>> + if (realtime && (work->flags & IRQ_WORK_HARD_IRQ)) { >>> if (llist_add(&work->llnode, >>> this_cpu_ptr(&hirq_work_list))) >> >> This boils down to >> >> #ifdef CONFIG_X >> some_type x; >> #endif >> ... >> >> if (IS_ENABLED(CONFIG_X) && ...) >> use(x); >> >> And here we even have an indirection for IS_ENABLED via that local >> bool >> variable. Is that pattern OK for Linux? Does it compile in all >> supported >> optimization levels of all supported compilers? > > I hope it all goes away, that being what IS_ENABLED() is there for. Hope is good - but not enough here: it breaks the build under !CONFIG_X, even the case without the bool var. CC kernel/irq_work.o In file included from ../include/asm-generic/percpu.h:6:0, from ../arch/x86/include/asm/percpu.h:522, from ../arch/x86/include/asm/current.h:5, from ../arch/x86/include/asm/processor.h:15, from ../arch/x86/include/asm/irq_work.h:4, from ../include/linux/irq_work.h:47, from ../kernel/irq_work.c:11: ../kernel/irq_work.c: In function ‘irq_work_queue_on’: ../kernel/irq_work.c:85:17: error: ‘hirq_work_list’ undeclared (first use in this function) &per_cpu(hirq_work_list, cpu)); Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-23 7:12 ` Jan Kiszka @ 2015-04-23 7:19 ` Mike Galbraith 2015-04-23 21:00 ` Steven Rostedt 0 siblings, 1 reply; 26+ messages in thread From: Mike Galbraith @ 2015-04-23 7:19 UTC (permalink / raw) To: Jan Kiszka Cc: Steven Rostedt, Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On Thu, 2015-04-23 at 09:12 +0200, Jan Kiszka wrote: > On 2015-04-23 09:01, Mike Galbraith wrote: > > On Thu, 2015-04-23 at 08:50 +0200, Jan Kiszka wrote: > > > On 2015-04-23 08:11, Mike Galbraith wrote: > > > > @@ -103,6 +98,9 @@ EXPORT_SYMBOL_GPL(irq_work_queue_on); > > > > /* Enqueue the irq work @work on the current CPU */ > > > > bool irq_work_queue(struct irq_work *work) > > > > { > > > > + bool realtime = IS_ENABLED(CONFIG_PREEMPT_RT_FULL); > > > > + bool raise = false; > > > > + > > > > /* Only queue if not already pending */ > > > > if (!irq_work_claim(work)) > > > > return false; > > > > @@ -110,25 +108,22 @@ bool irq_work_queue(struct irq_work *wor > > > > /* Queue the entry and raise the IPI if needed. */ > > > > preempt_disable(); > > > > > > > > -#ifdef CONFIG_PREEMPT_RT_FULL > > > > - if (work->flags & IRQ_WORK_HARD_IRQ) { > > > > + if (realtime && (work->flags & IRQ_WORK_HARD_IRQ)) { > > > > if (llist_add(&work->llnode, > > > > this_cpu_ptr(&hirq_work_list))) > > > > > > This boils down to > > > > > > #ifdef CONFIG_X > > > some_type x; > > > #endif > > > ... > > > > > > if (IS_ENABLED(CONFIG_X) && ...) > > > use(x); > > > > > > And here we even have an indirection for IS_ENABLED via that > > > local > > > bool > > > variable. Is that pattern OK for Linux? Does it compile in all > > > supported > > > optimization levels of all supported compilers? > > > > I hope it all goes away, that being what IS_ENABLED() is there for. > > Hope is good - but not enough here: it breaks the build under > !CONFIG_X, even the case without the bool var. > > CC kernel/irq_work.o > In file included from ../include/asm-generic/percpu.h:6:0, > from ../arch/x86/include/asm/percpu.h:522, > from ../arch/x86/include/asm/current.h:5, > from ../arch/x86/include/asm/processor.h:15, > from ../arch/x86/include/asm/irq_work.h:4, > from ../include/linux/irq_work.h:47, > from ../kernel/irq_work.c:11: > ../kernel/irq_work.c: In function ‘irq_work_queue_on’: > ../kernel/irq_work.c:85:17: error: ‘hirq_work_list’ undeclared > (first use in this function) > &per_cpu(hirq_work_list, cpu)); Aw poo, so that's just what I _thought_ it was for. -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-23 7:19 ` Mike Galbraith @ 2015-04-23 21:00 ` Steven Rostedt 2015-04-24 6:54 ` Mike Galbraith 0 siblings, 1 reply; 26+ messages in thread From: Steven Rostedt @ 2015-04-23 21:00 UTC (permalink / raw) To: Mike Galbraith Cc: Jan Kiszka, Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On Thu, 23 Apr 2015 09:19:26 +0200 Mike Galbraith <umgwanakikbuti@gmail.com> wrote: > > CC kernel/irq_work.o > > In file included from ../include/asm-generic/percpu.h:6:0, > > from ../arch/x86/include/asm/percpu.h:522, > > from ../arch/x86/include/asm/current.h:5, > > from ../arch/x86/include/asm/processor.h:15, > > from ../arch/x86/include/asm/irq_work.h:4, > > from ../include/linux/irq_work.h:47, > > from ../kernel/irq_work.c:11: > > ../kernel/irq_work.c: In function ‘irq_work_queue_on’: > > ../kernel/irq_work.c:85:17: error: ‘hirq_work_list’ undeclared > > (first use in this function) > > &per_cpu(hirq_work_list, cpu)); > > Aw poo, so that's just what I _thought_ it was for. It helps optimization but does nothing for undefined symbols. That said, why don't we clean up that irq_work code and at least declare both lists, and get rid of all the #ifdefs. I wonder if gcc is smart enough to not allocate a static variable if it happens to be optimized out? -- Steve ^ permalink raw reply [flat|nested] 26+ messages in thread
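A quick way to check what the compiler actually did with such a variable (a sketch; object path and symbol name are illustrative) is to look for it in the object file after the build:

        $ make kernel/irq_work.o
        $ nm kernel/irq_work.o | grep hirq_work_list

A hit means storage for the per-cpu list was still emitted even though every reference to it was optimized away; no output means the compiler dropped it.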
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-23 21:00 ` Steven Rostedt @ 2015-04-24 6:54 ` Mike Galbraith 2015-04-24 9:00 ` Jan Kiszka 0 siblings, 1 reply; 26+ messages in thread From: Mike Galbraith @ 2015-04-24 6:54 UTC (permalink / raw) To: Steven Rostedt Cc: Jan Kiszka, Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On Thu, 2015-04-23 at 17:00 -0400, Steven Rostedt wrote: > On Thu, 23 Apr 2015 09:19:26 +0200 > Mike Galbraith <umgwanakikbuti@gmail.com> wrote: > > > > CC kernel/irq_work.o > > > In file included from ../include/asm-generic/percpu.h:6:0, > > > from ../arch/x86/include/asm/percpu.h:522, > > > from ../arch/x86/include/asm/current.h:5, > > > from ../arch/x86/include/asm/processor.h:15, > > > from ../arch/x86/include/asm/irq_work.h:4, > > > from ../include/linux/irq_work.h:47, > > > from ../kernel/irq_work.c:11: > > > ../kernel/irq_work.c: In function ‘irq_work_queue_on’: > > > ../kernel/irq_work.c:85:17: error: ‘hirq_work_list’ undeclared > > > (first use in this function) > > > &per_cpu(hirq_work_list, cpu)); > > > > Aw poo, so that's just what I _thought_ it was for. > > It helps optimization but does nothing for undefined symbols. > > That said, why don't we clean up that irq_work code and at least > declare both lists, and get rid of all the #ifdefs. I wonder if gcc is > smart enough to not allocate a static variable if it happens to be > optimized out? Nope, it didn't notice a thing. This is a stab at that cleanup. Usable as is with Jan's ok, or as fodder for your bitmaster-9000 patch shredder, or whatever. Box works and it makes line count shrink... I downgraded evolution v3.16->v3.12 to restore its ability to read it's own fscking "Preformatted" switch, so whitespace should be fine. Oh, btw, if anyone (else) makes a 4.1-rt, your rt push work will want one of those nifty hirq tags lest box make boom due to trying to do that not only way late, but with irqs enabled which pisses sched all off. Subject: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue Date: Thu, 16 Apr 2015 18:28:16 +0200 From: Jan Kiszka <jan.kiszka@siemens.com> Instead of turning all irq_work requests into lazy ones on -rt, just move their execution from hard into soft-irq context. This resolves deadlocks of ftrace which will queue work from arbitrary contexts, including those that have locks held that are needed for raising a soft-irq. Mike: cleanup ifdef mess and kill hirq_work_list. We need two lists, and already have them, merely need to select according to work type. In -rt all work not tagged for hirq execution is queued to the lazy list and runs via irq_work_tick(). Raising SOFTIRQ_TIMER is always done via IPI for deadlock safety, if the work item is not a lazy work or the tick is stopped, fire IPI immediately, otherwise let it wait. IOW, lazy work is lazy in -rt only until someone queues immediate work. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com> --- Second try, looks much better so far. And it also removes my concerns regarding other potential cases besides ftrace. 
kernel/irq_work.c | 85 ++++++++++++++++++++---------------------------------- 1 file changed, 33 insertions(+), 52 deletions(-) --- a/kernel/irq_work.c +++ b/kernel/irq_work.c @@ -23,9 +23,7 @@ static DEFINE_PER_CPU(struct llist_head, raised_list); static DEFINE_PER_CPU(struct llist_head, lazy_list); -#ifdef CONFIG_PREEMPT_RT_FULL -static DEFINE_PER_CPU(struct llist_head, hirq_work_list); -#endif + /* * Claim the entry so that no one else will poke at it. */ @@ -68,7 +66,7 @@ void __weak arch_irq_work_raise(void) */ bool irq_work_queue_on(struct irq_work *work, int cpu) { - bool raise_irqwork; + struct llist_head *list; /* All work should have been flushed before going offline */ WARN_ON_ONCE(cpu_is_offline(cpu)); @@ -80,19 +78,12 @@ bool irq_work_queue_on(struct irq_work * if (!irq_work_claim(work)) return false; -#ifdef CONFIG_PREEMPT_RT_FULL - if (work->flags & IRQ_WORK_HARD_IRQ) - raise_irqwork = llist_add(&work->llnode, - &per_cpu(hirq_work_list, cpu)); + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL) && !(work->flags & IRQ_WORK_HARD_IRQ)) + list = &per_cpu(lazy_list, cpu); else - raise_irqwork = llist_add(&work->llnode, - &per_cpu(lazy_list, cpu)); -#else - raise_irqwork = llist_add(&work->llnode, - &per_cpu(raised_list, cpu)); -#endif + list = &per_cpu(raised_list, cpu); - if (raise_irqwork) + if (llist_add(&work->llnode, list)) arch_send_call_function_single_ipi(cpu); return true; @@ -103,6 +94,9 @@ EXPORT_SYMBOL_GPL(irq_work_queue_on); /* Enqueue the irq work @work on the current CPU */ bool irq_work_queue(struct irq_work *work) { + struct llist_head *list; + bool lazy_work, realtime = IS_ENABLED(CONFIG_PREEMPT_RT_FULL); + /* Only queue if not already pending */ if (!irq_work_claim(work)) return false; @@ -110,25 +104,17 @@ bool irq_work_queue(struct irq_work *wor /* Queue the entry and raise the IPI if needed. 
*/ preempt_disable(); -#ifdef CONFIG_PREEMPT_RT_FULL - if (work->flags & IRQ_WORK_HARD_IRQ) { - if (llist_add(&work->llnode, this_cpu_ptr(&hirq_work_list))) - arch_irq_work_raise(); - } else { - if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && - tick_nohz_tick_stopped()) - raise_softirq(TIMER_SOFTIRQ); - } -#else - if (work->flags & IRQ_WORK_LAZY) { - if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && - tick_nohz_tick_stopped()) - arch_irq_work_raise(); - } else { - if (llist_add(&work->llnode, this_cpu_ptr(&raised_list))) + lazy_work = work->flags & IRQ_WORK_LAZY; + + if (lazy_work || (realtime && !(work->flags & IRQ_WORK_HARD_IRQ))) + list = this_cpu_ptr(&lazy_list); + else + list = this_cpu_ptr(&raised_list); + + if (llist_add(&work->llnode, list)) { + if (!lazy_work || tick_nohz_tick_stopped()) arch_irq_work_raise(); } -#endif preempt_enable(); @@ -143,12 +129,8 @@ bool irq_work_needs_cpu(void) raised = this_cpu_ptr(&raised_list); lazy = this_cpu_ptr(&lazy_list); - if (llist_empty(raised)) - if (llist_empty(lazy)) -#ifdef CONFIG_PREEMPT_RT_FULL - if (llist_empty(this_cpu_ptr(&hirq_work_list))) -#endif - return false; + if (llist_empty(raised) && llist_empty(lazy)) + return false; /* All work should have been flushed before going offline */ WARN_ON_ONCE(cpu_is_offline(smp_processor_id())); @@ -162,9 +144,7 @@ static void irq_work_run_list(struct lli struct irq_work *work; struct llist_node *llnode; -#ifndef CONFIG_PREEMPT_RT_FULL - BUG_ON(!irqs_disabled()); -#endif + BUG_ON(!IS_ENABLED(CONFIG_PREEMPT_RT_FULL) && !irqs_disabled()); if (llist_empty(list)) return; @@ -200,26 +180,27 @@ static void irq_work_run_list(struct lli */ void irq_work_run(void) { -#ifdef CONFIG_PREEMPT_RT_FULL - irq_work_run_list(this_cpu_ptr(&hirq_work_list)); -#else irq_work_run_list(this_cpu_ptr(&raised_list)); - irq_work_run_list(this_cpu_ptr(&lazy_list)); -#endif + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL)) { + /* + * NOTE: we raise softirq via IPI for safety, + * and execute in irq_work_tick() to move the + * overhead from hard to soft irq context. + */ + if (!llist_empty(this_cpu_ptr(&lazy_list))) + raise_softirq(TIMER_SOFTIRQ); + } else + irq_work_run_list(this_cpu_ptr(&lazy_list)); } EXPORT_SYMBOL_GPL(irq_work_run); void irq_work_tick(void) { -#ifdef CONFIG_PREEMPT_RT_FULL - irq_work_run_list(this_cpu_ptr(&lazy_list)); -#else - struct llist_head *raised = &__get_cpu_var(raised_list); + struct llist_head *raised = this_cpu_ptr(&raised_list); if (!llist_empty(raised) && !arch_irq_work_has_interrupt()) irq_work_run_list(raised); - irq_work_run_list(&__get_cpu_var(lazy_list)); -#endif + irq_work_run_list(this_cpu_ptr(&lazy_list)); } /* ^ permalink raw reply [flat|nested] 26+ messages in thread
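For anyone skimming the diff, the resulting queueing policy boils down to this standalone sketch (plain C, illustrative only; the real flag values live in <linux/irq_work.h>):

#define IRQ_WORK_LAZY           (1 << 0)        /* placeholder bit values */
#define IRQ_WORK_HARD_IRQ       (1 << 1)        /* -rt only in the real tree */

enum target { RAISED_LIST, LAZY_LIST };

/* which per-cpu list irq_work_queue() picks after the patch */
static enum target pick_list(unsigned int flags, int rt_full)
{
        if ((flags & IRQ_WORK_LAZY) ||
            (rt_full && !(flags & IRQ_WORK_HARD_IRQ)))
                return LAZY_LIST;
        return RAISED_LIST;
}

/* whether an IPI is raised once the item was added to an empty list */
static int kick_now(unsigned int flags, int tick_stopped)
{
        return !(flags & IRQ_WORK_LAZY) || tick_stopped;
}

So on -rt everything not tagged IRQ_WORK_HARD_IRQ lands on lazy_list, and non-lazy items still trigger an immediate IPI so irq_work_run() can raise TIMER_SOFTIRQ and hand the list over to soft-irq context, as described in the changelog above.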
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-24 6:54 ` Mike Galbraith @ 2015-04-24 9:00 ` Jan Kiszka 2015-04-24 9:59 ` Mike Galbraith 2015-04-25 7:20 ` Mike Galbraith 0 siblings, 2 replies; 26+ messages in thread From: Jan Kiszka @ 2015-04-24 9:00 UTC (permalink / raw) To: Mike Galbraith, Steven Rostedt Cc: Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On 2015-04-24 08:54, Mike Galbraith wrote: > On Thu, 2015-04-23 at 17:00 -0400, Steven Rostedt wrote: >> On Thu, 23 Apr 2015 09:19:26 +0200 >> Mike Galbraith <umgwanakikbuti@gmail.com> wrote: >> >>>> CC kernel/irq_work.o >>>> In file included from ../include/asm-generic/percpu.h:6:0, >>>> from ../arch/x86/include/asm/percpu.h:522, >>>> from ../arch/x86/include/asm/current.h:5, >>>> from ../arch/x86/include/asm/processor.h:15, >>>> from ../arch/x86/include/asm/irq_work.h:4, >>>> from ../include/linux/irq_work.h:47, >>>> from ../kernel/irq_work.c:11: >>>> ../kernel/irq_work.c: In function ‘irq_work_queue_on’: >>>> ../kernel/irq_work.c:85:17: error: ‘hirq_work_list’ undeclared >>>> (first use in this function) >>>> &per_cpu(hirq_work_list, cpu)); >>> >>> Aw poo, so that's just what I _thought_ it was for. >> >> It helps optimization but does nothing for undefined symbols. >> >> That said, why don't we clean up that irq_work code and at least >> declare both lists, and get rid of all the #ifdefs. I wonder if gcc is >> smart enough to not allocate a static variable if it happens to be >> optimized out? > > Nope, it didn't notice a thing. > > This is a stab at that cleanup. Usable as is with Jan's ok, or as > fodder for your bitmaster-9000 patch shredder, or whatever. Box works > and it makes line count shrink... > > I downgraded evolution v3.16->v3.12 to restore its ability to read it's > own fscking "Preformatted" switch, so whitespace should be fine. > > Oh, btw, if anyone (else) makes a 4.1-rt, your rt push work will want > one of those nifty hirq tags lest box make boom due to trying to do that > not only way late, but with irqs enabled which pisses sched all off. > > Subject: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue > Date: Thu, 16 Apr 2015 18:28:16 +0200 > From: Jan Kiszka <jan.kiszka@siemens.com> > > Instead of turning all irq_work requests into lazy ones on -rt, just > move their execution from hard into soft-irq context. > > This resolves deadlocks of ftrace which will queue work from arbitrary > contexts, including those that have locks held that are needed for > raising a soft-irq. > > Mike: cleanup ifdef mess and kill hirq_work_list. We need two lists, > and already have them, merely need to select according to work type. > In -rt all work not tagged for hirq execution is queued to the lazy > list and runs via irq_work_tick(). Raising SOFTIRQ_TIMER is always > done via IPI for deadlock safety, if the work item is not a lazy work > or the tick is stopped, fire IPI immediately, otherwise let it wait. > IOW, lazy work is lazy in -rt only until someone queues immediate work. The approach looks good to me, but the commit log deserves a rework now. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-24 9:00 ` Jan Kiszka @ 2015-04-24 9:59 ` Mike Galbraith 2015-04-25 7:20 ` Mike Galbraith 1 sibling, 0 replies; 26+ messages in thread From: Mike Galbraith @ 2015-04-24 9:59 UTC (permalink / raw) To: Jan Kiszka Cc: Steven Rostedt, Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On Fri, 2015-04-24 at 11:00 +0200, Jan Kiszka wrote: > The approach looks good to me, but the commit log deserves a rework now. Yeah. While you're at it, you should change my chop to an ack or a tested-by too, as it's your patch, I just rearranged a bit. -Mike ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-24 9:00 ` Jan Kiszka 2015-04-24 9:59 ` Mike Galbraith @ 2015-04-25 7:20 ` Mike Galbraith 2015-04-25 7:26 ` Jan Kiszka 1 sibling, 1 reply; 26+ messages in thread From: Mike Galbraith @ 2015-04-25 7:20 UTC (permalink / raw) To: Jan Kiszka Cc: Steven Rostedt, Sebastian Andrzej Siewior, RT, Linux Kernel Mailing List On Fri, 2015-04-24 at 11:00 +0200, Jan Kiszka wrote: > The approach looks good to me, but the commit log deserves a rework now. Ok, we agree on the approach, and that the changelog wants a bit of attention, so either you're gonna rewrite it to suit you, do a pretty changelog, and I ack, or I take the blame for the posted form, scribble something that I hope is a better log, and you ack. Either will work. Here's my changelog+blame-taking, if you're ok with it, ack, and we can call it a day, otherwise onward to plan B. irq_work: Delegate non-immediate irq work to ksoftirqd Based on a patch from Jan Kiszka. Jan reported that ftrace queueing work from arbitrary contexts can and does lead to deadlock. trace-cmd -e sched:* deadlocked in fact. Resolve the problem by delegating all non-immediate work to ksoftirqd. We need two lists to do this, one for hard irq, one for soft, so we can use the two existing lists, eliminating the -rt specific list and all of the ifdefery while we're at it. Strategy: Queue work tagged for hirq invocation to the raised_list, invoke via IPI as usual. If a work item being queued to lazy_list, which becomes our all others list, is not a lazy work item, or the tick is stopped, fire an IPI to raise SOFTIRQ_TIMER immediately, otherwise let ksofirqd find it when the tick comes along. Raising SOFTIRQ_TIMER via IPI even when queueing local ensures delegation. Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com> --- kernel/irq_work.c | 85 ++++++++++++++++++++---------------------------------- 1 file changed, 33 insertions(+), 52 deletions(-) --- a/kernel/irq_work.c +++ b/kernel/irq_work.c @@ -23,9 +23,7 @@ static DEFINE_PER_CPU(struct llist_head, raised_list); static DEFINE_PER_CPU(struct llist_head, lazy_list); -#ifdef CONFIG_PREEMPT_RT_FULL -static DEFINE_PER_CPU(struct llist_head, hirq_work_list); -#endif + /* * Claim the entry so that no one else will poke at it. 
*/ @@ -68,7 +66,7 @@ void __weak arch_irq_work_raise(void) */ bool irq_work_queue_on(struct irq_work *work, int cpu) { - bool raise_irqwork; + struct llist_head *list; /* All work should have been flushed before going offline */ WARN_ON_ONCE(cpu_is_offline(cpu)); @@ -80,19 +78,12 @@ bool irq_work_queue_on(struct irq_work * if (!irq_work_claim(work)) return false; -#ifdef CONFIG_PREEMPT_RT_FULL - if (work->flags & IRQ_WORK_HARD_IRQ) - raise_irqwork = llist_add(&work->llnode, - &per_cpu(hirq_work_list, cpu)); + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL) && !(work->flags & IRQ_WORK_HARD_IRQ)) + list = &per_cpu(lazy_list, cpu); else - raise_irqwork = llist_add(&work->llnode, - &per_cpu(lazy_list, cpu)); -#else - raise_irqwork = llist_add(&work->llnode, - &per_cpu(raised_list, cpu)); -#endif + list = &per_cpu(raised_list, cpu); - if (raise_irqwork) + if (llist_add(&work->llnode, list)) arch_send_call_function_single_ipi(cpu); return true; @@ -103,6 +94,9 @@ EXPORT_SYMBOL_GPL(irq_work_queue_on); /* Enqueue the irq work @work on the current CPU */ bool irq_work_queue(struct irq_work *work) { + struct llist_head *list; + bool lazy_work, realtime = IS_ENABLED(CONFIG_PREEMPT_RT_FULL); + /* Only queue if not already pending */ if (!irq_work_claim(work)) return false; @@ -110,25 +104,17 @@ bool irq_work_queue(struct irq_work *wor /* Queue the entry and raise the IPI if needed. */ preempt_disable(); -#ifdef CONFIG_PREEMPT_RT_FULL - if (work->flags & IRQ_WORK_HARD_IRQ) { - if (llist_add(&work->llnode, this_cpu_ptr(&hirq_work_list))) - arch_irq_work_raise(); - } else { - if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && - tick_nohz_tick_stopped()) - raise_softirq(TIMER_SOFTIRQ); - } -#else - if (work->flags & IRQ_WORK_LAZY) { - if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && - tick_nohz_tick_stopped()) - arch_irq_work_raise(); - } else { - if (llist_add(&work->llnode, this_cpu_ptr(&raised_list))) + lazy_work = work->flags & IRQ_WORK_LAZY; + + if (lazy_work || (realtime && !(work->flags & IRQ_WORK_HARD_IRQ))) + list = this_cpu_ptr(&lazy_list); + else + list = this_cpu_ptr(&raised_list); + + if (llist_add(&work->llnode, list)) { + if (!lazy_work || tick_nohz_tick_stopped()) arch_irq_work_raise(); } -#endif preempt_enable(); @@ -143,12 +129,8 @@ bool irq_work_needs_cpu(void) raised = this_cpu_ptr(&raised_list); lazy = this_cpu_ptr(&lazy_list); - if (llist_empty(raised)) - if (llist_empty(lazy)) -#ifdef CONFIG_PREEMPT_RT_FULL - if (llist_empty(this_cpu_ptr(&hirq_work_list))) -#endif - return false; + if (llist_empty(raised) && llist_empty(lazy)) + return false; /* All work should have been flushed before going offline */ WARN_ON_ONCE(cpu_is_offline(smp_processor_id())); @@ -162,9 +144,7 @@ static void irq_work_run_list(struct lli struct irq_work *work; struct llist_node *llnode; -#ifndef CONFIG_PREEMPT_RT_FULL - BUG_ON(!irqs_disabled()); -#endif + BUG_ON(!IS_ENABLED(CONFIG_PREEMPT_RT_FULL) && !irqs_disabled()); if (llist_empty(list)) return; @@ -200,26 +180,27 @@ static void irq_work_run_list(struct lli */ void irq_work_run(void) { -#ifdef CONFIG_PREEMPT_RT_FULL - irq_work_run_list(this_cpu_ptr(&hirq_work_list)); -#else irq_work_run_list(this_cpu_ptr(&raised_list)); - irq_work_run_list(this_cpu_ptr(&lazy_list)); -#endif + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL)) { + /* + * NOTE: we raise softirq via IPI for safety, + * and execute in irq_work_tick() to move the + * overhead from hard to soft irq context. 
+ */ + if (!llist_empty(this_cpu_ptr(&lazy_list))) + raise_softirq(TIMER_SOFTIRQ); + } else + irq_work_run_list(this_cpu_ptr(&lazy_list)); } EXPORT_SYMBOL_GPL(irq_work_run); void irq_work_tick(void) { -#ifdef CONFIG_PREEMPT_RT_FULL - irq_work_run_list(this_cpu_ptr(&lazy_list)); -#else - struct llist_head *raised = &__get_cpu_var(raised_list); + struct llist_head *raised = this_cpu_ptr(&raised_list); if (!llist_empty(raised) && !arch_irq_work_has_interrupt()) irq_work_run_list(raised); - irq_work_run_list(&__get_cpu_var(lazy_list)); -#endif + irq_work_run_list(this_cpu_ptr(&lazy_list)); } /* ^ permalink raw reply [flat|nested] 26+ messages in thread
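Since the changelog cites trace-cmd as the original reproducer, a sketch of how one could re-test on a patched -rt kernel (commands are illustrative, run as root):

        # record all scheduler tracepoints around a short workload, then
        # confirm the box survives and the trace reads back cleanly
        trace-cmd record -e sched sleep 10
        trace-cmd report | head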
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-25 7:20 ` Mike Galbraith @ 2015-04-25 7:26 ` Jan Kiszka 2015-05-18 19:52 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 26+ messages in thread From: Jan Kiszka @ 2015-04-25 7:26 UTC (permalink / raw) To: Mike Galbraith, Sebastian Andrzej Siewior Cc: Steven Rostedt, RT, Linux Kernel Mailing List [-- Attachment #1: Type: text/plain, Size: 6965 bytes --] On 2015-04-25 09:20, Mike Galbraith wrote: > On Fri, 2015-04-24 at 11:00 +0200, Jan Kiszka wrote: > >> The approach looks good to me, but the commit log deserves a rework now. > > Ok, we agree on the approach, and that the changelog wants a bit of > attention, so either you're gonna rewrite it to suit you, do a pretty > changelog, and I ack, or I take the blame for the posted form, scribble > something that I hope is a better log, and you ack. Either will work. > > Here's my changelog+blame-taking, if you're ok with it, ack, and we can > call it a day, otherwise onward to plan B. > > > > irq_work: Delegate non-immediate irq work to ksoftirqd > > Based on a patch from Jan Kiszka. > > Jan reported that ftrace queueing work from arbitrary contexts can > and does lead to deadlock. trace-cmd -e sched:* deadlocked in fact. > > Resolve the problem by delegating all non-immediate work to ksoftirqd. > > We need two lists to do this, one for hard irq, one for soft, so we > can use the two existing lists, eliminating the -rt specific list and > all of the ifdefery while we're at it. > > Strategy: Queue work tagged for hirq invocation to the raised_list, > invoke via IPI as usual. If a work item being queued to lazy_list, > which becomes our all others list, is not a lazy work item, or the > tick is stopped, fire an IPI to raise SOFTIRQ_TIMER immediately, > otherwise let ksofirqd find it when the tick comes along. Raising > SOFTIRQ_TIMER via IPI even when queueing local ensures delegation. > > Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com> > > --- > kernel/irq_work.c | 85 ++++++++++++++++++++---------------------------------- > 1 file changed, 33 insertions(+), 52 deletions(-) > > --- a/kernel/irq_work.c > +++ b/kernel/irq_work.c > @@ -23,9 +23,7 @@ > > static DEFINE_PER_CPU(struct llist_head, raised_list); > static DEFINE_PER_CPU(struct llist_head, lazy_list); > -#ifdef CONFIG_PREEMPT_RT_FULL > -static DEFINE_PER_CPU(struct llist_head, hirq_work_list); > -#endif > + > /* > * Claim the entry so that no one else will poke at it. 
> */ > @@ -68,7 +66,7 @@ void __weak arch_irq_work_raise(void) > */ > bool irq_work_queue_on(struct irq_work *work, int cpu) > { > - bool raise_irqwork; > + struct llist_head *list; > > /* All work should have been flushed before going offline */ > WARN_ON_ONCE(cpu_is_offline(cpu)); > @@ -80,19 +78,12 @@ bool irq_work_queue_on(struct irq_work * > if (!irq_work_claim(work)) > return false; > > -#ifdef CONFIG_PREEMPT_RT_FULL > - if (work->flags & IRQ_WORK_HARD_IRQ) > - raise_irqwork = llist_add(&work->llnode, > - &per_cpu(hirq_work_list, cpu)); > + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL) && !(work->flags & IRQ_WORK_HARD_IRQ)) > + list = &per_cpu(lazy_list, cpu); > else > - raise_irqwork = llist_add(&work->llnode, > - &per_cpu(lazy_list, cpu)); > -#else > - raise_irqwork = llist_add(&work->llnode, > - &per_cpu(raised_list, cpu)); > -#endif > + list = &per_cpu(raised_list, cpu); > > - if (raise_irqwork) > + if (llist_add(&work->llnode, list)) > arch_send_call_function_single_ipi(cpu); > > return true; > @@ -103,6 +94,9 @@ EXPORT_SYMBOL_GPL(irq_work_queue_on); > /* Enqueue the irq work @work on the current CPU */ > bool irq_work_queue(struct irq_work *work) > { > + struct llist_head *list; > + bool lazy_work, realtime = IS_ENABLED(CONFIG_PREEMPT_RT_FULL); > + > /* Only queue if not already pending */ > if (!irq_work_claim(work)) > return false; > @@ -110,25 +104,17 @@ bool irq_work_queue(struct irq_work *wor > /* Queue the entry and raise the IPI if needed. */ > preempt_disable(); > > -#ifdef CONFIG_PREEMPT_RT_FULL > - if (work->flags & IRQ_WORK_HARD_IRQ) { > - if (llist_add(&work->llnode, this_cpu_ptr(&hirq_work_list))) > - arch_irq_work_raise(); > - } else { > - if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && > - tick_nohz_tick_stopped()) > - raise_softirq(TIMER_SOFTIRQ); > - } > -#else > - if (work->flags & IRQ_WORK_LAZY) { > - if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && > - tick_nohz_tick_stopped()) > - arch_irq_work_raise(); > - } else { > - if (llist_add(&work->llnode, this_cpu_ptr(&raised_list))) > + lazy_work = work->flags & IRQ_WORK_LAZY; > + > + if (lazy_work || (realtime && !(work->flags & IRQ_WORK_HARD_IRQ))) > + list = this_cpu_ptr(&lazy_list); > + else > + list = this_cpu_ptr(&raised_list); > + > + if (llist_add(&work->llnode, list)) { > + if (!lazy_work || tick_nohz_tick_stopped()) > arch_irq_work_raise(); > } > -#endif > > preempt_enable(); > > @@ -143,12 +129,8 @@ bool irq_work_needs_cpu(void) > raised = this_cpu_ptr(&raised_list); > lazy = this_cpu_ptr(&lazy_list); > > - if (llist_empty(raised)) > - if (llist_empty(lazy)) > -#ifdef CONFIG_PREEMPT_RT_FULL > - if (llist_empty(this_cpu_ptr(&hirq_work_list))) > -#endif > - return false; > + if (llist_empty(raised) && llist_empty(lazy)) > + return false; > > /* All work should have been flushed before going offline */ > WARN_ON_ONCE(cpu_is_offline(smp_processor_id())); > @@ -162,9 +144,7 @@ static void irq_work_run_list(struct lli > struct irq_work *work; > struct llist_node *llnode; > > -#ifndef CONFIG_PREEMPT_RT_FULL > - BUG_ON(!irqs_disabled()); > -#endif > + BUG_ON(!IS_ENABLED(CONFIG_PREEMPT_RT_FULL) && !irqs_disabled()); > > if (llist_empty(list)) > return; > @@ -200,26 +180,27 @@ static void irq_work_run_list(struct lli > */ > void irq_work_run(void) > { > -#ifdef CONFIG_PREEMPT_RT_FULL > - irq_work_run_list(this_cpu_ptr(&hirq_work_list)); > -#else > irq_work_run_list(this_cpu_ptr(&raised_list)); > - irq_work_run_list(this_cpu_ptr(&lazy_list)); > -#endif > + if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL)) { > + 
/* > + * NOTE: we raise softirq via IPI for safety, > + * and execute in irq_work_tick() to move the > + * overhead from hard to soft irq context. > + */ > + if (!llist_empty(this_cpu_ptr(&lazy_list))) > + raise_softirq(TIMER_SOFTIRQ); > + } else > + irq_work_run_list(this_cpu_ptr(&lazy_list)); > } > EXPORT_SYMBOL_GPL(irq_work_run); > > void irq_work_tick(void) > { > -#ifdef CONFIG_PREEMPT_RT_FULL > - irq_work_run_list(this_cpu_ptr(&lazy_list)); > -#else > - struct llist_head *raised = &__get_cpu_var(raised_list); > + struct llist_head *raised = this_cpu_ptr(&raised_list); > > if (!llist_empty(raised) && !arch_irq_work_has_interrupt()) > irq_work_run_list(raised); > - irq_work_run_list(&__get_cpu_var(lazy_list)); > -#endif > + irq_work_run_list(this_cpu_ptr(&lazy_list)); > } > > /* > > > Acked-by: Jan Kiszka <jan.kiszka@siemens.com> This way around makes more sense as you changed the patch significantly. Thanks, Jan [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 181 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH RT 3.18] irq_work: Provide a soft-irq based queue 2015-04-25 7:26 ` Jan Kiszka @ 2015-05-18 19:52 ` Sebastian Andrzej Siewior 0 siblings, 0 replies; 26+ messages in thread From: Sebastian Andrzej Siewior @ 2015-05-18 19:52 UTC (permalink / raw) To: Jan Kiszka; +Cc: Mike Galbraith, Steven Rostedt, RT, Linux Kernel Mailing List * Jan Kiszka | 2015-04-25 09:26:13 [+0200]: >Acked-by: Jan Kiszka <jan.kiszka@siemens.com> > >This way around makes more sense as you changed the patch significantly. Now I took the proper one. >Thanks, >Jan Sebastian ^ permalink raw reply [flat|nested] 26+ messages in thread