From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751558AbcCXUrA (ORCPT ); Thu, 24 Mar 2016 16:47:00 -0400 Received: from e19.ny.us.ibm.com ([129.33.205.209]:35212 "EHLO e19.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750938AbcCXUqz (ORCPT ); Thu, 24 Mar 2016 16:46:55 -0400 X-IBM-Helo: d01dlp02.pok.ibm.com X-IBM-MailFrom: paulmck@linux.vnet.ibm.com X-IBM-RcptTo: linux-kernel@vger.kernel.org Date: Thu, 24 Mar 2016 13:46:58 -0700 From: "Paul E. McKenney" To: Bart Van Assche Cc: "linux-kernel@vger.kernel.org" Subject: Re: RCU stall Message-ID: <20160324204658.GB4287@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <56F1A8F2.9000905@sandisk.com> <20160322204510.GS4287@linux.vnet.ibm.com> <56F1DAF6.3030804@sandisk.com> <20160323015932.GX4287@linux.vnet.ibm.com> <20160323022902.GA28227@linux.vnet.ibm.com> <56F44CE2.1060703@sandisk.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56F44CE2.1060703@sandisk.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16032420-0057-0000-0000-000003DAD34C Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Mar 24, 2016 at 01:24:02PM -0700, Bart Van Assche wrote: > On 03/22/2016 07:29 PM, Paul E. McKenney wrote: > >Note that a soft lockup triggered at 10509.568010, well before the RCU > >CPU stall warning.. And you have a second soft lockup at 10537.567212, > >with the same funtion scsi_request_fn() at the top of the stack in both > >stack traces. That function has a nice big "for (;;)" loop that does > >not appear to have any iteration-limiting mechanism. > > Hello Paul, > > Your feedback is really appreciated. I have started with testing the > four attached patches. With the tests I ran so far these patches > were sufficient to avoid any soft lockup or RCU stall complaints. I > will submit these patches to the appropriate maintainers once I have > finished testing these patches. Hello, Bart, Glad it helped! The patches look sane to me, but that should be taken with a large quantity of salt given that I know very little about Linux's storage subsystem. Thanx, Paul > Bart. > >From 0f72b28329342346980ae99c69d19b7adb0123bc Mon Sep 17 00:00:00 2001 > From: Bart Van Assche > Date: Thu, 24 Mar 2016 11:01:14 -0700 > Subject: [PATCH 1/4] IB/cm: Fix a recently introduced locking bug > > ib_cm_notify() can be called from interrupt context. Hence do not > reenable interrupts unconditionally in cm_establish(). > > This patch avoids that lockdep reports the following warning: > > WARNING: CPU: 0 PID: 23317 at kernel/locking/lockdep.c:2624 trace _hardirqs_on_caller+0x112/0x1b0 > DEBUG_LOCKS_WARN_ON(current->hardirq_context) > Call Trace: > [] dump_stack+0x67/0x92 > [] __warn+0xc1/0xe0 > [] warn_slowpath_fmt+0x4a/0x50 > [] trace_hardirqs_on_caller+0x112/0x1b0 > [] trace_hardirqs_on+0xd/0x10 > [] _raw_spin_unlock_irq+0x27/0x40 > [] ib_cm_notify+0x25c/0x290 [ib_cm] > [] srpt_qp_event+0xa1/0xf0 [ib_srpt] > [] mlx4_ib_qp_event+0x67/0xd0 [mlx4_ib] > [] mlx4_qp_event+0x5a/0xc0 [mlx4_core] > [] mlx4_eq_int+0x3d8/0xcf0 [mlx4_core] > [] mlx4_msi_x_interrupt+0xc/0x20 [mlx4_core] > [] handle_irq_event_percpu+0x64/0x100 > [] handle_irq_event+0x34/0x60 > [] handle_edge_irq+0x6a/0x150 > [] handle_irq+0x15/0x20 > [] do_IRQ+0x5c/0x110 > [] common_interrupt+0x89/0x89 > [] blk_run_queue_async+0x37/0x40 > [] rq_completed+0x43/0x70 [dm_mod] > [] dm_softirq_done+0x176/0x280 [dm_mod] > [] blk_done_softirq+0x52/0x90 > [] __do_softirq+0x10f/0x230 > [] irq_exit+0xa8/0xb0 > [] smp_trace_call_function_single_interrupt+0x2e/0x30 > [] smp_call_function_single_interrupt+0x9/0x10 > [] call_function_single_interrupt+0x89/0x90 > > > Fixes: commit be4b499323bf ("IB/cm: Do not queue work to a device that's going away") > Signed-off-by: Bart Van Assche > Cc: Erez Shitrit > Cc: Doug Ledford > Cc: stable # v4.2+ > --- > drivers/infiniband/core/cm.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c > index 1d92e09..c995255 100644 > --- a/drivers/infiniband/core/cm.c > +++ b/drivers/infiniband/core/cm.c > @@ -3452,14 +3452,14 @@ static int cm_establish(struct ib_cm_id *cm_id) > work->cm_event.event = IB_CM_USER_ESTABLISHED; > > /* Check if the device started its remove_one */ > - spin_lock_irq(&cm.lock); > + spin_lock_irqsave(&cm.lock, flags); > if (!cm_dev->going_down) { > queue_delayed_work(cm.wq, &work->work, 0); > } else { > kfree(work); > ret = -ENODEV; > } > - spin_unlock_irq(&cm.lock); > + spin_unlock_irqrestore(&cm.lock, flags); > > out: > return ret; > -- > 2.7.3 > > >From 5fd6aedadc04d102cd261507ff61071071455fb6 Mon Sep 17 00:00:00 2001 > From: Bart Van Assche > Date: Thu, 24 Mar 2016 12:04:01 -0700 > Subject: [PATCH 2/4] kernel/kthread.c: Avoid CPU lockups > > Avoid that complaints similar to the one below are reported against > a debug kernel: > > NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kdmwork-25 4:2:23313] > irq event stamp: 16320042 > hardirqs last enabled at (16320041): [] _raw_spin_unlock_irq+0x27/0x40 > hardirqs last disabled at (16320042): [] 0xffff8803ffbe3cd8 > softirqs last enabled at (16319960): [] __do_softirq+0x1cb/0x230 > softirqs last disabled at (16319715): [] irq_exit+0xa8/0xb0 > CPU: 1 PID: 23313 Comm: kdmwork-254:2 > RIP: 0010:[] [] _raw_spin_unlock_irq+0x2f/0x40 > Call Trace: > [] scsi_request_fn+0x11f/0x630 > [] __blk_run_queue+0x2e/0x40 > [] __elv_add_request+0x75/0x1f0 > [] blk_insert_cloned_request+0x101/0x190 > [] map_request+0x16a/0x1b0 [dm_mod] > [] map_tio_request+0x1d/0x40 [dm_mod] > [] kthread_worker_fn+0x82/0x1a0 > [] kthread+0xea/0x100 > [] ret_from_fork+0x22/0x40 > > Signed-off-by: Bart Van Assche > --- > kernel/kthread.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/kernel/kthread.c b/kernel/kthread.c > index 9ff173d..516ca6b 100644 > --- a/kernel/kthread.c > +++ b/kernel/kthread.c > @@ -593,6 +593,7 @@ repeat: > if (work) { > __set_current_state(TASK_RUNNING); > work->func(work); > + cond_resched_rcu_qs(); > } else if (!freezing(current)) > schedule(); > > -- > 2.7.3 > > >From 44985e4b2f3124bf87e84a4c7572efa00ac28d3b Mon Sep 17 00:00:00 2001 > From: Bart Van Assche > Date: Wed, 23 Mar 2016 17:14:57 -0700 > Subject: [PATCH 3/4] block: Limit work processed in softirq context > > Avoid that complaints like the one below are reported against a > debug kernel: > > NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [disk11_0:2708] > irq event stamp: 17120809 > hardirqs last enabled at (17120808): [] _raw_spin_unlock_irqrestore+0x31/0x50 > hardirqs last disabled at (17120809): [] 0xffff88046f223bd0 > softirqs last enabled at (17120794): [] scst_check_blocked_dev+0x77/0x1c0 [scst] > softirqs last disabled at (17120795): [] do_softirq_own_stack+0x1c/0x30 > RIP: 0010:[] [] _raw_spin_unlock_irqrestore+0x33/0x50 > Call Trace: > > [] free_debug_processing+0x270/0x3a0 > [] __slab_free+0x17a/0x2c0 > [] kmem_cache_free+0x1b4/0x1d0 > [] mempool_free_slab+0x12/0x20 > [] mempool_free+0x26/0x80 > [] bio_free+0x49/0x60 > [] bio_put+0x1e/0x30 > [] end_clone_bio+0x21/0x70 [dm_mod] > [] bio_endio+0x52/0x60 > [] blk_update_request+0x7c/0x2a0 > [] scsi_end_request+0x2e/0x1d0 > [] scsi_io_completion+0xb4/0x610 > [] scsi_finish_command+0xca/0x120 > [] scsi_softirq_done+0x120/0x140 > [] blk_done_softirq+0x76/0x90 > [] __do_softirq+0x10f/0x230 > [] do_softirq_own_stack+0x1c/0x30 > > --- > block/blk-softirq.c | 24 +++++++++++++++--------- > 1 file changed, 15 insertions(+), 9 deletions(-) > > diff --git a/block/blk-softirq.c b/block/blk-softirq.c > index 53b1737..d739949 100644 > --- a/block/blk-softirq.c > +++ b/block/blk-softirq.c > @@ -20,20 +20,26 @@ static DEFINE_PER_CPU(struct list_head, blk_cpu_done); > */ > static void blk_done_softirq(struct softirq_action *h) > { > - struct list_head *cpu_list, local_list; > + struct list_head *cpu_list = this_cpu_ptr(&blk_cpu_done); > + struct request *rq; > + int i; > > local_irq_disable(); > - cpu_list = this_cpu_ptr(&blk_cpu_done); > - list_replace_init(cpu_list, &local_list); > - local_irq_enable(); > - > - while (!list_empty(&local_list)) { > - struct request *rq; > - > - rq = list_entry(local_list.next, struct request, ipi_list); > + for (i = 64; i > 0; i--) { > + if (list_empty(cpu_list)) > + goto done; > + rq = list_first_entry(cpu_list, struct request, ipi_list); > list_del_init(&rq->ipi_list); > + local_irq_enable(); > + > rq->q->softirq_done_fn(rq); > + > + local_irq_disable(); > } > + raise_softirq_irqoff(BLOCK_SOFTIRQ); > + > +done: > + local_irq_enable(); > } > > #ifdef CONFIG_SMP > -- > 2.7.3 > > >From a73fdf710b98922fa02d464af96b499ea2740832 Mon Sep 17 00:00:00 2001 > From: Bart Van Assche > Date: Wed, 23 Mar 2016 14:38:13 -0700 > Subject: [PATCH 4/4] Avoid that I/O completion processing triggers lockup > complaints > > Avoid that I/O completion processing triggers the following complaints > if kernel debug options that slow down the kernel significantly are > enabled: > > NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kdmwork-254:2:358] > irq event stamp: 15233868 > hardirqs last enabled at (15233867): [] _raw_spin_unlock_irq+0x27/0x40 > hardirqs last disabled at (15233868): [] apic_timer_interrupt+0x84/0x90 > softirqs last enabled at (15233850): [] __do_softirq+0x1cb/0x230 > softirqs last disabled at (15233743): [] irq_exit+0xa8/0xb0 > CPU: 3 PID: 358 Comm: kdmwork-254:2 > RIP: 0010:[] [] _raw_spin_unlock_irq+0x2f/0x40 > Call Trace: > [] scsi_request_fn+0x118/0x600 > [] __blk_run_queue+0x2e/0x40 > [] __elv_add_request+0x75/0x1f0 > [] blk_insert_cloned_request+0x101/0x190 > [] map_request+0x18e/0x210 [dm_mod] > [] map_tio_request+0x1d/0x40 [dm_mod] > [] kthread_worker_fn+0x7d/0x1a0 > [] kthread+0xea/0x100 > [] ret_from_fork+0x3f/0x70 > > INFO: rcu_sched self-detected stall on CPU > 3-...: (6497 ticks this GP) idle=fb9/140000000000002/0 softirq=2044956/2045037 fqs=5414 > (t=6500 jiffies g=219289 c=219288 q=7233211) > Task dump for CPU 3: > kdmwork-254:2 R running task 0 358 2 0x00000008 > Call Trace: > [] sched_show_task+0xbf/0x150 > [] dump_cpu_task+0x32/0x40 > [] rcu_dump_cpu_stacks+0x89/0xe0 > [] rcu_check_callbacks+0x439/0x730 > [] update_process_times+0x34/0x60 > [] tick_sched_handle.isra.18+0x20/0x50 > [] tick_sched_timer+0x38/0x70 > [] __hrtimer_run_queues+0xa5/0x1c0 > [] hrtimer_interrupt+0xa6/0x1b0 > [] smp_trace_apic_timer_interrupt+0x63/0x90 > [] smp_apic_timer_interrupt+0x9/0x10 > [] apic_timer_interrupt+0x89/0x90 > [] __slab_free+0xc6/0x270 > [] kmem_cache_free+0x159/0x160 > [] kiocb_free+0x32/0x40 > [] aio_complete+0x1e5/0x3c0 > [] dio_complete+0x75/0x1d0 > [] dio_bio_end_aio+0x7a/0x130 > [] bio_endio+0x3a/0x60 > [] blk_update_request+0x7c/0x2a0 > [] end_clone_bio+0x41/0x70 [dm_mod] > [] bio_endio+0x3a/0x60 > [] blk_update_request+0x7c/0x2a0 > [] scsi_end_request+0x2e/0x1d0 > [] scsi_io_completion+0xb4/0x610 > [] scsi_finish_command+0xca/0x120 > [] scsi_softirq_done+0x120/0x140 > [] blk_done_softirq+0x72/0x90 > [] __do_softirq+0x10f/0x230 > [] irq_exit+0xa8/0xb0 > [] do_IRQ+0x65/0x110 > [] common_interrupt+0x89/0x89 > > [] __multipath_map.isra.16+0x145/0x260 [dm_multipath] > [] multipath_map+0x12/0x20 [dm_multipath] > [] map_request+0x43/0x210 [dm_mod] > [] map_tio_request+0x1d/0x40 [dm_mod] > [] kthread_worker_fn+0x7d/0x1a0 > [] kthread+0xea/0x100 > [] ret_from_fork+0x3f/0x70 > > Signed-off-by: Bart Van Assche > Cc: Paul E. McKenney > --- > drivers/scsi/scsi_lib.c | 10 +++++++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index 8106515..8f264a0 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -1770,13 +1770,14 @@ static void scsi_request_fn(struct request_queue *q) > struct Scsi_Host *shost; > struct scsi_cmnd *cmd; > struct request *req; > + int i; > > /* > - * To start with, we keep looping until the queue is empty, or until > - * the host is no longer able to accept any more requests. > + * Loop until the queue is empty, until the host is no longer able to > + * accept any more requests or until 64 requests have been processed. > */ > shost = sdev->host; > - for (;;) { > + for (i = 64; i > 0; i--) { > int rtn; > /* > * get next queueable request. We do this early to make sure > @@ -1861,6 +1862,9 @@ static void scsi_request_fn(struct request_queue *q) > spin_lock_irq(q->queue_lock); > } > > + if (unlikely(i == 0)) > + blk_delay_queue(q, 0); > + > return; > > host_not_ready: > -- > 2.7.3 >