From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jens Axboe Subject: Re: calling scsi_adjust_queue_depth() during I/O... Date: Fri, 5 Aug 2005 09:57:52 +0200 Message-ID: <20050805075751.GN9369@suse.de> References: <20050804234155.GJ13374@plap.qlogic.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Received: from ns.virtualhost.dk ([195.184.98.160]:13513 "EHLO virtualhost.dk") by vger.kernel.org with ESMTP id S262903AbVHEH4A (ORCPT ); Fri, 5 Aug 2005 03:56:00 -0400 Content-Disposition: inline In-Reply-To: <20050804234155.GJ13374@plap.qlogic.org> Sender: linux-scsi-owner@vger.kernel.org List-Id: linux-scsi@vger.kernel.org To: Andrew Vasquez Cc: Linux-SCSI Mailing List , Tejun Heo On Thu, Aug 04 2005, Andrew Vasquez wrote: > All, > > While adding support for the new change_queue_depth/type() callbacks, > > static int > qla2x00_change_queue_depth(struct scsi_device *sdev, int qdepth) > { > scsi_adjust_queue_depth(sdev, scsi_get_tag_type(sdev), qdepth); > return sdev->queue_depth; > } > > and updating the queue-depth: > > # echo 16 > /sys/class/scsi_device/3:0:0:0/device/queue_depth > > while I/O is running, I'm hitting a reproducible WARN_ON() triggering > within as_completed_request(): > > static void as_completed_request(request_queue_t *q, struct request *rq) > { > struct as_data *ad = q->elevator->elevator_data; > struct as_rq *arq = RQ_DATA(rq); > > WARN_ON(!list_empty(&rq->queuelist)); Tejun, can you take a look at this please? > ... > > and a subsequent panic: > > Badness in as_completed_request at drivers/block/as-iosched.c:951 > > Call Trace: ffff8024883a>{as_completed_request+63} {elv_completed_request+44} > {__blk_put_request+73} {scsi_end_request+164} > {scsi_io_completion+584} {sd_rw_intr+709} > {scsi_finish_command+182} {scsi_softirq+255} > {__do_softirq+110} {call_softirq+31} > {do_softirq+54} {do_IRQ+74} > {ret_from_intr+0} {mwait_idle+86} > {acpi_processor_idle+310} {cpu_idle+79} > {start_secondary+1017} > ----------- [cut here ] --------- [please bite here ] --------- > Kernel BUG at "drivers/block/ll_rw_blk.c":2361 > invalid operand: 0000 [1] SMP > CPU 2 > Modules linked in: qla2xxx > Pid: 0, comm: swapper Not tainted 2.6.13-rc5 > RIP: 0010:[] {__blk_put_request+83} > RSP: 0018:ffff8100021bbde8 EFLAGS: 00010087 > RAX: 0000000000000000 RBX: ffff81002dc738b0 RCX: 0000000000008000 > RDX: 0000000000004e6b RSI: 0000000000000004 RDI: ffff81003e091778 > RBP: ffff81003f8fa600 R08: 0000000000000000 R09: 0000000000000003 > R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000000 > R13: 0000000000000001 R14: ffff81003f8fa600 R15: ffff81003f8fa600 > FS: 0000000000000000(0000) GS:ffffffff804b6900(0000) knlGS:0000000000000000 > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b > CR2: 00002aaaaaac1000 CR3: 0000000037f05000 CR4: 00000000000006e0 > Process swapper (pid: 0, threadinfo ffff8100021b6000, task ffff8100021b54f0) > Stack: ffff81002dc738b0 ffff81002c1cd7c0 0000000000000286 ffffffff80280781 > 0000000000000001 ffff81002c1cd7c0 ffff81002dc738b0 0000000000000000 > 0000000000080000 ffffffff802809eb > Call Trace: {scsi_end_request+164} {scsi_io_completion+584} > {sd_rw_intr+709} {scsi_finish_command+182} > {scsi_softirq+255} {__do_softirq+110} > {call_softirq+31} {do_softirq+54} > {do_IRQ+74} {ret_from_intr+0} > {mwait_idle+86} {acpi_processor_idle+310} > {cpu_idle+79} {start_secondary+1017} > > Code: 0f 0b a3 0b f2 32 80 ff ff ff ff c2 39 09 48 89 de 48 89 ef > RIP {__blk_put_request+83} RSP > <3>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43 > in_atomic():1, irqs_disabled():1 > > Call Trace: {__might_sleep+199} {profile_task_exit+34} > {do_exit+34} {vgacon_cursor+231} > {kernel_math_error+0} {do_trap+264} > {do_invalid_op+145} {__blk_put_request+83} > {printk+141} {error_exit+0} > {__blk_put_request+83} {__blk_put_request+73} > {scsi_end_request+164} {scsi_io_completion+584} > {sd_rw_intr+709} {scsi_finish_command+182} > {scsi_softirq+255} {__do_softirq+110} > {call_softirq+31} {do_softirq+54} > {do_IRQ+74} {ret_from_intr+0} > {mwait_idle+86} {acpi_processor_idle+310} > {cpu_idle+79} {start_secondary+1017} > > Kernel panic - not syncing: Aiee, killing interrupt handler! > > Adding scsi_target_quiesce() and scsi_target_resume() barriers around > the scsi_adjust_target_queue_depth() call appears to help (i.e. > dropping from 32 -> 24): > > # echo 24 > /sys/class/scsi_device/3\:0\:0\:0/device/queue_depth > > and dropping down again to 16: > > # echo 16 > /sys/class/scsi_device/3\:0\:0\:0/device/queue_depth > > but occasionally, while trying another depth drop: > > # echo 10 > /sys/class/scsi_device/3\:0\:0\:0/device/queue_depth > > I'll either get a panic (haven't captured a good one yet (only a > couple of line within the trace): > > eip: ffffffff80248a62 > ----------- [cut here ] --------- [please bite here ] --------- > Kernel BUG at "include/asm/spinlock.h":121 > > or I get the following slab-error: > > slab error in cache_free_debugcheck(): cache `size-128': double free, or memory outside object was overwritten > > Call Trace:{cache_free_debugcheck+290} {kfree+136} > {blk_queue_resize_tags+119} {scsi_adjust_queue_depth+68} > {:qla2xxx:qla2x00_change_queue_depth+71} > {sdev_store_queue_depth_rw+82} {dev_attr_store+31} > {sysfs_write_file+200} {vfs_write+172} > {sys_write+69} {system_call+126} > > ffff8100389baba8: redzone 1: 0x170fc2a5, redzone 2: 0x0. > > I'm using a fairly recent snapshot of Linus' GIT tree (sync done > earlier today). > > Two questions: > > - must the target be quiesced before adjusting the queue-depth? > > - any ideas on where why successive lowering of the depth borks the > machine? > > Thanks, > Andrew Vasquez > - > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Jens Axboe