* calling scsi_adjust_queue_depth() during I/O...
@ 2005-08-04 23:41 Andrew Vasquez
2005-08-05 7:57 ` Jens Axboe
0 siblings, 1 reply; 12+ messages in thread
From: Andrew Vasquez @ 2005-08-04 23:41 UTC
To: Linux-SCSI Mailing List
All,
While adding support for the new change_queue_depth/type() callbacks,
static int
qla2x00_change_queue_depth(struct scsi_device *sdev, int qdepth)
{
	scsi_adjust_queue_depth(sdev, scsi_get_tag_type(sdev), qdepth);
	return sdev->queue_depth;
}
and updating the queue-depth:
# echo 16 > /sys/class/scsi_device/3:0:0:0/device/queue_depth
while I/O is running, I'm hitting a reproducible WARN_ON() triggering
within as_completed_request():
static void as_completed_request(request_queue_t *q, struct request *rq)
{
	struct as_data *ad = q->elevator->elevator_data;
	struct as_rq *arq = RQ_DATA(rq);
	WARN_ON(!list_empty(&rq->queuelist));
	...
and a subsequent panic:
Badness in as_completed_request at drivers/block/as-iosched.c:951
Call Trace: <IRQ> ffff8024883a>{as_completed_request+63} <ffffffff8024098d>{elv_completed_request+44}
<ffffffff8024272a>{__blk_put_request+73} <ffffffff80280781>{scsi_end_request+164}
<ffffffff802809eb>{scsi_io_completion+584} <ffffffff80297059>{sd_rw_intr+709}
<ffffffff8027aa08>{scsi_finish_command+182} <ffffffff8027b2dc>{scsi_softirq+255}
<ffffffff801291ea>{__do_softirq+110} <ffffffff8010eb13>{call_softirq+31}
<ffffffff801101be>{do_softirq+54} <ffffffff80110211>{do_IRQ+74}
<ffffffff8010deba>{ret_from_intr+0} <EOI> <ffffffff8010c2fd>{mwait_idle+86}
<ffffffff8021aef0>{acpi_processor_idle+310} <ffffffff8010cacb>{cpu_idle+79}
<ffffffff804cecbf>{start_secondary+1017}
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at "drivers/block/ll_rw_blk.c":2361
invalid operand: 0000 [1] SMP
CPU 2
Modules linked in: qla2xxx
Pid: 0, comm: swapper Not tainted 2.6.13-rc5
RIP: 0010:[<ffffffff80242734>] <ffffffff80242734>{__blk_put_request+83}
RSP: 0018:ffff8100021bbde8 EFLAGS: 00010087
RAX: 0000000000000000 RBX: ffff81002dc738b0 RCX: 0000000000008000
RDX: 0000000000004e6b RSI: 0000000000000004 RDI: ffff81003e091778
RBP: ffff81003f8fa600 R08: 0000000000000000 R09: 0000000000000003
R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000001 R14: ffff81003f8fa600 R15: ffff81003f8fa600
FS: 0000000000000000(0000) GS:ffffffff804b6900(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002aaaaaac1000 CR3: 0000000037f05000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffff8100021b6000, task ffff8100021b54f0)
Stack: ffff81002dc738b0 ffff81002c1cd7c0 0000000000000286 ffffffff80280781
0000000000000001 ffff81002c1cd7c0 ffff81002dc738b0 0000000000000000
0000000000080000 ffffffff802809eb
Call Trace: <IRQ> <ffffffff80280781>{scsi_end_request+164} <ffffffff802809eb>{scsi_io_completion+584}
<ffffffff80297059>{sd_rw_intr+709} <ffffffff8027aa08>{scsi_finish_command+182}
<ffffffff8027b2dc>{scsi_softirq+255} <ffffffff801291ea>{__do_softirq+110}
<ffffffff8010eb13>{call_softirq+31} <ffffffff801101be>{do_softirq+54}
<ffffffff80110211>{do_IRQ+74} <ffffffff8010deba>{ret_from_intr+0}
<EOI> <ffffffff8010c2fd>{mwait_idle+86} <ffffffff8021aef0>{acpi_processor_idle+310}
<ffffffff8010cacb>{cpu_idle+79} <ffffffff804cecbf>{start_secondary+1017}
Code: 0f 0b a3 0b f2 32 80 ff ff ff ff c2 39 09 48 89 de 48 89 ef
RIP <ffffffff80242734>{__blk_put_request+83} RSP <ffff8100021bbde8>
<3>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43
in_atomic():1, irqs_disabled():1
Call Trace: <IRQ> <ffffffff8011e2d7>{__might_sleep+199} <ffffffff80125316>{profile_task_exit+34}
<ffffffff80126fe2>{do_exit+34} <ffffffff801fc7d0>{vgacon_cursor+231}
<ffffffff8010f653>{kernel_math_error+0} <ffffffff8010fa09>{do_trap+264}
<ffffffff8010feb9>{do_invalid_op+145} <ffffffff80242734>{__blk_put_request+83}
<ffffffff801245d7>{printk+141} <ffffffff8010e415>{error_exit+0}
<ffffffff80242734>{__blk_put_request+83} <ffffffff8024272a>{__blk_put_request+73}
<ffffffff80280781>{scsi_end_request+164} <ffffffff802809eb>{scsi_io_completion+584}
<ffffffff80297059>{sd_rw_intr+709} <ffffffff8027aa08>{scsi_finish_command+182}
<ffffffff8027b2dc>{scsi_softirq+255} <ffffffff801291ea>{__do_softirq+110}
<ffffffff8010eb13>{call_softirq+31} <ffffffff801101be>{do_softirq+54}
<ffffffff80110211>{do_IRQ+74} <ffffffff8010deba>{ret_from_intr+0}
<EOI> <ffffffff8010c2fd>{mwait_idle+86} <ffffffff8021aef0>{acpi_processor_idle+310}
<ffffffff8010cacb>{cpu_idle+79} <ffffffff804cecbf>{start_secondary+1017}
Kernel panic - not syncing: Aiee, killing interrupt handler!
Adding scsi_target_quiesce() and scsi_target_resume() barriers around
the scsi_adjust_queue_depth() call appears to help (e.g. when
dropping from 32 -> 24):
# echo 24 > /sys/class/scsi_device/3\:0\:0\:0/device/queue_depth
and dropping down again to 16:
# echo 16 > /sys/class/scsi_device/3\:0\:0\:0/device/queue_depth
but occasionally, while trying another depth drop:
# echo 10 > /sys/class/scsi_device/3\:0\:0\:0/device/queue_depth
I'll either get a panic (I haven't captured a good one yet, only a
couple of lines of the trace):
eip: ffffffff80248a62
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at "include/asm/spinlock.h":121
or I get the following slab-error:
slab error in cache_free_debugcheck(): cache `size-128': double free, or memory outside object was overwritten
Call Trace:<ffffffff8014930c>{cache_free_debugcheck+290} <ffffffff8014975c>{kfree+136}
<ffffffff80244e65>{blk_queue_resize_tags+119} <ffffffff8027a826>{scsi_adjust_queue_depth+68}
<ffffffff88000133>{:qla2xxx:qla2x00_change_queue_depth+71}
<ffffffff80283666>{sdev_store_queue_depth_rw+82} <ffffffff8023a9a2>{dev_attr_store+31}
<ffffffff80191e95>{sysfs_write_file+200} <ffffffff80160dba>{vfs_write+172}
<ffffffff80160ed8>{sys_write+69} <ffffffff8010d8f6>{system_call+126}
ffff8100389baba8: redzone 1: 0x170fc2a5, redzone 2: 0x0.
I'm using a fairly recent snapshot of Linus' GIT tree (sync done
earlier today).
Two questions:
- must the target be quiesced before adjusting the queue-depth?
- any ideas on why successive lowering of the depth borks the
machine?
Thanks,
Andrew Vasquez
* Re: calling scsi_adjust_queue_depth() during I/O...
2005-08-04 23:41 calling scsi_adjust_queue_depth() during I/O Andrew Vasquez
@ 2005-08-05 7:57 ` Jens Axboe
2005-08-05 11:09 ` Tejun Heo
0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2005-08-05 7:57 UTC
To: Andrew Vasquez; +Cc: Linux-SCSI Mailing List, Tejun Heo

On Thu, Aug 04 2005, Andrew Vasquez wrote:
> [...]
> while I/O is running, I'm hitting a reproducible WARN_ON() triggering
> within as_completed_request():
>
> static void as_completed_request(request_queue_t *q, struct request *rq)
> {
> 	struct as_data *ad = q->elevator->elevator_data;
> 	struct as_rq *arq = RQ_DATA(rq);
>
> 	WARN_ON(!list_empty(&rq->queuelist));

Tejun, can you take a look at this please?

--
Jens Axboe
* Re: calling scsi_adjust_queue_depth() during I/O...
2005-08-05 7:57 ` Jens Axboe
@ 2005-08-05 11:09 ` Tejun Heo
2005-08-05 11:43 ` Tejun Heo
0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2005-08-05 11:09 UTC
To: Jens Axboe, Andrew Vasquez; +Cc: Linux-SCSI Mailing List

Hello, Andrew. Hello, Jens.

On Fri, Aug 05, 2005 at 09:57:52AM +0200, Jens Axboe wrote:
> On Thu, Aug 04 2005, Andrew Vasquez wrote:
> > [...]
> > 	WARN_ON(!list_empty(&rq->queuelist));
>
> Tejun, can you take a look at this please?

Sure.

> > [...]
> > Two questions:
> >
> > - must the target be quiesced before adjusting the queue-depth?
> >
> > - any ideas on why successive lowering of the depth borks the
> > machine?

I think it's caused by using tag_index past its end. The slab
corruption supports that. I tried to fix this incorrectly in the
following post.

http://marc.theaimsgroup.com/?l=linux-kernel&m=111399756324813&w=2

Good thing it didn't make it into the tree, as the tag map should never
be shrunk. Thanks for not committing it, Jens. :-)

Andrew, please try the following quick fix (only fixes shrinking) and
let me know how it works. If this is the right fix, I'll generate a
proper patch fixing both shrinking and enlarging (this word sounds
weird these days w/ all those spams...).

diff --git a/drivers/block/ll_rw_blk.c b/drivers/block/ll_rw_blk.c
--- a/drivers/block/ll_rw_blk.c
+++ b/drivers/block/ll_rw_blk.c
@@ -784,16 +784,17 @@ init_tag_map(request_queue_t *q, struct
 		__FUNCTION__, depth);
 	}
 
-	tag_index = kmalloc(depth * sizeof(struct request *), GFP_ATOMIC);
+	bits = (depth / BLK_TAGS_PER_LONG) + 1;
+
+	tag_index = kmalloc(bits * sizeof(struct request *), GFP_ATOMIC);
 	if (!tag_index)
 		goto fail;
 
-	bits = (depth / BLK_TAGS_PER_LONG) + 1;
 	tag_map = kmalloc(bits * sizeof(unsigned long), GFP_ATOMIC);
 	if (!tag_map)
 		goto fail;
 
-	memset(tag_index, 0, depth * sizeof(struct request *));
+	memset(tag_index, 0, bits * sizeof(struct request *));
 	memset(tag_map, 0, bits * sizeof(unsigned long));
 	tags->max_depth = depth;
 	tags->real_max_depth = bits * BITS_PER_LONG;
* Re: calling scsi_adjust_queue_depth() during I/O...
2005-08-05 11:09 ` Tejun Heo
@ 2005-08-05 11:43 ` Tejun Heo
2005-08-05 12:33 ` Tejun Heo
0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2005-08-05 11:43 UTC
To: Tejun Heo; +Cc: Jens Axboe, Andrew Vasquez, Linux-SCSI Mailing List

Tejun Heo wrote:
> [...]
> I think it's caused by using tag_index over its end. The slab
> corruption supports that. I tried to fix this incorrectly in the
> following post.
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=111399756324813&w=2

Oops, forget about the previous mail. The above patch did make it into
the tree, and it's the source of the problem. My git HEAD was pointing
at the latest update but I hadn't updated my cache, so I was looking at
the old source tree. My apologies for the hassle and the bug.

The original code was broken in the following two ways.

* tag_index wasn't allocated fully
* tag_map's extra bits were always initialized w/ 1's.

The first bug is critical, and the second bug prevents proper enlarging
of the tag map. However, the second bug effectively masks the first
bug, avoiding the critical problem. My above-mentioned patch broke
things seriously when reducing the tag depth while requests are in
flight.

Again, my apologies; a patch will soon follow.

--
tejun
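The mismatch between the two allocations described above can be made concrete with a little arithmetic. What follows is only a userspace sketch, not kernel code: the helper names (index_slots, bitmap_tags, unbacked_tags) and the TAGS_PER_LONG macro are made up for illustration, and the rounding formula is the `(depth / BLK_TAGS_PER_LONG) + 1` expression that appears in the quick-fix diff earlier in the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bits in one unsigned long; stands in for the kernel's BITS_PER_LONG. */
#define TAGS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: slots in tag_index when it is sized by depth. */
static size_t index_slots(size_t depth)
{
	return depth;
}

/*
 * Hypothetical helper: number of tags the bitmap can represent when it
 * is rounded up to whole longs as depth / TAGS_PER_LONG + 1.
 */
static size_t bitmap_tags(size_t depth)
{
	return (depth / TAGS_PER_LONG + 1) * TAGS_PER_LONG;
}

/*
 * Tags that exist in the bitmap but have no slot in tag_index.  Unless
 * these spare bits are pre-set as busy, the allocator could hand one
 * out and the lookup array would be indexed past its end.
 */
static size_t unbacked_tags(size_t depth)
{
	assert(bitmap_tags(depth) >= index_slots(depth));
	return bitmap_tags(depth) - index_slots(depth);
}
```

For a depth of 10 the bitmap covers one full long worth of tags while tag_index has only 10 slots, which is why marking the spare bits busy hid the short allocation, and also why it got in the way of enlarging the map later.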
* Re: calling scsi_adjust_queue_depth() during I/O...
2005-08-05 11:43 ` Tejun Heo
@ 2005-08-05 12:33 ` Tejun Heo
2005-08-05 15:55 ` Andrew Vasquez
2005-08-05 16:32 ` James Bottomley
0 siblings, 2 replies; 12+ messages in thread
From: Tejun Heo @ 2005-08-05 12:33 UTC
To: Jens Axboe, Andrew Vasquez; +Cc: Linux-SCSI Mailing List

> [...]
> Again, my apologies and patch will soon follow.

Here's the fix. It basically revives bqt->real_max_depth, minus the
allocation optimization in init_tag_map. I've also added a comment
explicitly noting that the tag map cannot be shrunk, to prevent other
morons like me. :-( Please try this one and let me know how it works.
If this is the correct fix, I'll repost properly to Jens and lkml with
a detailed explanation of how it was broken in the original code and
how I broke it with my previous patch. Sorry.

diff --git a/drivers/block/ll_rw_blk.c b/drivers/block/ll_rw_blk.c
--- a/drivers/block/ll_rw_blk.c
+++ b/drivers/block/ll_rw_blk.c
@@ -719,7 +719,7 @@ struct request *blk_queue_find_tag(reque
 {
 	struct blk_queue_tag *bqt = q->queue_tags;
 
-	if (unlikely(bqt == NULL || tag >= bqt->max_depth))
+	if (unlikely(bqt == NULL || tag >= bqt->real_max_depth))
 		return NULL;
 
 	return bqt->tag_index[tag];
@@ -798,6 +798,7 @@ init_tag_map(request_queue_t *q, struct
 
 	memset(tag_index, 0, depth * sizeof(struct request *));
 	memset(tag_map, 0, nr_ulongs * sizeof(unsigned long));
+	tags->real_max_depth = depth;
 	tags->max_depth = depth;
 	tags->tag_index = tag_index;
 	tags->tag_map = tag_map;
@@ -872,11 +873,22 @@ int blk_queue_resize_tags(request_queue_
 		return -ENXIO;
 
 	/*
+	 * if we already have large enough real_max_depth. just
+	 * adjust max_depth. *NOTE* as requests with tag value
+	 * between new_depth and real_max_depth can be in-flight, tag
+	 * map cannot be shrunk.
+	 */
+	if (new_depth <= bqt->real_max_depth) {
+		bqt->max_depth = new_depth;
+		return 0;
+	}
+
+	/*
 	 * save the old state info, so we can copy it back
 	 */
 	tag_index = bqt->tag_index;
 	tag_map = bqt->tag_map;
-	max_depth = bqt->max_depth;
+	max_depth = bqt->real_max_depth;
 
 	if (init_tag_map(q, bqt, new_depth))
 		return -ENOMEM;
@@ -913,7 +925,7 @@ void blk_queue_end_tag(request_queue_t *
 
 	BUG_ON(tag == -1);
 
-	if (unlikely(tag >= bqt->max_depth))
+	if (unlikely(tag >= bqt->real_max_depth))
 		/*
 		 * This can happen after tag depth has been reduced.
 		 * FIXME: how about a warning or info message here?
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -301,6 +301,7 @@ struct blk_queue_tag {
 	struct list_head busy_list;	/* fifo list of busy tags */
 	int busy;			/* current depth */
 	int max_depth;			/* what we will send to device */
+	int real_max_depth;		/* what the array can hold */
 	atomic_t refcnt;		/* map can be shared */
 };
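The idea behind the patch above (never shrink the array, only lower the advertised depth) can be tried out in userspace. The following is a minimal sketch under stated assumptions, not the kernel implementation: struct tag_table and the tag_table_* functions are hypothetical names, there is no bitmap or locking, and only the max_depth / real_max_depth split is taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model of a tag table; not the kernel struct. */
struct tag_table {
	void **tag_index;	/* one slot per tag, NULL when the tag is free */
	int max_depth;		/* tags we are willing to hand out */
	int real_max_depth;	/* slots actually allocated in tag_index */
};

static int tag_table_init(struct tag_table *t, int depth)
{
	t->tag_index = calloc((size_t)depth, sizeof(*t->tag_index));
	if (!t->tag_index)
		return -1;
	t->max_depth = depth;
	t->real_max_depth = depth;
	return 0;
}

/* Lookups are bounded by the allocation, not by the advertised depth. */
static void *tag_table_find(const struct tag_table *t, int tag)
{
	if (tag < 0 || tag >= t->real_max_depth)
		return NULL;
	return t->tag_index[tag];
}

static int tag_table_resize(struct tag_table *t, int new_depth)
{
	void **bigger;

	assert(t->max_depth <= t->real_max_depth);
	if (new_depth <= 0)
		return -1;

	/*
	 * Entries with tags between new_depth and real_max_depth may
	 * still be in flight, so keep the array and only lower (or
	 * raise, within capacity) the advertised depth.
	 */
	if (new_depth <= t->real_max_depth) {
		t->max_depth = new_depth;
		return 0;
	}

	/* Growing past capacity: allocate a larger array and copy over. */
	bigger = calloc((size_t)new_depth, sizeof(*bigger));
	if (!bigger)
		return -1;
	memcpy(bigger, t->tag_index,
	       (size_t)t->real_max_depth * sizeof(*bigger));
	free(t->tag_index);
	t->tag_index = bigger;
	t->max_depth = new_depth;
	t->real_max_depth = new_depth;
	return 0;
}
```

With this split, an entry stored at tag 30 stays reachable after the depth drops from 32 to 16, because the lookup is checked against real_max_depth rather than max_depth. A real allocator would additionally refuse to hand out new tags at or above max_depth.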
* Re: calling scsi_adjust_queue_depth() during I/O...
2005-08-05 12:33 ` Tejun Heo
@ 2005-08-05 15:55 ` Andrew Vasquez
2005-08-05 15:59 ` Jens Axboe
2005-08-05 16:32 ` James Bottomley
1 sibling, 1 reply; 12+ messages in thread
From: Andrew Vasquez @ 2005-08-05 15:55 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jens Axboe, Linux-SCSI Mailing List
On Fri, 05 Aug 2005, Tejun Heo wrote:

> > Oops, forget about the previous mail. Above patch make it into the
> > tree and it's the source of the problem. My git HEAD was pointing at
> > the latest update but I haven't updated my cache, so I was looking at
> > the old source tree. My apologies for the hassle and the bug.
> >
> > Original code was broken in the following two points.
> >
> > * tag_index wasn't allocated fully
> > * tag_map's extra bits were always initialized w/ 1's.
> >
> > The first bug is critical and the second bug prevents proper enlarging
> > of tag map. However, the second bug effectively masks the first bug
> > avoiding critical problem. My above mentioned patch broke things
> > seriously when reducing tag size on flight.
> >
> > Again, my apologies and patch will soon follow.
>
> Here's the fix. It basically revives bqt->real_max_depth sans
> allocation optimization in init_tag_map. I've also added a comment
> explicitly noting that tag map cannot be shrunk to prevent other
> morons like me. :-( Please try this one and let me know how it works.
> If this is the correct fix, I'll repost properly to Jens and lkml with
> detailed explanation on how it was broken in the original code and how
> I broke it with my previous patch. Sorry.

OK, 20 minutes into lowering and raising the queue-depth and
everything appears to be working fine. I'll continue banging away
with my configuration and let you know if anything else comes up.
Looks good so far.

Thanks,
Andrew
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: calling scsi_adjust_queue_depth() during I/O...
2005-08-05 15:55 ` Andrew Vasquez
@ 2005-08-05 15:59 ` Jens Axboe
2005-08-05 17:15 ` Tejun Heo
0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2005-08-05 15:59 UTC (permalink / raw)
To: Andrew Vasquez; +Cc: Tejun Heo, Linux-SCSI Mailing List
On Fri, Aug 05 2005, Andrew Vasquez wrote:
> On Fri, 05 Aug 2005, Tejun Heo wrote:
>
> > > Oops, forget about the previous mail. Above patch make it into the
> > > tree and it's the source of the problem. My git HEAD was pointing at
> > > the latest update but I haven't updated my cache, so I was looking at
> > > the old source tree. My apologies for the hassle and the bug.
> > >
> > > Original code was broken in the following two points.
> > >
> > > * tag_index wasn't allocated fully
> > > * tag_map's extra bits were always initialized w/ 1's.
> > >
> > > The first bug is critical and the second bug prevents proper enlarging
> > > of tag map. However, the second bug effectively masks the first bug
> > > avoiding critical problem. My above mentioned patch broke things
> > > seriously when reducing tag size on flight.
> > >
> > > Again, my apologies and patch will soon follow.
> >
> > Here's the fix. It basically revives bqt->real_max_depth sans
> > allocation optimization in init_tag_map. I've also added a comment
> > explicitly noting that tag map cannot be shrunk to prevent other
> > morons like me. :-( Please try this one and let me know how it works.
> > If this is the correct fix, I'll repost properly to Jens and lkml with
> > detailed explanation on how it was broken in the original code and how
> > I broke it with my previous patch. Sorry.
>
> OK, 20 minutes into lowering and raising the queue-depth and
> everything appears to be working fine. I'll continue banging away
> with my configuration and let you know if anything else comes up.
> Looks good so far.

Thanks for fixing it so quickly, Tejun! I'll be on vacation next week,
can you make sure it gets to Andrew?

--
Jens Axboe
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: calling scsi_adjust_queue_depth() during I/O...
2005-08-05 15:59 ` Jens Axboe
@ 2005-08-05 17:15 ` Tejun Heo
0 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2005-08-05 17:15 UTC (permalink / raw)
To: Jens Axboe; +Cc: Andrew Vasquez, Linux-SCSI Mailing List
On Fri, Aug 05, 2005 at 05:59:06PM +0200, Jens Axboe wrote:
> On Fri, Aug 05 2005, Andrew Vasquez wrote:
> > On Fri, 05 Aug 2005, Tejun Heo wrote:
> >
> > > > Oops, forget about the previous mail. Above patch make it into the
> > > > tree and it's the source of the problem. My git HEAD was pointing at
> > > > the latest update but I haven't updated my cache, so I was looking at
> > > > the old source tree. My apologies for the hassle and the bug.
> > > >
> > > > Original code was broken in the following two points.
> > > >
> > > > * tag_index wasn't allocated fully
> > > > * tag_map's extra bits were always initialized w/ 1's.
> > > >
> > > > The first bug is critical and the second bug prevents proper enlarging
> > > > of tag map. However, the second bug effectively masks the first bug
> > > > avoiding critical problem. My above mentioned patch broke things
> > > > seriously when reducing tag size on flight.
> > > >
> > > > Again, my apologies and patch will soon follow.
> > >
> > > Here's the fix. It basically revives bqt->real_max_depth sans
> > > allocation optimization in init_tag_map. I've also added a comment
> > > explicitly noting that tag map cannot be shrunk to prevent other
> > > morons like me. :-( Please try this one and let me know how it works.
> > > If this is the correct fix, I'll repost properly to Jens and lkml with
> > > detailed explanation on how it was broken in the original code and how
> > > I broke it with my previous patch. Sorry.
> >
> > OK, 20 minutes into lowering and raising the queue-depth and
> > everything appears to be working fine. I'll continue banging away
> > with my configuration and let you know if anything else comes up.
> > Looks good so far.
>
> Thanks for fixing it so quickly, Tejun! I'll be on vacation next week,
> can you make sure it gets to Andrew?
>

Meaning... Andrew Morton, right? I'll make sure of that. Have fun on
your vacation. :-)

--
tejun
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: calling scsi_adjust_queue_depth() during I/O...
2005-08-05 12:33 ` Tejun Heo
2005-08-05 15:55 ` Andrew Vasquez
@ 2005-08-05 16:32 ` James Bottomley
2005-08-05 17:10 ` Tejun Heo
1 sibling, 1 reply; 12+ messages in thread
From: James Bottomley @ 2005-08-05 16:32 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jens Axboe, Andrew Vasquez, Linux-SCSI Mailing List
On Fri, 2005-08-05 at 21:33 +0900, Tejun Heo wrote:
> Here's the fix. It basically revives bqt->real_max_depth sans
> allocation optimization in init_tag_map. I've also added a comment
> explicitly noting that tag map cannot be shrunk to prevent other
> morons like me. :-( Please try this one and let me know how it works.
> If this is the correct fix, I'll repost properly to Jens and lkml with
> detailed explanation on how it was broken in the original code and how
> I broke it with my previous patch. Sorry.

Actually, if you really want to adjust the array size downwards, there's
a way we can do it:

- If the bits that would be lost on shrinkage are all zero at the time
  blk_queue_resize_tags() is called, that means that there are no
  outstanding tags up there and the array can be shrunk immediately.

- If there are outstanding tags between the new and the old depth, the
  array can be shrunk when the last one of these returns, say in
  blk_rq_end_tag()

James
^ permalink raw reply [flat|nested] 12+ messages in thread
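The two cases James describes can be sketched in user space as follows. This is a rough illustration only, assuming a plain bool array instead of the kernel's bitmap and no locking. The struct shrinkable_tags type and the st_*() helpers are hypothetical names, and the completion hook stands in for whatever the kernel's end-tag path would be (the patch earlier in the thread names it blk_queue_end_tag()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model; not the kernel's struct blk_queue_tag. */
struct shrinkable_tags {
	bool *in_use;	/* in_use[tag] is true while that tag is outstanding */
	int depth;	/* number of slots currently allocated */
	int pending;	/* deferred shrink target, or -1 when none is pending */
};

static int st_init(struct shrinkable_tags *t, int depth)
{
	assert(depth > 0);
	t->in_use = calloc((size_t)depth, sizeof(*t->in_use));
	if (!t->in_use)
		return -1;
	t->depth = depth;
	t->pending = -1;
	return 0;
}

/* True if any tag in [new_depth, depth) is still outstanding. */
static bool st_high_tags_busy(const struct shrinkable_tags *t, int new_depth)
{
	int tag;

	for (tag = new_depth; tag < t->depth; tag++)
		if (t->in_use[tag])
			return true;
	return false;
}

/* Release the unused tail; if realloc fails the old block is simply kept. */
static void st_do_shrink(struct shrinkable_tags *t, int new_depth)
{
	bool *smaller = realloc(t->in_use, (size_t)new_depth * sizeof(*smaller));

	if (smaller)
		t->in_use = smaller;
	t->depth = new_depth;
	t->pending = -1;
}

/* Case 1: shrink now if the tail is idle. Case 2: otherwise defer it. */
static bool st_request_shrink(struct shrinkable_tags *t, int new_depth)
{
	assert(new_depth > 0 && new_depth <= t->depth);
	if (!st_high_tags_busy(t, new_depth)) {
		st_do_shrink(t, new_depth);
		return true;
	}
	t->pending = new_depth;
	return false;
}

/* New tags are only handed out below the pending target, so the tail drains. */
static int st_start(struct shrinkable_tags *t)
{
	int limit = t->pending >= 0 ? t->pending : t->depth;
	int tag;

	for (tag = 0; tag < limit; tag++) {
		if (!t->in_use[tag]) {
			t->in_use[tag] = true;
			return tag;
		}
	}
	return -1;
}

/* Completion path: the last high tag to return performs the deferred shrink. */
static void st_end(struct shrinkable_tags *t, int tag)
{
	assert(tag >= 0 && tag < t->depth);
	t->in_use[tag] = false;
	if (t->pending >= 0 && !st_high_tags_busy(t, t->pending))
		st_do_shrink(t, t->pending);
}
```

The cost of this design is a scan of the tail on every completion while a shrink is pending; a counter of outstanding high tags would avoid that, at the price of more state to keep consistent.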
* Re: calling scsi_adjust_queue_depth() during I/O...
2005-08-05 16:32 ` James Bottomley
@ 2005-08-05 17:10 ` Tejun Heo
2005-08-05 17:20 ` James Bottomley
2005-08-05 17:24 ` Andrew Vasquez
0 siblings, 2 replies; 12+ messages in thread
From: Tejun Heo @ 2005-08-05 17:10 UTC (permalink / raw)
To: James Bottomley; +Cc: Jens Axboe, Andrew Vasquez, Linux-SCSI Mailing List
On Fri, Aug 05, 2005 at 11:32:07AM -0500, James Bottomley wrote:
> On Fri, 2005-08-05 at 21:33 +0900, Tejun Heo wrote:
> > Here's the fix. It basically revives bqt->real_max_depth sans
> > allocation optimization in init_tag_map. I've also added a comment
> > explicitly noting that tag map cannot be shrunk to prevent other
> > morons like me. :-( Please try this one and let me know how it works.
> > If this is the correct fix, I'll repost properly to Jens and lkml with
> > detailed explanation on how it was broken in the original code and how
> > I broke it with my previous patch. Sorry.
>
> Actually, if you really want to adjust the array size downwards, there's
> a way we can do it:
>
> - If the bits that would be lost on shrinkage are all zero at the time
>   blk_queue_resize_tags() is called, that means that there are no
>   outstanding tags up there and the array can be shrunk immediately.
>
> - If there are outstanding tags between the new and the old depth, the
>   array can be shrunk when the last one of these returns, say in
>   blk_rq_end_tag()
>

Hello, James.

Yes, we can do that, but I'm not sure if that would be necessary.
AFAIK, queues are normally not very deep and a tag only occupies one
pointer and one bit. Also, the shrinking operation isn't very common,
at least for traditional SPI devices and SATA drives, I think.

Are newer SCSI devices (say, SAS/iSCSI) different? - like having very
deep queue and needing dynamic queue depth adjustment? If that's the
case, I think I can implement shrinking in a separate patch. (and try
not to screw up this time ;-)

Thank you.

--
tejun
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: calling scsi_adjust_queue_depth() during I/O...
2005-08-05 17:10 ` Tejun Heo
@ 2005-08-05 17:20 ` James Bottomley
2005-08-05 17:24 ` Andrew Vasquez
1 sibling, 0 replies; 12+ messages in thread
From: James Bottomley @ 2005-08-05 17:20 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jens Axboe, Andrew Vasquez, Linux-SCSI Mailing List
On Sat, 2005-08-06 at 02:10 +0900, Tejun Heo wrote:
> Yes, we can do that, but I'm not sure if that would be necessary.
> AFAIK, queues are normally not very deep and a tag only occupies one
> pointer and one bit. Also, the shrinking operation isn't very common,
> at least for traditional SPI devices and SATA drives, I think.
>
> Are newer SCSI devices (say, SAS/iSCSI) different? - like having very
> deep queue and needing dynamic queue depth adjustment? If that's the
> case, I think I can implement shrinking in a separate patch. (and try
> not to screw up this time ;-)

Well, yes, there are reasons for wanting deeper queues, but I'd leave it
for the time being.

What I'm looking into is support for aic7xxx/aic79xx queueing. There,
the sequencer has to have a globally unique tag (from which it generates
the device locally unique tag internally). That gives TCQ depths of up
to 512 I believe. However, still probably not a significant waste of
memory to worry about.

James
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: calling scsi_adjust_queue_depth() during I/O...
2005-08-05 17:10 ` Tejun Heo
2005-08-05 17:20 ` James Bottomley
@ 2005-08-05 17:24 ` Andrew Vasquez
1 sibling, 0 replies; 12+ messages in thread
From: Andrew Vasquez @ 2005-08-05 17:24 UTC (permalink / raw)
To: Tejun Heo; +Cc: James Bottomley, Jens Axboe, Linux-SCSI Mailing List
On Sat, 06 Aug 2005, Tejun Heo wrote:

> On Fri, Aug 05, 2005 at 11:32:07AM -0500, James Bottomley wrote:
> > On Fri, 2005-08-05 at 21:33 +0900, Tejun Heo wrote:
> > > Here's the fix. It basically revives bqt->real_max_depth sans
> > > allocation optimization in init_tag_map. I've also added a comment
> > > explicitly noting that tag map cannot be shrunk to prevent other
> > > morons like me. :-( Please try this one and let me know how it works.
> > > If this is the correct fix, I'll repost properly to Jens and lkml with
> > > detailed explanation on how it was broken in the original code and how
> > > I broke it with my previous patch. Sorry.
> >
> > Actually, if you really want to adjust the array size downwards, there's
> > a way we can do it:
> >
> > - If the bits that would be lost on shrinkage are all zero at the time
> >   blk_queue_resize_tags() is called, that means that there are no
> >   outstanding tags up there and the array can be shrunk immediately.
> >
> > - If there are outstanding tags between the new and the old depth, the
> >   array can be shrunk when the last one of these returns, say in
> >   blk_rq_end_tag()
> >
>
> Hello, James.
>
> Yes, we can do that, but I'm not sure if that would be necessary.
> AFAIK, queues are normally not very deep and a tag only occupies one
> pointer and one bit. Also, the shrinking operation isn't very common,
> at least for traditional SPI devices and SATA drives, I think.
>
> Are newer SCSI devices (say, SAS/iSCSI) different? - like having very
> deep queue and needing dynamic queue depth adjustment? If that's the
> case, I think I can implement shrinking in a separate patch. (and try
> not to screw up this time ;-)

Well from the fibre-channel side of the storage world, a piece of
storage (RAID box) is generally parcelled out to a large number of
hosts. These boxes tend to have a finite amount of resources
available to service requests to those hosts, so depending of course
on the amount of traffic being directed to the storage, QUEUE_FULL
cases may arise causing a particular host (or a set of hosts), to
throttle down their queue-depths for some period of time. The trick
though, is to dynamically throttle the depth up so as to fully utilise
the shared resources of the storage.

--
Andrew Vasquez
^ permalink raw reply [flat|nested] 12+ messages in thread
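As a rough illustration of the throttle-down and ramp-up behaviour Andrew describes, the sketch below tracks a per-device depth in user space. The policy is an assumption chosen only for the example (halve on QUEUE_FULL, add one after a run of clean completions); it is not taken from qla2xxx or from the SCSI midlayer, and struct depth_throttle and the throttle_*() helpers are hypothetical names. In a driver, the returned value is the sort of depth one might then pass to scsi_adjust_queue_depth().

```c
#include <assert.h>

/* Hypothetical per-device throttle state; not a kernel structure. */
struct depth_throttle {
	int cur_depth;		/* depth currently in effect */
	int max_depth;		/* ceiling to ramp back up to */
	int ramp_interval;	/* clean completions needed per +1 step */
	int good_streak;	/* clean completions since the last change */
};

static void throttle_init(struct depth_throttle *t, int max_depth,
			  int ramp_interval)
{
	assert(max_depth >= 1 && ramp_interval >= 1);
	t->cur_depth = max_depth;
	t->max_depth = max_depth;
	t->ramp_interval = ramp_interval;
	t->good_streak = 0;
}

/* Target reported QUEUE_FULL: back off by half, but never below 1. */
static int throttle_on_queue_full(struct depth_throttle *t)
{
	t->cur_depth /= 2;
	if (t->cur_depth < 1)
		t->cur_depth = 1;
	t->good_streak = 0;
	return t->cur_depth;
}

/* Command completed cleanly: creep back up toward max_depth. */
static int throttle_on_good_completion(struct depth_throttle *t)
{
	if (t->cur_depth >= t->max_depth)
		return t->cur_depth;
	if (++t->good_streak >= t->ramp_interval) {
		t->cur_depth++;
		t->good_streak = 0;
	}
	return t->cur_depth;
}
```

Because a host using a policy like this lowers and raises its depth repeatedly while commands are outstanding, it exercises exactly the resize-during-I/O path that the patch in this thread repairs.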
end of thread, other threads:[~2005-08-05 17:24 UTC | newest]
Thread overview: 12+ messages
2005-08-04 23:41 calling scsi_adjust_queue_depth() during I/O Andrew Vasquez
2005-08-05 7:57 ` Jens Axboe
2005-08-05 11:09 ` Tejun Heo
2005-08-05 11:43 ` Tejun Heo
2005-08-05 12:33 ` Tejun Heo
2005-08-05 15:55 ` Andrew Vasquez
2005-08-05 15:59 ` Jens Axboe
2005-08-05 17:15 ` Tejun Heo
2005-08-05 16:32 ` James Bottomley
2005-08-05 17:10 ` Tejun Heo
2005-08-05 17:20 ` James Bottomley
2005-08-05 17:24 ` Andrew Vasquez