* Current git --> kaboom [bisect] seems IDE related. @ 2008-02-09 19:32 Sebastian Siewior 2008-02-09 20:28 ` Bartlomiej Zolnierkiewicz 0 siblings, 1 reply; 27+ messages in thread From: Sebastian Siewior @ 2008-02-09 19:32 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz; +Cc: Tejun Heo, Sergei Shtylyov, linux-ide Hello, current git results in a freeze. I tracked it down to | error = type->get_sb(type, flags, name, data, mnt); in vfs_kern_mount(), fs/super.c (what is called from mount_root() during the boot process). I use XFS on my rootfs partition. After this, I started to bisect. During the bisect process I stepped into two things. The commit |commit 813a0eb233ee67d7166241a8b389b6a76f2247f9 |Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> |Date: Fri Jan 25 22:17:10 2008 +0100 | | ide: switch idedisk_prepare_flush() to use REQ_TYPE_ATA_TASKFILE requests | | Based on the earlier work by Tejun Heo. | | There should be no functionality changes caused by this patch. | | Cc: Tejun Heo <htejun@gmail.com> | Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> | |:040000 040000 e4708c62aef233d2b07da20aedd644182258f1f1 66aea951e7c32aa82621d8460b0f97ff132f23a6 M drivers turned the working boot process into [1]. Since this isn't the freeze I was looking I continued my bisect process. The commit |10d90157c83d4b6743c9063c36f9e7f27aa254b6 is first bad commit |commit 10d90157c83d4b6743c9063c36f9e7f27aa254b6 |Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> |Date: Fri Jan 25 22:17:16 2008 +0100 | | ide: convert do_rw_taskfile() to use ->data_phase | | * Use task->data_phase in do_rw_taskfile() to decide what to do. | | * task->prehandler is only used by TASKFILE[_MULTI]_OUT so just | use pre_task_out_intr() directly and remove no longer needed | 'prehandler' field from ide_task_t. | | * Remove no longer needed ide_pre_handler_t type. | | There should be no functionality changes caused by this patch. | | Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> | Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> | |:040000 040000 4197f4ab343f54f9eab32000df33eb118121f9fa e38665b62f3fc315e5b3bd9bb9c49bec4a961d43 M drivers |:040000 040000 a9abbd138a5713e16406fd63be0799e109849c53 3ee5fade5fc65ea1b41b6ca55b3de2dbb904623b M include turned the Oops into the freeze. There was no working commit in between (well, git-bisect did not find one). My .config [2], my lspci and cpuinfo [3]. [1] http://download.breakpoint.cc/lnx-2.6.24-post-crash.jpg [2] http://download.breakpoint.cc/notebook-config.txt [3] http://download.breakpoint.cc/system.txt Sebastian ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Current git --> kaboom [bisect] seems IDE related. 2008-02-09 19:32 Current git --> kaboom [bisect] seems IDE related Sebastian Siewior @ 2008-02-09 20:28 ` Bartlomiej Zolnierkiewicz 2008-02-09 21:22 ` Sebastian Siewior 0 siblings, 1 reply; 27+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-02-09 20:28 UTC (permalink / raw) To: Sebastian Siewior; +Cc: Tejun Heo, Sergei Shtylyov, linux-ide Hi, On Saturday 09 February 2008, Sebastian Siewior wrote: > Hello, > > current git results in a freeze. I tracked it down to > > | error = type->get_sb(type, flags, name, data, mnt); > in vfs_kern_mount(), fs/super.c (what is called from mount_root() during > the boot process). I use XFS on my rootfs partition. After this, I > started to bisect. > During the bisect process I stepped into two things. The commit > > |commit 813a0eb233ee67d7166241a8b389b6a76f2247f9 > |Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> > |Date: Fri Jan 25 22:17:10 2008 +0100 > | > | ide: switch idedisk_prepare_flush() to use REQ_TYPE_ATA_TASKFILE requests > | > | Based on the earlier work by Tejun Heo. > | > | There should be no functionality changes caused by this patch. > | > | Cc: Tejun Heo <htejun@gmail.com> > | Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> > | > |:040000 040000 e4708c62aef233d2b07da20aedd644182258f1f1 66aea951e7c32aa82621d8460b0f97ff132f23a6 M drivers > > turned the working boot process into [1]. Since this isn't the freeze I > was looking I continued my bisect process. The commit It could be that due to block layer changes to I/O barrier handling allocating ATA command structure on the stack is no longer safe. Please try booting with "hdx=noflush" kernel parameter or please try the attached patch which should fix the issue (if my theory is correct). [...] > turned the Oops into the freeze. There was no working commit in between > (well, git-bisect did not find one). > > My .config [2], my lspci and cpuinfo [3]. > > [1] http://download.breakpoint.cc/lnx-2.6.24-post-crash.jpg > [2] http://download.breakpoint.cc/notebook-config.txt > [3] http://download.breakpoint.cc/system.txt --- drivers/ide/ide-disk.c | 14 +-- include/linux/ide.h | 206 ++++++++++++++++++++++++------------------------- 2 files changed, 111 insertions(+), 109 deletions(-) Index: b/drivers/ide/ide-disk.c =================================================================== --- a/drivers/ide/ide-disk.c +++ b/drivers/ide/ide-disk.c @@ -590,20 +590,20 @@ static ide_proc_entry_t idedisk_proc[] = static void idedisk_prepare_flush(struct request_queue *q, struct request *rq) { ide_drive_t *drive = q->queuedata; - ide_task_t task; + ide_task_t *task = &drive->tf_task; - memset(&task, 0, sizeof(task)); + memset(task, 0, sizeof(*task)); if (ide_id_has_flush_cache_ext(drive->id) && (drive->capacity64 >= (1UL << 28))) - task.tf.command = WIN_FLUSH_CACHE_EXT; + task->tf.command = WIN_FLUSH_CACHE_EXT; else - task.tf.command = WIN_FLUSH_CACHE; - task.tf_flags = IDE_TFLAG_OUT_TF | IDE_TFLAG_OUT_DEVICE; - task.data_phase = TASKFILE_NO_DATA; + task->tf.command = WIN_FLUSH_CACHE; + task->tf_flags = IDE_TFLAG_OUT_TF | IDE_TFLAG_OUT_DEVICE; + task->data_phase = TASKFILE_NO_DATA; rq->cmd_type = REQ_TYPE_ATA_TASKFILE; rq->cmd_flags |= REQ_SOFTBARRIER; - rq->special = &task; + rq->special = task; } /* Index: b/include/linux/ide.h =================================================================== --- a/include/linux/ide.h +++ b/include/linux/ide.h @@ -32,6 +32,108 @@ # define SUPPORT_VLB_SYNC 1 #endif +enum { + IDE_TFLAG_LBA48 = (1 << 0), + IDE_TFLAG_NO_SELECT_MASK = (1 << 1), + IDE_TFLAG_FLAGGED = (1 << 2), + IDE_TFLAG_OUT_DATA = (1 << 3), + IDE_TFLAG_OUT_HOB_FEATURE = (1 << 4), + IDE_TFLAG_OUT_HOB_NSECT = (1 << 5), + IDE_TFLAG_OUT_HOB_LBAL = (1 << 6), + IDE_TFLAG_OUT_HOB_LBAM = (1 << 7), + IDE_TFLAG_OUT_HOB_LBAH = (1 << 8), + IDE_TFLAG_OUT_HOB = IDE_TFLAG_OUT_HOB_FEATURE | + IDE_TFLAG_OUT_HOB_NSECT | + IDE_TFLAG_OUT_HOB_LBAL | + IDE_TFLAG_OUT_HOB_LBAM | + IDE_TFLAG_OUT_HOB_LBAH, + IDE_TFLAG_OUT_FEATURE = (1 << 9), + IDE_TFLAG_OUT_NSECT = (1 << 10), + IDE_TFLAG_OUT_LBAL = (1 << 11), + IDE_TFLAG_OUT_LBAM = (1 << 12), + IDE_TFLAG_OUT_LBAH = (1 << 13), + IDE_TFLAG_OUT_TF = IDE_TFLAG_OUT_FEATURE | + IDE_TFLAG_OUT_NSECT | + IDE_TFLAG_OUT_LBAL | + IDE_TFLAG_OUT_LBAM | + IDE_TFLAG_OUT_LBAH, + IDE_TFLAG_OUT_DEVICE = (1 << 14), + IDE_TFLAG_WRITE = (1 << 15), + IDE_TFLAG_FLAGGED_SET_IN_FLAGS = (1 << 16), + IDE_TFLAG_IN_DATA = (1 << 17), + IDE_TFLAG_CUSTOM_HANDLER = (1 << 18), + IDE_TFLAG_DMA_PIO_FALLBACK = (1 << 19), + IDE_TFLAG_IN_HOB_FEATURE = (1 << 20), + IDE_TFLAG_IN_HOB_NSECT = (1 << 21), + IDE_TFLAG_IN_HOB_LBAL = (1 << 22), + IDE_TFLAG_IN_HOB_LBAM = (1 << 23), + IDE_TFLAG_IN_HOB_LBAH = (1 << 24), + IDE_TFLAG_IN_HOB_LBA = IDE_TFLAG_IN_HOB_LBAL | + IDE_TFLAG_IN_HOB_LBAM | + IDE_TFLAG_IN_HOB_LBAH, + IDE_TFLAG_IN_HOB = IDE_TFLAG_IN_HOB_FEATURE | + IDE_TFLAG_IN_HOB_NSECT | + IDE_TFLAG_IN_HOB_LBA, + IDE_TFLAG_IN_NSECT = (1 << 25), + IDE_TFLAG_IN_LBAL = (1 << 26), + IDE_TFLAG_IN_LBAM = (1 << 27), + IDE_TFLAG_IN_LBAH = (1 << 28), + IDE_TFLAG_IN_LBA = IDE_TFLAG_IN_LBAL | + IDE_TFLAG_IN_LBAM | + IDE_TFLAG_IN_LBAH, + IDE_TFLAG_IN_TF = IDE_TFLAG_IN_NSECT | + IDE_TFLAG_IN_LBA, + IDE_TFLAG_IN_DEVICE = (1 << 29), + IDE_TFLAG_HOB = IDE_TFLAG_OUT_HOB | + IDE_TFLAG_IN_HOB, + IDE_TFLAG_TF = IDE_TFLAG_OUT_TF | + IDE_TFLAG_IN_TF, + IDE_TFLAG_DEVICE = IDE_TFLAG_OUT_DEVICE | + IDE_TFLAG_IN_DEVICE, + /* force 16-bit I/O operations */ + IDE_TFLAG_IO_16BIT = (1 << 30), +}; + +struct ide_taskfile { + u8 hob_data; /* 0: high data byte (for TASKFILE IOCTL) */ + + u8 hob_feature; /* 1-5: additional data to support LBA48 */ + u8 hob_nsect; + u8 hob_lbal; + u8 hob_lbam; + u8 hob_lbah; + + u8 data; /* 6: low data byte (for TASKFILE IOCTL) */ + + union { /* 7: */ + u8 error; /* read: error */ + u8 feature; /* write: feature */ + }; + + u8 nsect; /* 8: number of sectors */ + u8 lbal; /* 9: LBA low */ + u8 lbam; /* 10: LBA mid */ + u8 lbah; /* 11: LBA high */ + + u8 device; /* 12: device select */ + + union { /* 13: */ + u8 status; /* read: status */ + u8 command; /* write: command */ + }; +}; + +typedef struct ide_task_s { + union { + struct ide_taskfile tf; + u8 tf_array[14]; + }; + u32 tf_flags; + int data_phase; + struct request *rq; /* copy of request */ + void *special; /* valid_t generally */ +} ide_task_t; + /* * Used to indicate "no IRQ", should be a value that cannot be an IRQ * number. @@ -386,6 +488,8 @@ typedef struct ide_drive_s { struct list_head list; struct device gendev; struct completion gendev_rel_comp; /* to deal with device release() */ + + struct ide_task_s tf_task; } ide_drive_t; #define to_ide_device(dev)container_of(dev, ide_drive_t, gendev) @@ -798,108 +902,6 @@ extern int ide_do_drive_cmd(ide_drive_t extern void ide_end_drive_cmd(ide_drive_t *, u8, u8); -enum { - IDE_TFLAG_LBA48 = (1 << 0), - IDE_TFLAG_NO_SELECT_MASK = (1 << 1), - IDE_TFLAG_FLAGGED = (1 << 2), - IDE_TFLAG_OUT_DATA = (1 << 3), - IDE_TFLAG_OUT_HOB_FEATURE = (1 << 4), - IDE_TFLAG_OUT_HOB_NSECT = (1 << 5), - IDE_TFLAG_OUT_HOB_LBAL = (1 << 6), - IDE_TFLAG_OUT_HOB_LBAM = (1 << 7), - IDE_TFLAG_OUT_HOB_LBAH = (1 << 8), - IDE_TFLAG_OUT_HOB = IDE_TFLAG_OUT_HOB_FEATURE | - IDE_TFLAG_OUT_HOB_NSECT | - IDE_TFLAG_OUT_HOB_LBAL | - IDE_TFLAG_OUT_HOB_LBAM | - IDE_TFLAG_OUT_HOB_LBAH, - IDE_TFLAG_OUT_FEATURE = (1 << 9), - IDE_TFLAG_OUT_NSECT = (1 << 10), - IDE_TFLAG_OUT_LBAL = (1 << 11), - IDE_TFLAG_OUT_LBAM = (1 << 12), - IDE_TFLAG_OUT_LBAH = (1 << 13), - IDE_TFLAG_OUT_TF = IDE_TFLAG_OUT_FEATURE | - IDE_TFLAG_OUT_NSECT | - IDE_TFLAG_OUT_LBAL | - IDE_TFLAG_OUT_LBAM | - IDE_TFLAG_OUT_LBAH, - IDE_TFLAG_OUT_DEVICE = (1 << 14), - IDE_TFLAG_WRITE = (1 << 15), - IDE_TFLAG_FLAGGED_SET_IN_FLAGS = (1 << 16), - IDE_TFLAG_IN_DATA = (1 << 17), - IDE_TFLAG_CUSTOM_HANDLER = (1 << 18), - IDE_TFLAG_DMA_PIO_FALLBACK = (1 << 19), - IDE_TFLAG_IN_HOB_FEATURE = (1 << 20), - IDE_TFLAG_IN_HOB_NSECT = (1 << 21), - IDE_TFLAG_IN_HOB_LBAL = (1 << 22), - IDE_TFLAG_IN_HOB_LBAM = (1 << 23), - IDE_TFLAG_IN_HOB_LBAH = (1 << 24), - IDE_TFLAG_IN_HOB_LBA = IDE_TFLAG_IN_HOB_LBAL | - IDE_TFLAG_IN_HOB_LBAM | - IDE_TFLAG_IN_HOB_LBAH, - IDE_TFLAG_IN_HOB = IDE_TFLAG_IN_HOB_FEATURE | - IDE_TFLAG_IN_HOB_NSECT | - IDE_TFLAG_IN_HOB_LBA, - IDE_TFLAG_IN_NSECT = (1 << 25), - IDE_TFLAG_IN_LBAL = (1 << 26), - IDE_TFLAG_IN_LBAM = (1 << 27), - IDE_TFLAG_IN_LBAH = (1 << 28), - IDE_TFLAG_IN_LBA = IDE_TFLAG_IN_LBAL | - IDE_TFLAG_IN_LBAM | - IDE_TFLAG_IN_LBAH, - IDE_TFLAG_IN_TF = IDE_TFLAG_IN_NSECT | - IDE_TFLAG_IN_LBA, - IDE_TFLAG_IN_DEVICE = (1 << 29), - IDE_TFLAG_HOB = IDE_TFLAG_OUT_HOB | - IDE_TFLAG_IN_HOB, - IDE_TFLAG_TF = IDE_TFLAG_OUT_TF | - IDE_TFLAG_IN_TF, - IDE_TFLAG_DEVICE = IDE_TFLAG_OUT_DEVICE | - IDE_TFLAG_IN_DEVICE, - /* force 16-bit I/O operations */ - IDE_TFLAG_IO_16BIT = (1 << 30), -}; - -struct ide_taskfile { - u8 hob_data; /* 0: high data byte (for TASKFILE IOCTL) */ - - u8 hob_feature; /* 1-5: additional data to support LBA48 */ - u8 hob_nsect; - u8 hob_lbal; - u8 hob_lbam; - u8 hob_lbah; - - u8 data; /* 6: low data byte (for TASKFILE IOCTL) */ - - union { /* 7: */ - u8 error; /* read: error */ - u8 feature; /* write: feature */ - }; - - u8 nsect; /* 8: number of sectors */ - u8 lbal; /* 9: LBA low */ - u8 lbam; /* 10: LBA mid */ - u8 lbah; /* 11: LBA high */ - - u8 device; /* 12: device select */ - - union { /* 13: */ - u8 status; /* read: status */ - u8 command; /* write: command */ - }; -}; - -typedef struct ide_task_s { - union { - struct ide_taskfile tf; - u8 tf_array[14]; - }; - u32 tf_flags; - int data_phase; - struct request *rq; /* copy of request */ - void *special; /* valid_t generally */ -} ide_task_t; - void ide_tf_load(ide_drive_t *, ide_task_t *); void ide_tf_read(ide_drive_t *, ide_task_t *); ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Current git --> kaboom [bisect] seems IDE related. 2008-02-09 20:28 ` Bartlomiej Zolnierkiewicz @ 2008-02-09 21:22 ` Sebastian Siewior 2008-02-09 23:06 ` Bartlomiej Zolnierkiewicz 0 siblings, 1 reply; 27+ messages in thread From: Sebastian Siewior @ 2008-02-09 21:22 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz; +Cc: Tejun Heo, Sergei Shtylyov, linux-ide * Bartlomiej Zolnierkiewicz | 2008-02-09 21:28:39 [+0100]: >It could be that due to block layer changes to I/O barrier handling >allocating ATA command structure on the stack is no longer safe. > >Please try booting with "hdx=noflush" kernel parameter or please try >the attached patch which should fix the issue (if my theory is correct). That was quick. hdx=noflush worked well. The patch delayed the disk access as you can see in [1]. After the IPI message the hd led enlightened and was active until "lost interrupt" message. The following boot process was verry slow. Basically everything what required disk access took a while and the hd-led was almost continuous on. I aborted the boot process via magic keys after about 15 minutes. [1] http://download.breakpoint.cc/lnx-2.6.24-post-long-boot.jpg Sebastian ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Current git --> kaboom [bisect] seems IDE related. 2008-02-09 21:22 ` Sebastian Siewior @ 2008-02-09 23:06 ` Bartlomiej Zolnierkiewicz 2008-02-10 5:26 ` Christoph Hellwig 0 siblings, 1 reply; 27+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-02-09 23:06 UTC (permalink / raw) To: Sebastian Siewior Cc: Tejun Heo, Sergei Shtylyov, linux-ide, Christoph Hellwig, Jens Axboe On Saturday 09 February 2008, Sebastian Siewior wrote: > * Bartlomiej Zolnierkiewicz | 2008-02-09 21:28:39 [+0100]: > > >It could be that due to block layer changes to I/O barrier handling > >allocating ATA command structure on the stack is no longer safe. Update: it was never a very brilliant idea... > >Please try booting with "hdx=noflush" kernel parameter or please try > >the attached patch which should fix the issue (if my theory is correct). > That was quick. > hdx=noflush worked well. The patch delayed the disk access as you can > see in [1]. After the IPI message the hd led enlightened and was > active until "lost interrupt" message. The following boot process was > verry slow. Basically everything what required disk access took a while > and the hd-led was almost continuous on. I aborted the boot process via > magic keys after about 15 minutes. > > [1] http://download.breakpoint.cc/lnx-2.6.24-post-long-boot.jpg Thanks, I see now that there can be > 1 flush request queued at a given time. Please dump the old patch and try this one. [ Christoph: this may also fix your qemu/kvm+xfs problem. ] --- drivers/ide/ide-disk.c | 12 ++++++------ include/linux/blkdev.h | 5 +++++ 2 files changed, 11 insertions(+), 6 deletions(-) Index: b/drivers/ide/ide-disk.c =================================================================== --- a/drivers/ide/ide-disk.c +++ b/drivers/ide/ide-disk.c @@ -590,20 +590,20 @@ static ide_proc_entry_t idedisk_proc[] = static void idedisk_prepare_flush(struct request_queue *q, struct request *rq) { ide_drive_t *drive = q->queuedata; - ide_task_t task; + ide_task_t *task = (ide_task_t *)&rq->cmd[0]; memset(&task, 0, sizeof(task)); if (ide_id_has_flush_cache_ext(drive->id) && (drive->capacity64 >= (1UL << 28))) - task.tf.command = WIN_FLUSH_CACHE_EXT; + task->tf.command = WIN_FLUSH_CACHE_EXT; else - task.tf.command = WIN_FLUSH_CACHE; - task.tf_flags = IDE_TFLAG_OUT_TF | IDE_TFLAG_OUT_DEVICE; - task.data_phase = TASKFILE_NO_DATA; + task->tf.command = WIN_FLUSH_CACHE; + task->tf_flags = IDE_TFLAG_OUT_TF | IDE_TFLAG_OUT_DEVICE; + task->data_phase = TASKFILE_NO_DATA; rq->cmd_type = REQ_TYPE_ATA_TASKFILE; rq->cmd_flags |= REQ_SOFTBARRIER; - rq->special = &task; + rq->special = task; } /* Index: b/include/linux/blkdev.h =================================================================== --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -134,7 +134,12 @@ enum rq_flag_bits { #define REQ_ALLOCED (1 << __REQ_ALLOCED) #define REQ_RW_META (1 << __REQ_RW_META) +/* FIXME: temporary hack to make flush requests work */ +#ifdef CONFIG_IDE +#define BLK_MAX_CDB 48 /* max sizeof(ide_task_t) */ +#else #define BLK_MAX_CDB 16 +#endif /* * try to put the fields that are referenced together in the same cacheline. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Current git --> kaboom [bisect] seems IDE related. 2008-02-09 23:06 ` Bartlomiej Zolnierkiewicz @ 2008-02-10 5:26 ` Christoph Hellwig 2008-02-10 13:38 ` Bartlomiej Zolnierkiewicz 0 siblings, 1 reply; 27+ messages in thread From: Christoph Hellwig @ 2008-02-10 5:26 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, Christoph Hellwig, Jens Axboe On Sun, Feb 10, 2008 at 12:06:10AM +0100, Bartlomiej Zolnierkiewicz wrote: > > >Please try booting with "hdx=noflush" kernel parameter or please try > > >the attached patch which should fix the issue (if my theory is correct). "hda=noflush hdb=noflush hdd=noflush" fixes the qemu setup for me. > Thanks, I see now that there can be > 1 flush request queued at a given time. > > Please dump the old patch and try this one. > > [ Christoph: this may also fix your qemu/kvm+xfs problem. ] It doesn't hang anymore but gives me the following oops instead (that is after fixing the build as the bigger request->cmd breaks the scsi build): debian:~/xfs-cmds/xfstests# sh check [ 95.710144] SGI XFS with ACLs, security attributes, realtime, large block numbers, no debug enabled [ 95.742472] SGI XFS Quota Management subsystem [ 95.762573] BUG: unable to handle kernel NULL pointer dereference at 0000000d [ 95.763889] IP: [<c02e0132>] idedisk_prepare_flush+0x4a/0x75 [ 95.763889] *pdpt = 000000002d87c001 *pde = 0000000000000000 [ 95.763889] Oops: 0002 [#1] SMP [ 95.763889] Modules linked in: xfs binfmt_misc dm_snapshot dm_mirror i2c_piix4 i2c_core [ 95.763889] [ 95.763889] Pid: 2331, comm: mount Not tainted (2.6.24-09479-g531021f-dirty #43) [ 95.763889] EIP: 0060:[<c02e0132>] EFLAGS: 00010093 CPU: 0 [ 95.763889] EIP is at idedisk_prepare_flush+0x4a/0x75 [ 95.763889] EAX: 00000000 EBX: eec92458 ECX: c083e9f4 EDX: eec92458 [ 95.763889] ESI: eec92000 EDI: c025ab93 EBP: ed897bec ESP: ed897be8 [ 95.763889] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 95.763889] Process mount (pid: 2331, ti=ed896000 task=ed8a2000 task.ti=ed896000) [ 95.763889] Stack: 00000000 ed897c00 c025ab80 eec92000 eec9237c ed897c20 ed897c14 c025ad60 [ 95.763889] c083e9f4 eec92000 eec92000 ed897c30 c025783e c0107fb9 ed83d0e0 c083e9f4 [ 95.763889] c083e9f4 eec92000 ed897c90 c02d7467 ffffffff eecce2a0 c083e700 ed8a26f0 [ 95.763889] Call Trace: [ 95.763889] [<c025ab80>] ? queue_flush+0x5c/0x6f [ 95.763889] [<c025ad60>] ? blk_do_ordered+0xfb/0x223 [ 95.763889] [<c025783e>] ? elv_next_request+0x17b/0x1ac [ 95.763889] [<c0107fb9>] ? native_sched_clock+0x9a/0xaf [ 95.763889] [<c02d7467>] ? ide_do_request+0x243/0x829 [ 95.763889] [<c012a29c>] ? del_timer+0x4c/0x53 [ 95.763889] [<c03c546b>] ? _spin_unlock_irqrestore+0x2f/0x3c [ 95.763889] [<c012a29c>] ? del_timer+0x4c/0x53 [ 95.763889] [<c02d7d22>] ? do_ide_request+0x17/0x19 [ 95.763889] [<c025793c>] ? elv_insert+0xcd/0x1a3 [ 95.763889] [<c0258282>] ? drive_stat_acct+0x12c/0x140 [ 95.763889] [<c0257a98>] ? __elv_add_request+0x86/0x8b [ 95.763889] [<c0259f7c>] ? __make_request+0x2d9/0x31a [ 95.763889] [<c0104d3f>] ? dump_trace+0xcc/0xd8 [ 95.763889] [<c010a46e>] ? save_stack_address+0x0/0x28 [ 95.763889] [<c0258bda>] ? generic_make_request+0x360/0x38e [ 95.763889] [<c010a4e8>] ? save_stack_trace+0x1d/0x3b [ 95.763889] [<c015200e>] ? mempool_alloc_slab+0xe/0x10 [ 95.763889] [<c01520e6>] ? mempool_alloc+0x35/0xd0 [ 95.763889] [<c0259c9b>] ? submit_bio+0xc6/0xce [ 95.763889] [<c018ef05>] ? bio_add_page+0x2a/0x31 [ 95.763889] [<f0904841>] ? _xfs_buf_ioapply+0x1e6/0x20a [xfs] [ 95.763889] [<f09048a0>] ? xfs_buf_iorequest+0x3b/0x5e [xfs] [ 95.763889] [<f0907cf1>] ? xfsbdstrat+0x1e/0x2c [xfs] [ 95.763889] [<f090954e>] ? xfs_barrier_test+0x30/0x6a [xfs] [ 95.763889] [<f09095d8>] ? xfs_mountfs_check_barriers+0x50/0x7a [xfs] [ 95.763889] [<f08fbfee>] ? xfs_mount+0x249/0x2fb [xfs] [ 95.763889] [<f090a456>] ? xfs_fs_fill_super+0xb2/0x1c8 [xfs] [ 95.763889] [<c01718bd>] ? get_sb_bdev+0xe2/0x120 [ 95.763889] [<c016c613>] ? __kmalloc+0x38/0xae [ 95.763889] [<f0908dc2>] ? xfs_fs_get_sb+0x13/0x15 [xfs] [ 95.763889] [<f090a3a4>] ? xfs_fs_fill_super+0x0/0x1c8 [xfs] [ 95.763889] [<c01714cb>] ? vfs_kern_mount+0x3b/0x76 [ 95.763889] [<c017154a>] ? do_kern_mount+0x32/0xba [ 95.763889] [<c0183735>] ? do_new_mount+0x46/0x71 [ 95.763889] [<c01838c5>] ? do_mount+0x165/0x183 [ 95.763889] [<c0154755>] ? __get_free_pages+0x45/0x4c [ 95.763889] [<c0181bd3>] ? copy_mount_options+0x27/0x10b [ 95.763889] [<c018394b>] ? sys_mount+0x68/0xa4 [ 95.763889] [<c0103cae>] ? sysenter_past_esp+0x5f/0xa5 [ 95.763889] ======================= [ 95.763889] Code: 24 00 00 3d 00 24 00 00 75 1e 83 b9 8c 00 00 00 00 77 0c 81 b9 88 00 00 00 ff ff ff 0f 76 09 c6 05 0d 00 00 00 ea eb 07 8b 45 fc <c6> 40 0d e7 8b 45 fc c7 40 10 00 7e 00 00 8b 45 fc c7 40 14 00 [ 95.763889] EIP: [<c02e0132>] idedisk_prepare_flush+0x4a/0x75 SS:ESP 0068:ed897be8 [ 95.763889] ---[ end trace 3d3002e6db2539b3 ]--- ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Current git --> kaboom [bisect] seems IDE related. 2008-02-10 5:26 ` Christoph Hellwig @ 2008-02-10 13:38 ` Bartlomiej Zolnierkiewicz 2008-02-10 14:19 ` James Bottomley 2008-02-10 14:43 ` Christoph Hellwig 0 siblings, 2 replies; 27+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-02-10 13:38 UTC (permalink / raw) To: Christoph Hellwig Cc: Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, Jens Axboe, James Bottomley On Sunday 10 February 2008, Christoph Hellwig wrote: > On Sun, Feb 10, 2008 at 12:06:10AM +0100, Bartlomiej Zolnierkiewicz wrote: > > > >Please try booting with "hdx=noflush" kernel parameter or please try > > > >the attached patch which should fix the issue (if my theory is correct). > > "hda=noflush hdb=noflush hdd=noflush" fixes the qemu setup for me. Thanks for testing. > > Thanks, I see now that there can be > 1 flush request queued at a given time. > > > > Please dump the old patch and try this one. > > > > [ Christoph: this may also fix your qemu/kvm+xfs problem. ] > > It doesn't hang anymore but gives me the following oops instead (that is > after fixing the build as the bigger request->cmd breaks the scsi > build): [...] The OOPS is most likely (again) my fault - I was rushing out to push out the fix and memset() line didn't get converted. I prepared the new patch, documented it and started looking into SCSI build breakage... and I no longer feel comfortable with the hack :( It seems that fixing IDE properly will be easier than auditing the whole SCSI for all the weird assumptions on rq->cmd[] size (James?) so I'm back to the code, in the meantime here's the updated patch: From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Subject: [PATCH] ide-disk: fix flush requests commit 813a0eb233ee67d7166241a8b389b6a76f2247f9 Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Date: Fri Jan 25 22:17:10 2008 +0100 ide: switch idedisk_prepare_flush() to use REQ_TYPE_ATA_TASKFILE requests ... broke flush requests. Allocating IDE command structure on the stack for flush requests is not a very brilliant idea: - idedisk_prepare_flush() only prepares the request and it doesn't wait for it to be completed - there are can be multiple flush requests queued in the queue Fix it by temporarily increasing size of BLK_MAX_CDB to accomodate for ide_task_t structure if IDE subsystem is going to be used. [ Jens: This will be fixed properly before 2.6.25 but this bug is rather critical and the proper solution requires some more work + testing. ] Thanks to Sebastian Siewior for bisecting/reporting the problem. Cc: Sebastian Siewior <ide-bug@ml.breakpoint.cc> Cc: Christoph Hellwig <hch@infradead.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> --- drivers/ide/ide-disk.c | 14 +++++++------- include/linux/blkdev.h | 5 +++++ 2 files changed, 12 insertions(+), 7 deletions(-) Index: b/drivers/ide/ide-disk.c =================================================================== --- a/drivers/ide/ide-disk.c +++ b/drivers/ide/ide-disk.c @@ -590,20 +590,20 @@ static ide_proc_entry_t idedisk_proc[] = static void idedisk_prepare_flush(struct request_queue *q, struct request *rq) { ide_drive_t *drive = q->queuedata; - ide_task_t task; + ide_task_t *task = (ide_task_t *)&rq->cmd[0]; - memset(&task, 0, sizeof(task)); + memset(task, 0, sizeof(*task)); if (ide_id_has_flush_cache_ext(drive->id) && (drive->capacity64 >= (1UL << 28))) - task.tf.command = WIN_FLUSH_CACHE_EXT; + task->tf.command = WIN_FLUSH_CACHE_EXT; else - task.tf.command = WIN_FLUSH_CACHE; - task.tf_flags = IDE_TFLAG_OUT_TF | IDE_TFLAG_OUT_DEVICE; - task.data_phase = TASKFILE_NO_DATA; + task->tf.command = WIN_FLUSH_CACHE; + task->tf_flags = IDE_TFLAG_OUT_TF | IDE_TFLAG_OUT_DEVICE; + task->data_phase = TASKFILE_NO_DATA; rq->cmd_type = REQ_TYPE_ATA_TASKFILE; rq->cmd_flags |= REQ_SOFTBARRIER; - rq->special = &task; + rq->special = task; } /* Index: b/include/linux/blkdev.h =================================================================== --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -134,7 +134,12 @@ enum rq_flag_bits { #define REQ_ALLOCED (1 << __REQ_ALLOCED) #define REQ_RW_META (1 << __REQ_RW_META) +/* FIXME: temporary hack to make flush requests work */ +#ifdef CONFIG_IDE +#define BLK_MAX_CDB 48 /* max sizeof(ide_task_t) */ +#else #define BLK_MAX_CDB 16 +#endif /* * try to put the fields that are referenced together in the same cacheline. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Current git --> kaboom [bisect] seems IDE related. 2008-02-10 13:38 ` Bartlomiej Zolnierkiewicz @ 2008-02-10 14:19 ` James Bottomley 2008-02-10 18:32 ` Bartlomiej Zolnierkiewicz 2008-02-10 14:43 ` Christoph Hellwig 1 sibling, 1 reply; 27+ messages in thread From: James Bottomley @ 2008-02-10 14:19 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: Christoph Hellwig, Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, Jens Axboe On Sun, 2008-02-10 at 14:38 +0100, Bartlomiej Zolnierkiewicz wrote: > On Sunday 10 February 2008, Christoph Hellwig wrote: > > On Sun, Feb 10, 2008 at 12:06:10AM +0100, Bartlomiej Zolnierkiewicz wrote: > > > > >Please try booting with "hdx=noflush" kernel parameter or please try > > > > >the attached patch which should fix the issue (if my theory is correct). > > > > "hda=noflush hdb=noflush hdd=noflush" fixes the qemu setup for me. > > Thanks for testing. > > > > Thanks, I see now that there can be > 1 flush request queued at a given time. > > > > > > Please dump the old patch and try this one. > > > > > > [ Christoph: this may also fix your qemu/kvm+xfs problem. ] > > > > It doesn't hang anymore but gives me the following oops instead (that is > > after fixing the build as the bigger request->cmd breaks the scsi > > build): > > [...] > > The OOPS is most likely (again) my fault - I was rushing out to push out > the fix and memset() line didn't get converted. > > I prepared the new patch, documented it and started looking into SCSI > build breakage... and I no longer feel comfortable with the hack :( > > It seems that fixing IDE properly will be easier than auditing the whole > SCSI for all the weird assumptions on rq->cmd[] size (James?) so I'm back > to the code, in the meantime here's the updated patch: Doing something like this would have to be audited in SCSI ... we do assume sizeof(rq->cmd) == sizeof(scmd->cmnd) which will no longer be true. As long as sizeof(rq->cmd) is never used in SCSI code, it's probably safe. Although raising MAX_CDB by a factor of three has memory concerns as well, which aren't trivial and make this a bit too much of a hack. It's also incredibly fragile given that either ide_task_t could increase in size or someone could reduce MAX_CDB both with fatal consequences. Why not just use kmalloc(GFP_ATOMIC) instead? That will succeed 99% of the time and you can turn barriers off in a failure case. You'll have to free it in ide_end_drive_cmd(), but I think you've got (just) a spare tf_flag to mark a volatile task that needs kfree here. James ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Current git --> kaboom [bisect] seems IDE related. 2008-02-10 14:19 ` James Bottomley @ 2008-02-10 18:32 ` Bartlomiej Zolnierkiewicz 2008-02-10 19:51 ` Sebastian Siewior 2008-02-11 16:30 ` Sergei Shtylyov 0 siblings, 2 replies; 27+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-02-10 18:32 UTC (permalink / raw) To: James Bottomley Cc: Christoph Hellwig, Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, Jens Axboe On Sunday 10 February 2008, James Bottomley wrote: > On Sun, 2008-02-10 at 14:38 +0100, Bartlomiej Zolnierkiewicz wrote: > > On Sunday 10 February 2008, Christoph Hellwig wrote: > > > On Sun, Feb 10, 2008 at 12:06:10AM +0100, Bartlomiej Zolnierkiewicz wrote: > > > > > >Please try booting with "hdx=noflush" kernel parameter or please try > > > > > >the attached patch which should fix the issue (if my theory is correct). > > > > > > "hda=noflush hdb=noflush hdd=noflush" fixes the qemu setup for me. > > > > Thanks for testing. > > > > > > Thanks, I see now that there can be > 1 flush request queued at a given time. > > > > > > > > Please dump the old patch and try this one. > > > > > > > > [ Christoph: this may also fix your qemu/kvm+xfs problem. ] > > > > > > It doesn't hang anymore but gives me the following oops instead (that is > > > after fixing the build as the bigger request->cmd breaks the scsi > > > build): > > > > [...] > > > > The OOPS is most likely (again) my fault - I was rushing out to push out > > the fix and memset() line didn't get converted. > > > > I prepared the new patch, documented it and started looking into SCSI > > build breakage... and I no longer feel comfortable with the hack :( > > > > It seems that fixing IDE properly will be easier than auditing the whole > > SCSI for all the weird assumptions on rq->cmd[] size (James?) so I'm back > > to the code, in the meantime here's the updated patch: > > Doing something like this would have to be audited in SCSI ... we do > assume sizeof(rq->cmd) == sizeof(scmd->cmnd) which will no longer be > true. As long as sizeof(rq->cmd) is never used in SCSI code, it's > probably safe. > > Although raising MAX_CDB by a factor of three has memory concerns as > well, which aren't trivial and make this a bit too much of a hack. It's > also incredibly fragile given that either ide_task_t could increase in > size or someone could reduce MAX_CDB both with fatal consequences. > > Why not just use kmalloc(GFP_ATOMIC) instead? That will succeed 99% of > the time and you can turn barriers off in a failure case. You'll have It seems to be too late to turn barriers off as all of the above happens _inside_ prepare_flush_fn function. Nevertheless this is a much nicer workaround and it should be sufficent for the time being - thanks James! > to free it in ide_end_drive_cmd(), but I think you've got (just) a spare > tf_flag to mark a volatile task that needs kfree here. My precious last tf_flag... fortunately some other ones can be recycled... Sebastian/Christoph, please test the final patch (after your ACK I'll push it to Linus together with the rest of pending IDE fixes). From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Subject: [PATCH] ide-disk: fix flush requests (take 2) commit 813a0eb233ee67d7166241a8b389b6a76f2247f9 Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Date: Fri Jan 25 22:17:10 2008 +0100 ide: switch idedisk_prepare_flush() to use REQ_TYPE_ATA_TASKFILE requests ... broke flush requests. Allocating IDE command structure on the stack for flush requests is not a very brilliant idea: - idedisk_prepare_flush() only prepares the request and it doesn't wait for it to be completed - there are can be multiple flush requests queued in the queue Fix the problem (per hints from James Bottomley) by: - dynamically allocating ide_task_t instance using kmalloc(..., GFP_ATOMIC) - adding new taskfile flag (IDE_TFLAG_DYN) - calling kfree() in ide_end_drive_command() if IDE_TFLAG_DYN is set (while at it rename 'args' to 'task' and fix whitespace damage) [ This will be fixed properly before 2.6.25 but this bug is rather critical and the proper solution requires some more work + testing. ] Thanks to Sebastian Siewior and Christoph Hellwig for reporitng the problem and testing patches (extra thanks to Sebastian for bisecting it to the guilty commmit). Cc: Sebastian Siewior <ide-bug@ml.breakpoint.cc> Cc: Christoph Hellwig <hch@infradead.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> --- drivers/ide/ide-disk.c | 18 +++++++++++------- drivers/ide/ide-io.c | 16 ++++++++++------ include/linux/ide.h | 2 ++ 3 files changed, 23 insertions(+), 13 deletions(-) Index: b/drivers/ide/ide-disk.c =================================================================== --- a/drivers/ide/ide-disk.c +++ b/drivers/ide/ide-disk.c @@ -590,20 +590,24 @@ static ide_proc_entry_t idedisk_proc[] = static void idedisk_prepare_flush(struct request_queue *q, struct request *rq) { ide_drive_t *drive = q->queuedata; - ide_task_t task; + ide_task_t *task = kmalloc(sizeof(*task), GFP_ATOMIC); - memset(&task, 0, sizeof(task)); + /* FIXME: map struct ide_taskfile on rq->cmd[] */ + BUG_ON(task == NULL); + + memset(task, 0, sizeof(*task)); if (ide_id_has_flush_cache_ext(drive->id) && (drive->capacity64 >= (1UL << 28))) - task.tf.command = WIN_FLUSH_CACHE_EXT; + task->tf.command = WIN_FLUSH_CACHE_EXT; else - task.tf.command = WIN_FLUSH_CACHE; - task.tf_flags = IDE_TFLAG_OUT_TF | IDE_TFLAG_OUT_DEVICE; - task.data_phase = TASKFILE_NO_DATA; + task->tf.command = WIN_FLUSH_CACHE; + task->tf_flags = IDE_TFLAG_OUT_TF | IDE_TFLAG_OUT_DEVICE | + IDE_TFLAG_DYN; + task->data_phase = TASKFILE_NO_DATA; rq->cmd_type = REQ_TYPE_ATA_TASKFILE; rq->cmd_flags |= REQ_SOFTBARRIER; - rq->special = &task; + rq->special = task; } /* Index: b/drivers/ide/ide-io.c =================================================================== --- a/drivers/ide/ide-io.c +++ b/drivers/ide/ide-io.c @@ -361,17 +361,21 @@ void ide_end_drive_cmd (ide_drive_t *dri spin_unlock_irqrestore(&ide_lock, flags); if (rq->cmd_type == REQ_TYPE_ATA_TASKFILE) { - ide_task_t *args = (ide_task_t *) rq->special; + ide_task_t *task = (ide_task_t *)rq->special; + if (rq->errors == 0) - rq->errors = !OK_STAT(stat,READY_STAT,BAD_STAT); - - if (args) { - struct ide_taskfile *tf = &args->tf; + rq->errors = !OK_STAT(stat, READY_STAT, BAD_STAT); + + if (task) { + struct ide_taskfile *tf = &task->tf; tf->error = err; tf->status = stat; - ide_tf_read(drive, args); + ide_tf_read(drive, task); + + if (task->tf_flags & IDE_TFLAG_DYN) + kfree(task); } } else if (blk_pm_request(rq)) { struct request_pm_state *pm = rq->data; Index: b/include/linux/ide.h =================================================================== --- a/include/linux/ide.h +++ b/include/linux/ide.h @@ -906,6 +906,8 @@ enum { IDE_TFLAG_IN_DEVICE, /* force 16-bit I/O operations */ IDE_TFLAG_IO_16BIT = (1 << 30), + /* ide_task_t was allocated using kmalloc() */ + IDE_TFLAG_DYN = (1 << 31), }; struct ide_taskfile { ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Current git --> kaboom [bisect] seems IDE related. 2008-02-10 18:32 ` Bartlomiej Zolnierkiewicz @ 2008-02-10 19:51 ` Sebastian Siewior 2008-02-10 23:16 ` Bartlomiej Zolnierkiewicz 2008-02-11 16:30 ` Sergei Shtylyov 1 sibling, 1 reply; 27+ messages in thread From: Sebastian Siewior @ 2008-02-10 19:51 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: James Bottomley, Christoph Hellwig, Tejun Heo, Sergei Shtylyov, linux-ide, Jens Axboe * Bartlomiej Zolnierkiewicz | 2008-02-10 19:32:06 [+0100]: >Sebastian/Christoph, please test the final patch (after your ACK I'll push >it to Linus together with the rest of pending IDE fixes). This seems to work. Sebastian ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Current git --> kaboom [bisect] seems IDE related. 2008-02-10 19:51 ` Sebastian Siewior @ 2008-02-10 23:16 ` Bartlomiej Zolnierkiewicz 0 siblings, 0 replies; 27+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-02-10 23:16 UTC (permalink / raw) To: Sebastian Siewior Cc: James Bottomley, Christoph Hellwig, Tejun Heo, Sergei Shtylyov, linux-ide, Jens Axboe On Sunday 10 February 2008, Sebastian Siewior wrote: > * Bartlomiej Zolnierkiewicz | 2008-02-10 19:32:06 [+0100]: > > >Sebastian/Christoph, please test the final patch (after your ACK I'll push > >it to Linus together with the rest of pending IDE fixes). > This seems to work. Thanks. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Current git --> kaboom [bisect] seems IDE related. 2008-02-10 18:32 ` Bartlomiej Zolnierkiewicz 2008-02-10 19:51 ` Sebastian Siewior @ 2008-02-11 16:30 ` Sergei Shtylyov 2008-02-11 19:41 ` Bartlomiej Zolnierkiewicz 1 sibling, 1 reply; 27+ messages in thread From: Sergei Shtylyov @ 2008-02-11 16:30 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: James Bottomley, Christoph Hellwig, Sebastian Siewior, Tejun Heo, linux-ide, Jens Axboe Bartlomiej Zolnierkiewicz wrote: > From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> > Subject: [PATCH] ide-disk: fix flush requests (take 2) > commit 813a0eb233ee67d7166241a8b389b6a76f2247f9 > Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> > Date: Fri Jan 25 22:17:10 2008 +0100 > ide: switch idedisk_prepare_flush() to use REQ_TYPE_ATA_TASKFILE requests > ... > broke flush requests. > Allocating IDE command structure on the stack for flush requests is not > a very brilliant idea: > - idedisk_prepare_flush() only prepares the request and it doesn't wait > for it to be completed > - there are can be multiple flush requests queued in the queue > Fix the problem (per hints from James Bottomley) by: > - dynamically allocating ide_task_t instance using kmalloc(..., GFP_ATOMIC) > - adding new taskfile flag (IDE_TFLAG_DYN) > - calling kfree() in ide_end_drive_command() if IDE_TFLAG_DYN is set > (while at it rename 'args' to 'task' and fix whitespace damage) > [ This will be fixed properly before 2.6.25 but this bug is rather > critical and the proper solution requires some more work + testing. ] > Thanks to Sebastian Siewior and Christoph Hellwig for reporitng the > problem and testing patches (extra thanks to Sebastian for bisecting > it to the guilty commmit). > > Cc: Sebastian Siewior <ide-bug@ml.breakpoint.cc> > Cc: Christoph Hellwig <hch@infradead.org> > Cc: James Bottomley <James.Bottomley@HansenPartnership.com> > Cc: Jens Axboe <jens.axboe@oracle.com> > Cc: Tejun Heo <htejun@gmail.com> > Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> > Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> > Index: b/drivers/ide/ide-disk.c > =================================================================== > --- a/drivers/ide/ide-disk.c > +++ b/drivers/ide/ide-disk.c > @@ -590,20 +590,24 @@ static ide_proc_entry_t idedisk_proc[] = > static void idedisk_prepare_flush(struct request_queue *q, struct request *rq) > { > ide_drive_t *drive = q->queuedata; > - ide_task_t task; > + ide_task_t *task = kmalloc(sizeof(*task), GFP_ATOMIC); > > - memset(&task, 0, sizeof(task)); > + /* FIXME: map struct ide_taskfile on rq->cmd[] */ > + BUG_ON(task == NULL); > + > + memset(task, 0, sizeof(*task)); Why not kzalloc() then? MBR, Sergei ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Current git --> kaboom [bisect] seems IDE related. 2008-02-11 16:30 ` Sergei Shtylyov @ 2008-02-11 19:41 ` Bartlomiej Zolnierkiewicz 0 siblings, 0 replies; 27+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-02-11 19:41 UTC (permalink / raw) To: Sergei Shtylyov Cc: James Bottomley, Christoph Hellwig, Sebastian Siewior, Tejun Heo, linux-ide, Jens Axboe On Monday 11 February 2008, Sergei Shtylyov wrote: > Bartlomiej Zolnierkiewicz wrote: > > > From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> > > Subject: [PATCH] ide-disk: fix flush requests (take 2) > > > commit 813a0eb233ee67d7166241a8b389b6a76f2247f9 > > Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> > > Date: Fri Jan 25 22:17:10 2008 +0100 > > > ide: switch idedisk_prepare_flush() to use REQ_TYPE_ATA_TASKFILE requests > > > ... > > > broke flush requests. > > > Allocating IDE command structure on the stack for flush requests is not > > a very brilliant idea: > > > - idedisk_prepare_flush() only prepares the request and it doesn't wait > > for it to be completed > > > - there are can be multiple flush requests queued in the queue > > > Fix the problem (per hints from James Bottomley) by: > > - dynamically allocating ide_task_t instance using kmalloc(..., GFP_ATOMIC) > > - adding new taskfile flag (IDE_TFLAG_DYN) > > - calling kfree() in ide_end_drive_command() if IDE_TFLAG_DYN is set > > (while at it rename 'args' to 'task' and fix whitespace damage) > > > [ This will be fixed properly before 2.6.25 but this bug is rather > > critical and the proper solution requires some more work + testing. ] > > > Thanks to Sebastian Siewior and Christoph Hellwig for reporitng the > > problem and testing patches (extra thanks to Sebastian for bisecting > > it to the guilty commmit). > > > > Cc: Sebastian Siewior <ide-bug@ml.breakpoint.cc> > > Cc: Christoph Hellwig <hch@infradead.org> > > Cc: James Bottomley <James.Bottomley@HansenPartnership.com> > > Cc: Jens Axboe <jens.axboe@oracle.com> > > Cc: Tejun Heo <htejun@gmail.com> > > Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> > > Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> > > Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> > > > Index: b/drivers/ide/ide-disk.c > > =================================================================== > > --- a/drivers/ide/ide-disk.c > > +++ b/drivers/ide/ide-disk.c > > @@ -590,20 +590,24 @@ static ide_proc_entry_t idedisk_proc[] = > > static void idedisk_prepare_flush(struct request_queue *q, struct request *rq) > > { > > ide_drive_t *drive = q->queuedata; > > - ide_task_t task; > > + ide_task_t *task = kmalloc(sizeof(*task), GFP_ATOMIC); > > > > - memset(&task, 0, sizeof(task)); > > + /* FIXME: map struct ide_taskfile on rq->cmd[] */ > > + BUG_ON(task == NULL); > > + > > + memset(task, 0, sizeof(*task)); > > Why not kzalloc() then? Well, yes. However since the patch has been already merged upstream and it is only a temporary solution I don't think it is worth doing it now (OTOH if somebody sends me a patch I'll happily merge it). Thanks, Bart ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Current git --> kaboom [bisect] seems IDE related. 2008-02-10 13:38 ` Bartlomiej Zolnierkiewicz 2008-02-10 14:19 ` James Bottomley @ 2008-02-10 14:43 ` Christoph Hellwig 2008-02-10 15:07 ` Boaz Harrosh 1 sibling, 1 reply; 27+ messages in thread From: Christoph Hellwig @ 2008-02-10 14:43 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: Christoph Hellwig, Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, Jens Axboe, James Bottomley, linux-scsi On Sun, Feb 10, 2008 at 02:38:46PM +0100, Bartlomiej Zolnierkiewicz wrote: > The OOPS is most likely (again) my fault - I was rushing out to push out > the fix and memset() line didn't get converted. The new patch works fine for me. > I prepared the new patch, documented it and started looking into SCSI > build breakage... and I no longer feel comfortable with the hack :( > > It seems that fixing IDE properly will be easier than auditing the whole > SCSI for all the weird assumptions on rq->cmd[] size (James?) so I'm back > to the code, in the meantime here's the updated patch: Yeah, this is quite nasty. I'll attach the patch below which just rejects a command in scsi_setup_blk_pc_cmnd if it's too large for the scsi_cmnd cmnd array. This is probably enough but I haven't audited all of the scsi code yet. But as James said this is too much of a memory vastage to put it into the tree. Long-term the Panasas folks have looked into killing the scsi_cmnd.cmnd filed entirely and make the struct request.cmd field dynamically sized which would solve your problem, but probably won't be ready for 2.6.25. Index: linux-2.6/drivers/scsi/scsi_lib.c =================================================================== --- linux-2.6.orig/drivers/scsi/scsi_lib.c 2008-02-10 07:49:50.000000000 +0100 +++ linux-2.6/drivers/scsi/scsi_lib.c 2008-02-10 15:19:42.000000000 +0100 @@ -1129,7 +1129,12 @@ int scsi_setup_blk_pc_cmnd(struct scsi_d req->buffer = NULL; } - BUILD_BUG_ON(sizeof(req->cmd) > sizeof(cmd->cmnd)); + if (req->cmd_len > sizeof(cmd->cmnd)) { + scsi_release_buffers(cmd); + scsi_put_command(cmd); + return BLKPREP_KILL; + } + memcpy(cmd->cmnd, req->cmd, sizeof(cmd->cmnd)); cmd->cmd_len = req->cmd_len; if (!req->data_len) ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Current git --> kaboom [bisect] seems IDE related. 2008-02-10 14:43 ` Christoph Hellwig @ 2008-02-10 15:07 ` Boaz Harrosh 2008-02-10 18:59 ` [PATCHSET 0/3] varlen extended and vendor-specific cdbs Boaz Harrosh 0 siblings, 1 reply; 27+ messages in thread From: Boaz Harrosh @ 2008-02-10 15:07 UTC (permalink / raw) To: Christoph Hellwig Cc: Bartlomiej Zolnierkiewicz, Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, Jens Axboe, James Bottomley, linux-scsi On Sun, Feb 10 2008 at 16:43 +0200, Christoph Hellwig <hch@infradead.org> wrote: > On Sun, Feb 10, 2008 at 02:38:46PM +0100, Bartlomiej Zolnierkiewicz wrote: >> The OOPS is most likely (again) my fault - I was rushing out to push out >> the fix and memset() line didn't get converted. > > The new patch works fine for me. > >> I prepared the new patch, documented it and started looking into SCSI >> build breakage... and I no longer feel comfortable with the hack :( >> >> It seems that fixing IDE properly will be easier than auditing the whole >> SCSI for all the weird assumptions on rq->cmd[] size (James?) so I'm back >> to the code, in the meantime here's the updated patch: > > Yeah, this is quite nasty. I'll attach the patch below which just > rejects a command in scsi_setup_blk_pc_cmnd if it's too large for > the scsi_cmnd cmnd array. This is probably enough but I haven't > audited all of the scsi code yet. But as James said this is > too much of a memory vastage to put it into the tree. > > Long-term the Panasas folks have looked into killing the scsi_cmnd.cmnd > filed entirely and make the struct request.cmd field dynamically sized > which would solve your problem, but probably won't be ready for 2.6.25. > > <snip> As far as I'm concerned it is very ready, and I have sent a last version for inclusion into 2.6.25. - There is a very minor patch-ability problem between last patchset and scsi-misc I will resend the pachset as a reply to this mail. - Since I never got any comments from Jens or James, this code was never accepted into -mm. So it was not widely tested. Though I have thrown every test I can on these patches. But that is still, a very limited testing. If people have a bit of spare time, please review. For some of us it is very important Thanks Boaz ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCHSET 0/3] varlen extended and vendor-specific cdbs 2008-02-10 15:07 ` Boaz Harrosh @ 2008-02-10 18:59 ` Boaz Harrosh 2008-02-10 19:05 ` Subject: [PATCH 1/3] Let scsi_cmnd->cmnd use request->cmd buffer Boaz Harrosh ` (2 more replies) 0 siblings, 3 replies; 27+ messages in thread From: Boaz Harrosh @ 2008-02-10 18:59 UTC (permalink / raw) To: Christoph Hellwig, Bartlomiej Zolnierkiewicz, Jens Axboe, James Bottomley Cc: Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, linux-scsi Submitted is a patchset for adding support for variable-length, extended, and vendor specific CDBs. It should now cover the entire range of the SCSI standard. (and/or any other use of command packets in block devices) They are based on scsi-misc. Difference from last time, is at struct request. I did a smallish hack to save the extra pointer space. The pointer now shares it's space with the request->cmd, as they are mutual exclusive. The flag to switch between them is when cmd_len == 0 and varlen_cdb_len != 0 I've added 2 accessors to hide the mess. I think this approach should be safe. I can easily go back to the separate pointer, but I figured a little hack is better then a bloat. This is on top of the size shrink to struct scsi_cmnd gained in the first patch. We save upto 12 bytes on 32 bit ARCHs So over all, this cleans things up, and add fixtures without any extra cost. [1/3] Let scsi_cmnd->cmnd use request->cmd buffer Here I let scsi_cmnd->cmnd point to the space allocated by request->cmd, instead of copying the data. The scsi_cmnd->cmd_len is guaranteed to contain the right length of the command. I have tried to go over every single place in the kernel that uses scsi_cmnd->cmnd and make sure it looks sane. Surprisingly to me, that was not at all bad. I hope I did not miss anything. I've tested on an x86_64 machine booting from a sata disk and ran the iscsi regression tests as well as my bidi and varlen tests on top of the complete patchset and all tests passed. [2/3] block layer varlen-cd Here I added an option to use a user supplied buffer for an arbitrary large command. Buffer must be kept valid until execution of request ends. The pointer to the buffer shares it's space with the fixed length command, so the size of struct request does not change. [3/3] scsi: varlen extended and vendor-specific cdbs Adds support for variable length, extended, and vendor-specific cdbs at the scsi mid-layer. Bart: If you need this infrastructure see second patch for an easy to use inline accessors on struct request. If you then need to use them in a scsi LLD, thats easy same as before only cmd_len will be bigger then 16. If you need to use them in a block LLD. You need to use the accessors and an extra if() statement. See 3rd patch on how scsi_lib.c uses it. Boaz ^ permalink raw reply [flat|nested] 27+ messages in thread
* Subject: [PATCH 1/3] Let scsi_cmnd->cmnd use request->cmd buffer 2008-02-10 18:59 ` [PATCHSET 0/3] varlen extended and vendor-specific cdbs Boaz Harrosh @ 2008-02-10 19:05 ` Boaz Harrosh 2008-02-12 17:45 ` Christoph Hellwig 2008-02-12 19:41 ` James Bottomley 2008-02-10 19:09 ` [PATCH 2/3] block layer varlen-cdb Boaz Harrosh 2008-02-10 19:12 ` [PATCH 3/3] scsi: varlen extended and vendor-specific cdbs Boaz Harrosh 2 siblings, 2 replies; 27+ messages in thread From: Boaz Harrosh @ 2008-02-10 19:05 UTC (permalink / raw) To: Christoph Hellwig, Bartlomiej Zolnierkiewicz, Jens Axboe, James Bottomley Cc: Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, linux-scsi - struct scsi_cmnd had a 16 bytes command buffer of its own. This is an unnecessary duplication and copy of request's cmd. It is probably left overs from the time that scsi_cmnd could function without a request attached. So clean that up. - Once above is done, few places, apart from scsi-ml, needed adjustments due to changing the data type of scsi_cmnd->cmnd. - Lots of drivers still use MAX_COMMAND_SIZE. So I have left that #define but equate it to BLK_MAX_CDB. The way I see it and is reflected in the patch below is. MAX_COMMAND_SIZE - means: The longest fixed-length (*) SCSI CDB as per the SCSI standard and is not related to the implementation. BLK_MAX_CDB. - The allocated space at the request level (*)fixed-length here means commands that their size can be determined by their opcode and the CDB does not carry a length specifier, like the VARIABLE_LENGTH_CMD(0x7f) command. This is actually not exactly true and the SCSI standard also defines extended commands and vendor specific commands that can be bigger than 16 bytes. The kernel will support these using the same infrastructure used for VARLEN CDB's. So in effect MAX_COMMAND_SIZE means the maximum size command scsi-ml supports without specifying a cmd_len by ULD's Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> --- drivers/firewire/fw-sbp2.c | 2 +- drivers/s390/scsi/zfcp_dbf.c | 2 +- drivers/s390/scsi/zfcp_fsf.c | 2 +- drivers/scsi/53c700.c | 6 +++--- drivers/scsi/a100u2w.c | 2 +- drivers/scsi/hptiop.c | 6 +++--- drivers/scsi/ibmvscsi/ibmvscsi.c | 2 +- drivers/scsi/initio.c | 2 +- drivers/scsi/qla1280.c | 4 ++-- drivers/scsi/scsi_error.c | 14 ++++++++------ drivers/scsi/scsi_lib.c | 5 +++-- drivers/scsi/scsi_tgt_lib.c | 2 ++ drivers/usb/storage/isd200.c | 2 ++ include/scsi/scsi_cmnd.h | 9 +++++++-- include/scsi/scsi_eh.h | 4 ++-- 15 files changed, 38 insertions(+), 26 deletions(-) diff --git a/drivers/firewire/fw-sbp2.c b/drivers/firewire/fw-sbp2.c index 19ece9b..1d9602b 100644 --- a/drivers/firewire/fw-sbp2.c +++ b/drivers/firewire/fw-sbp2.c @@ -1254,7 +1254,7 @@ static int sbp2_scsi_queuecommand(struct scsi_cmnd *cmd, scsi_done_fn_t done) memset(orb->request.command_block, 0, sizeof(orb->request.command_block)); - memcpy(orb->request.command_block, cmd->cmnd, COMMAND_SIZE(*cmd->cmnd)); + memcpy(orb->request.command_block, cmd->cmnd, cmd->cmd_len); orb->base.callback = complete_command_orb; orb->base.request_bus = diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c index 701046c..7a7f619 100644 --- a/drivers/s390/scsi/zfcp_dbf.c +++ b/drivers/s390/scsi/zfcp_dbf.c @@ -724,7 +724,7 @@ _zfcp_scsi_dbf_event_common(const char *tag, const char *tag2, int level, rec->scsi_result = scsi_cmnd->result; rec->scsi_cmnd = (unsigned long)scsi_cmnd; rec->scsi_serial = scsi_cmnd->serial_number; - memcpy(rec->scsi_opcode, &scsi_cmnd->cmnd, + memcpy(rec->scsi_opcode, scsi_cmnd->cmnd, min((int)scsi_cmnd->cmd_len, ZFCP_DBF_SCSI_OPCODE)); rec->scsi_retries = scsi_cmnd->retries; diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c index 0dff058..1abbac5 100644 --- a/drivers/s390/scsi/zfcp_fsf.c +++ b/drivers/s390/scsi/zfcp_fsf.c @@ -4220,7 +4220,7 @@ zfcp_fsf_send_fcp_command_task_handler(struct zfcp_fsf_req *fsf_req) ZFCP_LOG_TRACE("scpnt->result =0x%x, command was:\n", scpnt->result); ZFCP_HEX_DUMP(ZFCP_LOG_LEVEL_TRACE, - (void *) &scpnt->cmnd, scpnt->cmd_len); + scpnt->cmnd, scpnt->cmd_len); ZFCP_LOG_TRACE("%i bytes sense data provided by FCP\n", fcp_rsp_iu->fcp_sns_len); diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c index f4c4fe9..f5a9add 100644 --- a/drivers/scsi/53c700.c +++ b/drivers/scsi/53c700.c @@ -599,7 +599,7 @@ NCR_700_scsi_done(struct NCR_700_Host_Parameters *hostdata, (struct NCR_700_command_slot *)SCp->host_scribble; dma_unmap_single(hostdata->dev, slot->pCmd, - sizeof(SCp->cmnd), DMA_TO_DEVICE); + MAX_COMMAND_SIZE, DMA_TO_DEVICE); if (slot->flags == NCR_700_FLAG_AUTOSENSE) { char *cmnd = NCR_700_get_sense_cmnd(SCp->device); #ifdef NCR_700_DEBUG @@ -1004,7 +1004,7 @@ process_script_interrupt(__u32 dsps, __u32 dsp, struct scsi_cmnd *SCp, * here */ NCR_700_unmap(hostdata, SCp, slot); dma_unmap_single(hostdata->dev, slot->pCmd, - sizeof(SCp->cmnd), + MAX_COMMAND_SIZE, DMA_TO_DEVICE); cmnd[0] = REQUEST_SENSE; @@ -1901,7 +1901,7 @@ NCR_700_queuecommand(struct scsi_cmnd *SCp, void (*done)(struct scsi_cmnd *)) } slot->resume_offset = 0; slot->pCmd = dma_map_single(hostdata->dev, SCp->cmnd, - sizeof(SCp->cmnd), DMA_TO_DEVICE); + MAX_COMMAND_SIZE, DMA_TO_DEVICE); NCR_700_start_command(SCp); return 0; } diff --git a/drivers/scsi/a100u2w.c b/drivers/scsi/a100u2w.c index f608d4a..6335637 100644 --- a/drivers/scsi/a100u2w.c +++ b/drivers/scsi/a100u2w.c @@ -894,7 +894,7 @@ static void inia100_build_scb(struct orc_host * host, struct orc_scb * scb, stru } else { scb->tag_msg = 0; /* No tag support */ } - memcpy(&scb->cdb[0], &cmd->cmnd, scb->cdb_len); + memcpy(scb->cdb, cmd->cmnd, scb->cdb_len); } /** diff --git a/drivers/scsi/hptiop.c b/drivers/scsi/hptiop.c index ff149ad..514f05b 100644 --- a/drivers/scsi/hptiop.c +++ b/drivers/scsi/hptiop.c @@ -761,9 +761,9 @@ static int hptiop_queuecommand(struct scsi_cmnd *scp, scp, host->host_no, scp->device->channel, scp->device->id, scp->device->lun, - *((u32 *)&scp->cmnd), - *((u32 *)&scp->cmnd + 1), - *((u32 *)&scp->cmnd + 2), + ((u32 *)scp->cmnd)[0], + ((u32 *)scp->cmnd)[1], + ((u32 *)scp->cmnd)[2], _req->index, _req->req_virt); scp->result = 0; diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c index 78d46a9..df9b865 100644 --- a/drivers/scsi/ibmvscsi/ibmvscsi.c +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c @@ -738,7 +738,7 @@ static int ibmvscsi_queuecommand(struct scsi_cmnd *cmnd, srp_cmd = &evt_struct->iu.srp.cmd; memset(srp_cmd, 0x00, SRP_MAX_IU_LEN); srp_cmd->opcode = SRP_CMD; - memcpy(srp_cmd->cdb, cmnd->cmnd, sizeof(cmnd->cmnd)); + memcpy(srp_cmd->cdb, cmnd->cmnd, sizeof(srp_cmd->cdb)); srp_cmd->lun = ((u64) lun) << 48; if (!map_data_for_srp_cmd(cmnd, evt_struct, srp_cmd, hostdata->dev)) { diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c index 0cc8868..40e9875 100644 --- a/drivers/scsi/initio.c +++ b/drivers/scsi/initio.c @@ -2590,7 +2590,7 @@ static void initio_build_scb(struct initio_host * host, struct scsi_ctrl_blk * c cblk->hastat = 0; cblk->tastat = 0; /* Command the command */ - memcpy(&cblk->cdb[0], &cmnd->cmnd, cmnd->cmd_len); + memcpy(cblk->cdb, cmnd->cmnd, cmnd->cmd_len); /* Set up tags */ if (cmnd->device->tagged_supported) { /* Tag Support */ diff --git a/drivers/scsi/qla1280.c b/drivers/scsi/qla1280.c index 68c0d09..b53420b 100644 --- a/drivers/scsi/qla1280.c +++ b/drivers/scsi/qla1280.c @@ -2863,7 +2863,7 @@ qla1280_64bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp) /* Load SCSI command packet. */ pkt->cdb_len = cpu_to_le16(CMD_CDBLEN(cmd)); - memcpy(pkt->scsi_cdb, &(CMD_CDBP(cmd)), CMD_CDBLEN(cmd)); + memcpy(pkt->scsi_cdb, CMD_CDBP(cmd), CMD_CDBLEN(cmd)); /* dprintk(1, "Build packet for command[0]=0x%x\n",pkt->scsi_cdb[0]); */ /* Set transfer direction. */ @@ -3132,7 +3132,7 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp) /* Load SCSI command packet. */ pkt->cdb_len = cpu_to_le16(CMD_CDBLEN(cmd)); - memcpy(pkt->scsi_cdb, &(CMD_CDBP(cmd)), CMD_CDBLEN(cmd)); + memcpy(pkt->scsi_cdb, CMD_CDBP(cmd), CMD_CDBLEN(cmd)); /*dprintk(1, "Build packet for command[0]=0x%x\n",pkt->scsi_cdb[0]); */ /* Set transfer direction. */ diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 045a086..98696ae 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -593,7 +593,7 @@ static void scsi_abort_eh_cmnd(struct scsi_cmnd *scmd) * @scmd: SCSI command structure to hijack * @ses: structure to save restore information * @cmnd: CDB to send. Can be NULL if no new cmnd is needed - * @cmnd_size: size in bytes of @cmnd + * @cmnd_size: size in bytes of @cmnd (must be <= BLK_MAX_CDB) * @sense_bytes: size of sense data to copy. or 0 (if != 0 @cmnd is ignored) * * This function is used to save a scsi command information before re-execution @@ -615,12 +615,14 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct scsi_eh_save *ses, * command. */ ses->cmd_len = scmd->cmd_len; - memcpy(ses->cmnd, scmd->cmnd, sizeof(scmd->cmnd)); + ses->cmnd = scmd->cmnd; ses->data_direction = scmd->sc_data_direction; ses->sdb = scmd->sdb; ses->next_rq = scmd->request->next_rq; ses->result = scmd->result; + scmd->cmnd = ses->eh_cmnd; + memset(scmd->cmnd, 0, BLK_MAX_CDB); memset(&scmd->sdb, 0, sizeof(scmd->sdb)); scmd->request->next_rq = NULL; @@ -632,14 +634,13 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct scsi_eh_save *ses, scmd->sdb.table.sgl = &ses->sense_sgl; scmd->sc_data_direction = DMA_FROM_DEVICE; scmd->sdb.table.nents = 1; - memset(scmd->cmnd, 0, sizeof(scmd->cmnd)); scmd->cmnd[0] = REQUEST_SENSE; scmd->cmnd[4] = scmd->sdb.length; scmd->cmd_len = COMMAND_SIZE(scmd->cmnd[0]); } else { scmd->sc_data_direction = DMA_NONE; if (cmnd) { - memset(scmd->cmnd, 0, sizeof(scmd->cmnd)); + BUG_ON(cmnd_size > BLK_MAX_CDB); memcpy(scmd->cmnd, cmnd, cmnd_size); scmd->cmd_len = COMMAND_SIZE(scmd->cmnd[0]); } @@ -672,7 +673,7 @@ void scsi_eh_restore_cmnd(struct scsi_cmnd* scmd, struct scsi_eh_save *ses) * Restore original data */ scmd->cmd_len = ses->cmd_len; - memcpy(scmd->cmnd, ses->cmnd, sizeof(scmd->cmnd)); + scmd->cmnd = ses->cmnd; scmd->sc_data_direction = ses->data_direction; scmd->sdb = ses->sdb; scmd->request->next_rq = ses->next_rq; @@ -1693,7 +1694,8 @@ scsi_reset_provider(struct scsi_device *dev, int flag) scmd->request = &req; memset(&scmd->eh_timeout, 0, sizeof(scmd->eh_timeout)); - memset(&scmd->cmnd, '\0', sizeof(scmd->cmnd)); + scmd->cmnd = req.cmd; + memset(scmd->cmnd, 0, BLK_MAX_CDB); scmd->scsi_done = scsi_reset_provider_done_command; memset(&scmd->sdb, 0, sizeof(scmd->sdb)); diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 135c1d0..426aa11 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1092,6 +1092,8 @@ static struct scsi_cmnd *scsi_get_cmd_from_req(struct scsi_device *sdev, cmd->tag = req->tag; cmd->request = req; + cmd->cmnd = req->cmd; + return cmd; } @@ -1129,8 +1131,6 @@ int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req) req->buffer = NULL; } - BUILD_BUG_ON(sizeof(req->cmd) > sizeof(cmd->cmnd)); - memcpy(cmd->cmnd, req->cmd, sizeof(cmd->cmnd)); cmd->cmd_len = req->cmd_len; if (!req->data_len) cmd->sc_data_direction = DMA_NONE; @@ -1167,6 +1167,7 @@ int scsi_setup_fs_cmnd(struct scsi_device *sdev, struct request *req) if (unlikely(!cmd)) return BLKPREP_DEFER; + memset(cmd->cmnd, 0, BLK_MAX_CDB); return scsi_init_io(cmd, GFP_ATOMIC); } EXPORT_SYMBOL(scsi_setup_fs_cmnd); diff --git a/drivers/scsi/scsi_tgt_lib.c b/drivers/scsi/scsi_tgt_lib.c index 3677fbb..72102d5 100644 --- a/drivers/scsi/scsi_tgt_lib.c +++ b/drivers/scsi/scsi_tgt_lib.c @@ -108,6 +108,8 @@ struct scsi_cmnd *scsi_host_get_command(struct Scsi_Host *shost, cmd->jiffies_at_alloc = jiffies; cmd->request = rq; + cmd->cmnd = rq->cmd; + rq->special = cmd; rq->cmd_type = REQ_TYPE_SPECIAL; rq->cmd_flags |= REQ_TYPE_BLOCK_PC; diff --git a/drivers/usb/storage/isd200.c b/drivers/usb/storage/isd200.c index 2ae1e86..8a761b6 100644 --- a/drivers/usb/storage/isd200.c +++ b/drivers/usb/storage/isd200.c @@ -292,6 +292,7 @@ struct isd200_info { /* maximum number of LUNs supported */ unsigned char MaxLUNs; + unsigned char cmnd[BLK_MAX_CDB]; struct scsi_cmnd srb; struct scatterlist sg; }; @@ -450,6 +451,7 @@ static int isd200_action( struct us_data *us, int action, memset(&ata, 0, sizeof(ata)); memset(&srb_dev, 0, sizeof(srb_dev)); + srb->cmnd = info->cmnd; srb->device = &srb_dev; ++srb->serial_number; diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h index de28aab..5c21ddc 100644 --- a/include/scsi/scsi_cmnd.h +++ b/include/scsi/scsi_cmnd.h @@ -7,10 +7,16 @@ #include <linux/types.h> #include <linux/timer.h> #include <linux/scatterlist.h> +#include <linux/blkdev.h> struct Scsi_Host; struct scsi_device; +#define MAX_COMMAND_SIZE 16 +#if (MAX_COMMAND_SIZE > BLK_MAX_CDB) +# error MAX_COMMAND_SIZE can not be smaller than BLK_MAX_CDB +#endif + struct scsi_data_buffer { struct sg_table table; unsigned length; @@ -64,8 +70,7 @@ struct scsi_cmnd { enum dma_data_direction sc_data_direction; /* These elements define the operation we are about to perform */ -#define MAX_COMMAND_SIZE 16 - unsigned char cmnd[MAX_COMMAND_SIZE]; + unsigned char *cmnd; struct timer_list eh_timeout; /* Used to time out the command. */ diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h index 25071d5..9438ea1 100644 --- a/include/scsi/scsi_eh.h +++ b/include/scsi/scsi_eh.h @@ -72,11 +72,11 @@ struct scsi_eh_save { int result; enum dma_data_direction data_direction; unsigned char cmd_len; - unsigned char cmnd[MAX_COMMAND_SIZE]; + unsigned char *cmnd; struct scsi_data_buffer sdb; struct request *next_rq; - /* new command support */ + unsigned char eh_cmnd[BLK_MAX_CDB]; struct scatterlist sense_sgl; }; -- 1.5.3.3 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: Subject: [PATCH 1/3] Let scsi_cmnd->cmnd use request->cmd buffer 2008-02-10 19:05 ` Subject: [PATCH 1/3] Let scsi_cmnd->cmnd use request->cmd buffer Boaz Harrosh @ 2008-02-12 17:45 ` Christoph Hellwig 2008-02-12 18:10 ` Boaz Harrosh 2008-02-12 19:41 ` James Bottomley 1 sibling, 1 reply; 27+ messages in thread From: Christoph Hellwig @ 2008-02-12 17:45 UTC (permalink / raw) To: Boaz Harrosh Cc: Christoph Hellwig, Bartlomiej Zolnierkiewicz, Jens Axboe, James Bottomley, Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, linux-scsi On Sun, Feb 10, 2008 at 09:05:17PM +0200, Boaz Harrosh wrote: > - Lots of drivers still use MAX_COMMAND_SIZE. So I have left > that #define but equate it to BLK_MAX_CDB. The way I see it > and is reflected in the patch below is. > MAX_COMMAND_SIZE - means: The longest fixed-length (*) SCSI CDB > as per the SCSI standard and is not related > to the implementation. > BLK_MAX_CDB. - The allocated space at the request level > > (*)fixed-length here means commands that their size can be determined > by their opcode and the CDB does not carry a length specifier, like > the VARIABLE_LENGTH_CMD(0x7f) command. This is actually not exactly > true and the SCSI standard also defines extended commands and > vendor specific commands that can be bigger than 16 bytes. The kernel > will support these using the same infrastructure used for VARLEN CDB's. > So in effect MAX_COMMAND_SIZE means the maximum size command > scsi-ml supports without specifying a cmd_len by ULD's A comment like this should be near the declaration of MAX_COMMAND_SIZE > +#define MAX_COMMAND_SIZE 16 > +#if (MAX_COMMAND_SIZE > BLK_MAX_CDB) > +# error MAX_COMMAND_SIZE can not be smaller than BLK_MAX_CDB > +#endif No tabs between the # and the rest of the cpp command, please. Either nothing or a single space as indentation instead. Except for those two small nitpicks this looks very good to me. Nice memory saving aswel. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Subject: [PATCH 1/3] Let scsi_cmnd->cmnd use request->cmd buffer 2008-02-12 17:45 ` Christoph Hellwig @ 2008-02-12 18:10 ` Boaz Harrosh 0 siblings, 0 replies; 27+ messages in thread From: Boaz Harrosh @ 2008-02-12 18:10 UTC (permalink / raw) To: Christoph Hellwig Cc: Bartlomiej Zolnierkiewicz, Jens Axboe, James Bottomley, Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, linux-scsi On Tue, Feb 12 2008 at 19:45 +0200, Christoph Hellwig <hch@infradead.org> wrote: > On Sun, Feb 10, 2008 at 09:05:17PM +0200, Boaz Harrosh wrote: >> - Lots of drivers still use MAX_COMMAND_SIZE. So I have left >> that #define but equate it to BLK_MAX_CDB. The way I see it >> and is reflected in the patch below is. >> MAX_COMMAND_SIZE - means: The longest fixed-length (*) SCSI CDB >> as per the SCSI standard and is not related >> to the implementation. >> BLK_MAX_CDB. - The allocated space at the request level >> >> (*)fixed-length here means commands that their size can be determined >> by their opcode and the CDB does not carry a length specifier, like >> the VARIABLE_LENGTH_CMD(0x7f) command. This is actually not exactly >> true and the SCSI standard also defines extended commands and >> vendor specific commands that can be bigger than 16 bytes. The kernel >> will support these using the same infrastructure used for VARLEN CDB's. >> So in effect MAX_COMMAND_SIZE means the maximum size command >> scsi-ml supports without specifying a cmd_len by ULD's > > A comment like this should be near the declaration of MAX_COMMAND_SIZE > >> +#define MAX_COMMAND_SIZE 16 >> +#if (MAX_COMMAND_SIZE > BLK_MAX_CDB) >> +# error MAX_COMMAND_SIZE can not be smaller than BLK_MAX_CDB >> +#endif > > No tabs between the # and the rest of the cpp command, please. Either > nothing or a single space as indentation instead. > > Except for those two small nitpicks this looks very good to me. Nice > memory saving aswel. Agree with both comments. Thanks for the review, will fix in the next submission. Boaz ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Subject: [PATCH 1/3] Let scsi_cmnd->cmnd use request->cmd buffer 2008-02-10 19:05 ` Subject: [PATCH 1/3] Let scsi_cmnd->cmnd use request->cmd buffer Boaz Harrosh 2008-02-12 17:45 ` Christoph Hellwig @ 2008-02-12 19:41 ` James Bottomley 2008-02-13 9:24 ` Boaz Harrosh 1 sibling, 1 reply; 27+ messages in thread From: James Bottomley @ 2008-02-12 19:41 UTC (permalink / raw) To: Boaz Harrosh Cc: Christoph Hellwig, Bartlomiej Zolnierkiewicz, Jens Axboe, Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, linux-scsi On Sun, 2008-02-10 at 21:05 +0200, Boaz Harrosh wrote: > - struct scsi_cmnd had a 16 bytes command buffer of its own. > This is an unnecessary duplication and copy of request's > cmd. It is probably left overs from the time that scsi_cmnd > could function without a request attached. So clean that up. > > - Once above is done, few places, apart from scsi-ml, needed > adjustments due to changing the data type of scsi_cmnd->cmnd. > > - Lots of drivers still use MAX_COMMAND_SIZE. So I have left > that #define but equate it to BLK_MAX_CDB. The way I see it > and is reflected in the patch below is. > MAX_COMMAND_SIZE - means: The longest fixed-length (*) SCSI CDB > as per the SCSI standard and is not related > to the implementation. > BLK_MAX_CDB. - The allocated space at the request level > > (*)fixed-length here means commands that their size can be determined > by their opcode and the CDB does not carry a length specifier, like > the VARIABLE_LENGTH_CMD(0x7f) command. This is actually not exactly > true and the SCSI standard also defines extended commands and > vendor specific commands that can be bigger than 16 bytes. The kernel > will support these using the same infrastructure used for VARLEN CDB's. > So in effect MAX_COMMAND_SIZE means the maximum size command > scsi-ml supports without specifying a cmd_len by ULD's When we do this, what happens to the minority of drivers that need the command in DMAable memory ... or have you audited them all and we can now dump the DMA pool allocation for SCSI commands? James ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Subject: [PATCH 1/3] Let scsi_cmnd->cmnd use request->cmd buffer 2008-02-12 19:41 ` James Bottomley @ 2008-02-13 9:24 ` Boaz Harrosh 0 siblings, 0 replies; 27+ messages in thread From: Boaz Harrosh @ 2008-02-13 9:24 UTC (permalink / raw) To: James Bottomley Cc: Christoph Hellwig, Bartlomiej Zolnierkiewicz, Jens Axboe, Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, linux-scsi On Tue, Feb 12 2008 at 21:41 +0200, James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > On Sun, 2008-02-10 at 21:05 +0200, Boaz Harrosh wrote: >> - struct scsi_cmnd had a 16 bytes command buffer of its own. >> This is an unnecessary duplication and copy of request's >> cmd. It is probably left overs from the time that scsi_cmnd >> could function without a request attached. So clean that up. >> >> - Once above is done, few places, apart from scsi-ml, needed >> adjustments due to changing the data type of scsi_cmnd->cmnd. >> >> - Lots of drivers still use MAX_COMMAND_SIZE. So I have left >> that #define but equate it to BLK_MAX_CDB. The way I see it >> and is reflected in the patch below is. >> MAX_COMMAND_SIZE - means: The longest fixed-length (*) SCSI CDB >> as per the SCSI standard and is not related >> to the implementation. >> BLK_MAX_CDB. - The allocated space at the request level >> >> (*)fixed-length here means commands that their size can be determined >> by their opcode and the CDB does not carry a length specifier, like >> the VARIABLE_LENGTH_CMD(0x7f) command. This is actually not exactly >> true and the SCSI standard also defines extended commands and >> vendor specific commands that can be bigger than 16 bytes. The kernel >> will support these using the same infrastructure used for VARLEN CDB's. >> So in effect MAX_COMMAND_SIZE means the maximum size command >> scsi-ml supports without specifying a cmd_len by ULD's > > When we do this, what happens to the minority of drivers that need the > command in DMAable memory ... or have you audited them all and we can > now dump the DMA pool allocation for SCSI commands? > > James > > Am I right in assuming that I only need to audited the drivers that have .unchecked_isa_dma set? I will redo this audit again, and report back. Boaz ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 2/3] block layer varlen-cdb 2008-02-10 18:59 ` [PATCHSET 0/3] varlen extended and vendor-specific cdbs Boaz Harrosh 2008-02-10 19:05 ` Subject: [PATCH 1/3] Let scsi_cmnd->cmnd use request->cmd buffer Boaz Harrosh @ 2008-02-10 19:09 ` Boaz Harrosh 2008-02-12 17:48 ` Christoph Hellwig 2008-02-10 19:12 ` [PATCH 3/3] scsi: varlen extended and vendor-specific cdbs Boaz Harrosh 2 siblings, 1 reply; 27+ messages in thread From: Boaz Harrosh @ 2008-02-10 19:09 UTC (permalink / raw) To: Christoph Hellwig, Bartlomiej Zolnierkiewicz, Jens Axboe, James Bottomley Cc: Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, linux-scsi - add varlen_cdb and varlen_cdb_len to hold a large user cdb if needed. They start as empty. Allocation of buffer must be done by user and held until request execution is done. - Since there can be either a fix_length command up to 16 bytes or a variable_length, larger then 16 bytes, commands but never both, we hold the two types in a union to save space. The presence of varlen_cdb_len and cmd_len==0 signals a varlen_cdb mode. - Please use added rq_{set,get}_varlen_cdb() to set every thing up in a way that will not confuse drivers that do not support varlen_cdb's. - Note that this patch does not add any size to struct request since the unsigned cmd_len is split here to 2 ushorts, which is more then enough. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> --- block/blk-core.c | 2 ++ include/linux/blkdev.h | 27 +++++++++++++++++++++++++-- 2 files changed, 27 insertions(+), 2 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 4afb39c..1c5cfa7 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -124,6 +124,8 @@ void rq_init(struct request_queue *q, struct request *rq) rq->end_io_data = NULL; rq->completion_data = NULL; rq->next_rq = NULL; + rq->varlen_cdb_len = 0; + rq->varlen_cdb = NULL; } static void req_bio_endio(struct request *rq, struct bio *bio, diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 90392a9..a8a6c20 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -211,8 +211,15 @@ struct request { /* * when request is used as a packet command carrier */ - unsigned int cmd_len; - unsigned char cmd[BLK_MAX_CDB]; + unsigned short cmd_len; + unsigned short varlen_cdb_len; /* length of varlen_cdb buffer */ + union { + unsigned char cmd[BLK_MAX_CDB]; + unsigned char *varlen_cdb;/* an optional variable-length cdb. + * points to a user buffer that must + * be valid until end of request + */ + }; unsigned int data_len; unsigned int sense_len; @@ -478,6 +485,22 @@ enum { #define rq_is_sync(rq) (rq_data_dir((rq)) == READ || (rq)->cmd_flags & REQ_RW_SYNC) #define rq_is_meta(rq) ((rq)->cmd_flags & REQ_RW_META) +static inline void rq_set_varlen_cdb(struct request *rq, u8 *cdb, short cdb_len) +{ + rq->cmd_len = 0; /* Make sure legacy drivers don't get confused */ + rq->varlen_cdb_len = cdb_len; + rq->varlen_cdb = cdb; +} + +/* If ! a varlen_cdb than return NULL */ +static inline u8 *rq_get_varlen_cdb(struct request *rq) +{ + if (!rq->cmd_len && rq->varlen_cdb_len) + return rq->varlen_cdb; + else + return NULL; +} + static inline int blk_queue_full(struct request_queue *q, int rw) { if (rw == READ) -- 1.5.3.3 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 2/3] block layer varlen-cdb 2008-02-10 19:09 ` [PATCH 2/3] block layer varlen-cdb Boaz Harrosh @ 2008-02-12 17:48 ` Christoph Hellwig 2008-02-12 17:54 ` Boaz Harrosh 0 siblings, 1 reply; 27+ messages in thread From: Christoph Hellwig @ 2008-02-12 17:48 UTC (permalink / raw) To: Boaz Harrosh Cc: Christoph Hellwig, Bartlomiej Zolnierkiewicz, Jens Axboe, James Bottomley, Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, linux-scsi On Sun, Feb 10, 2008 at 09:09:41PM +0200, Boaz Harrosh wrote: > - add varlen_cdb and varlen_cdb_len to hold a large user cdb > if needed. They start as empty. Allocation of buffer must > be done by user and held until request execution is done. > - Since there can be either a fix_length command up to 16 bytes > or a variable_length, larger then 16 bytes, commands but never > both, we hold the two types in a union to save space. The > presence of varlen_cdb_len and cmd_len==0 signals a varlen_cdb > mode. this one I'm a bit confused by, why can't we just set the length of the variable length command in cmd_len aswell, and if cmd_len > the length of the cmd array it's a variable length command? Note that this is both to keep the logic simpler and not to grow struct request further. Especially for the rather rare case of a bidi command. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 2/3] block layer varlen-cdb 2008-02-12 17:48 ` Christoph Hellwig @ 2008-02-12 17:54 ` Boaz Harrosh 2008-02-12 18:07 ` Boaz Harrosh 0 siblings, 1 reply; 27+ messages in thread From: Boaz Harrosh @ 2008-02-12 17:54 UTC (permalink / raw) To: Christoph Hellwig Cc: Bartlomiej Zolnierkiewicz, Jens Axboe, James Bottomley, Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, linux-scsi On Tue, Feb 12 2008 at 19:48 +0200, Christoph Hellwig <hch@infradead.org> wrote: > On Sun, Feb 10, 2008 at 09:09:41PM +0200, Boaz Harrosh wrote: >> - add varlen_cdb and varlen_cdb_len to hold a large user cdb >> if needed. They start as empty. Allocation of buffer must >> be done by user and held until request execution is done. >> - Since there can be either a fix_length command up to 16 bytes >> or a variable_length, larger then 16 bytes, commands but never >> both, we hold the two types in a union to save space. The >> presence of varlen_cdb_len and cmd_len==0 signals a varlen_cdb >> mode. > > this one I'm a bit confused by, why can't we just set the length > of the variable length command in cmd_len aswell, and if cmd_len > > the length of the cmd array it's a variable length command? > > Note that this is both to keep the logic simpler and not to grow > struct request further. Especially for the rather rare case > of a bidi command. Because this will be dangerous for the Legacy block devices. Unlike scsi drivers block drivers do not have a .max_cmnd_len and upper layer will not check to make sure that the device supports the longer command. If such a command goes through, lets say bsg the drivers do blindly memcpy(,,rq->cmd_len) and will crash. Better safe then sorry, at no cost. Boaz ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 2/3] block layer varlen-cdb 2008-02-12 17:54 ` Boaz Harrosh @ 2008-02-12 18:07 ` Boaz Harrosh 0 siblings, 0 replies; 27+ messages in thread From: Boaz Harrosh @ 2008-02-12 18:07 UTC (permalink / raw) To: Christoph Hellwig Cc: Bartlomiej Zolnierkiewicz, Jens Axboe, James Bottomley, Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, linux-scsi On Tue, Feb 12 2008 at 19:54 +0200, Boaz Harrosh <bharrosh@panasas.com> wrote: > On Tue, Feb 12 2008 at 19:48 +0200, Christoph Hellwig <hch@infradead.org> wrote: >> On Sun, Feb 10, 2008 at 09:09:41PM +0200, Boaz Harrosh wrote: >>> - add varlen_cdb and varlen_cdb_len to hold a large user cdb >>> if needed. They start as empty. Allocation of buffer must >>> be done by user and held until request execution is done. >>> - Since there can be either a fix_length command up to 16 bytes >>> or a variable_length, larger then 16 bytes, commands but never >>> both, we hold the two types in a union to save space. The >>> presence of varlen_cdb_len and cmd_len==0 signals a varlen_cdb >>> mode. >> this one I'm a bit confused by, why can't we just set the length >> of the variable length command in cmd_len aswell, and if cmd_len > >> the length of the cmd array it's a variable length command? >> >> Note that this is both to keep the logic simpler and not to grow >> struct request further. Especially for the rather rare case >> of a bidi command. > > Because this will be dangerous for the Legacy block devices. > Unlike scsi drivers block drivers do not have a .max_cmnd_len > and upper layer will not check to make sure that the device supports > the longer command. If such a command goes through, lets say bsg > the drivers do blindly memcpy(,,rq->cmd_len) and will crash. > Better safe then sorry, at no cost. > > Boaz > > - PS: the struct request does not grow. Because before cmd_len was unsigned but now both cmd_len and varlen_cdb_len are short so it is the same as before. Boaz ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 3/3] scsi: varlen extended and vendor-specific cdbs 2008-02-10 18:59 ` [PATCHSET 0/3] varlen extended and vendor-specific cdbs Boaz Harrosh 2008-02-10 19:05 ` Subject: [PATCH 1/3] Let scsi_cmnd->cmnd use request->cmd buffer Boaz Harrosh 2008-02-10 19:09 ` [PATCH 2/3] block layer varlen-cdb Boaz Harrosh @ 2008-02-10 19:12 ` Boaz Harrosh 2008-02-12 17:51 ` Christoph Hellwig 2 siblings, 1 reply; 27+ messages in thread From: Boaz Harrosh @ 2008-02-10 19:12 UTC (permalink / raw) To: Christoph Hellwig, Bartlomiej Zolnierkiewicz, Jens Axboe, James Bottomley Cc: Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, linux-scsi Add support for variable-length, extended, and vendor specific CDBs to scsi-ml. It is now possible for initiators and ULD's to issue these types of commands. LLDs need not change much. All they need is to raise the .max_cmd_len to the longest command they support (see iscsi patches). - clean-up some code paths that did not expect commands to be larger than 16, and change cmd_len members' type to short as char is not enough. - Add support for varlen_cdb in scsi_execute. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> --- block/scsi_ioctl.c | 4 ++-- drivers/scsi/constants.c | 10 +++------- drivers/scsi/scsi.c | 22 +++++++++++----------- drivers/scsi/scsi_lib.c | 27 ++++++++++++++++++++++----- include/scsi/scsi.h | 40 +++++++++++++++++++++++++++++++++------- include/scsi/scsi_cmnd.h | 2 +- include/scsi/scsi_host.h | 8 +++----- 7 files changed, 75 insertions(+), 38 deletions(-) diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c index 9675b34..a1d7070 100644 --- a/block/scsi_ioctl.c +++ b/block/scsi_ioctl.c @@ -33,13 +33,13 @@ #include <scsi/scsi_cmnd.h> /* Command group 3 is reserved and should never be used. */ -const unsigned char scsi_command_size[8] = +const unsigned char scsi_command_size_tbl[8] = { 6, 10, 10, 12, 16, 12, 10, 10 }; -EXPORT_SYMBOL(scsi_command_size); +EXPORT_SYMBOL(scsi_command_size_tbl); #include <scsi/sg.h> diff --git a/drivers/scsi/constants.c b/drivers/scsi/constants.c index 403a7f2..9785d73 100644 --- a/drivers/scsi/constants.c +++ b/drivers/scsi/constants.c @@ -28,7 +28,6 @@ #define SERVICE_ACTION_OUT_12 0xa9 #define SERVICE_ACTION_IN_16 0x9e #define SERVICE_ACTION_OUT_16 0x9f -#define VARIABLE_LENGTH_CMD 0x7f @@ -210,7 +209,7 @@ static void print_opcode_name(unsigned char * cdbp, int cdb_len) cdb0 = cdbp[0]; switch(cdb0) { case VARIABLE_LENGTH_CMD: - len = cdbp[7] + 8; + len = scsi_varlen_cdb_length(cdbp); if (len < 10) { printk("short variable length command, " "len=%d ext_len=%d", len, cdb_len); @@ -300,7 +299,7 @@ static void print_opcode_name(unsigned char * cdbp, int cdb_len) cdb0 = cdbp[0]; switch(cdb0) { case VARIABLE_LENGTH_CMD: - len = cdbp[7] + 8; + len = scsi_varlen_cdb_length(cdbp); if (len < 10) { printk("short opcode=0x%x command, len=%d " "ext_len=%d", cdb0, len, cdb_len); @@ -335,10 +334,7 @@ void __scsi_print_command(unsigned char *cdb) int k, len; print_opcode_name(cdb, 0); - if (VARIABLE_LENGTH_CMD == cdb[0]) - len = cdb[7] + 8; - else - len = COMMAND_SIZE(cdb[0]); + len = scsi_command_size(cdb); /* print out all bytes in cdb */ for (k = 0; k < len; ++k) printk(" %02x", cdb[k]); diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index fecba05..944fafa 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -79,15 +79,6 @@ static void scsi_done(struct scsi_cmnd *cmd); #define MIN_RESET_PERIOD (15*HZ) /* - * Macro to determine the size of SCSI command. This macro takes vendor - * unique commands into account. SCSI commands in groups 6 and 7 are - * vendor unique and we will depend upon the command length being - * supplied correctly in cmd_len. - */ -#define CDB_SIZE(cmd) (((((cmd)->cmnd[0] >> 5) & 7) < 6) ? \ - COMMAND_SIZE((cmd)->cmnd[0]) : (cmd)->cmd_len) - -/* * Note - the initial logging level can be set here to log events at boot time. * After the system is up, you may enable logging via the /proc interface. */ @@ -525,6 +516,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd) unsigned long flags = 0; unsigned long timeout; int rtn = 0; + unsigned cmd_len; /* check if the device is still usable */ if (unlikely(cmd->device->sdev_state == SDEV_DEL)) { @@ -606,9 +598,17 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd) * Before we queue this command, check if the command * length exceeds what the host adapter can handle. */ - if (CDB_SIZE(cmd) > cmd->device->host->max_cmd_len) { + cmd_len = cmd->cmd_len; + if (!cmd_len) { + BUG_ON(cmd->cmnd[0] == VARIABLE_LENGTH_CMD); + cmd_len = COMMAND_SIZE((cmd)->cmnd[0]); + } + + if (cmd_len > cmd->device->host->max_cmd_len) { SCSI_LOG_MLQUEUE(3, - printk("queuecommand : command too long.\n")); + printk("queuecommand : command too long. " + "cdb_size=%d host->max_cmd_len=%d\n", + cmd->cmd_len, cmd->device->host->max_cmd_len)); cmd->result = (DID_ABORT << 16); scsi_done(cmd); diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 426aa11..3029b31 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -193,8 +193,16 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd, buffer, bufflen, __GFP_WAIT)) goto out; - req->cmd_len = COMMAND_SIZE(cmd[0]); - memcpy(req->cmd, cmd, req->cmd_len); + if (cmd[0] == VARIABLE_LENGTH_CMD) + rq_set_varlen_cdb(req, (u8 *)cmd, scsi_varlen_cdb_length(cmd)); + else { + req->cmd_len = COMMAND_SIZE(cmd[0]); + memcpy(req->cmd, cmd, req->cmd_len); + if (req->cmd_len < MAX_COMMAND_SIZE) + memset(&req->cmd[req->cmd_len], 0, + MAX_COMMAND_SIZE - req->cmd_len); + } + req->sense = sense; req->sense_len = 0; req->retries = retries; @@ -445,7 +453,7 @@ static void scsi_init_cmd_errh(struct scsi_cmnd *cmd) scsi_set_resid(cmd, 0); memset(cmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE); if (cmd->cmd_len == 0) - cmd->cmd_len = COMMAND_SIZE(cmd->cmnd[0]); + cmd->cmd_len = scsi_command_size(cmd->cmnd); } void scsi_device_unbusy(struct scsi_device *sdev) @@ -1100,6 +1108,7 @@ static struct scsi_cmnd *scsi_get_cmd_from_req(struct scsi_device *sdev, int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req) { struct scsi_cmnd *cmd; + u8 *varlen_cdb; int ret = scsi_prep_state_check(sdev, req); if (ret != BLKPREP_OK) @@ -1131,14 +1140,22 @@ int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req) req->buffer = NULL; } - cmd->cmd_len = req->cmd_len; + varlen_cdb = rq_get_varlen_cdb(req); + if (varlen_cdb) { + cmd->cmnd = varlen_cdb; + cmd->cmd_len = req->varlen_cdb_len; + } else if (req->cmd_len) + cmd->cmd_len = req->cmd_len; + else + cmd->cmd_len = scsi_command_size(cmd->cmnd); + if (!req->data_len) cmd->sc_data_direction = DMA_NONE; else if (rq_data_dir(req) == WRITE) cmd->sc_data_direction = DMA_TO_DEVICE; else cmd->sc_data_direction = DMA_FROM_DEVICE; - + cmd->transfersize = req->data_len; cmd->allowed = req->retries; cmd->timeout_per_command = req->timeout; diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h index 1f74bcd..0ba902c 100644 --- a/include/scsi/scsi.h +++ b/include/scsi/scsi.h @@ -30,13 +30,6 @@ #endif /* - * SCSI command lengths - */ - -extern const unsigned char scsi_command_size[8]; -#define COMMAND_SIZE(opcode) scsi_command_size[((opcode) >> 5) & 7] - -/* * Special value for scanning to specify scanning or rescanning of all * possible channels, (target) ids, or luns on a given shost. */ @@ -109,6 +102,7 @@ extern const unsigned char scsi_command_size[8]; #define MODE_SENSE_10 0x5a #define PERSISTENT_RESERVE_IN 0x5e #define PERSISTENT_RESERVE_OUT 0x5f +#define VARIABLE_LENGTH_CMD 0x7f #define REPORT_LUNS 0xa0 #define MAINTENANCE_IN 0xa3 #define MOVE_MEDIUM 0xa5 @@ -136,6 +130,38 @@ extern const unsigned char scsi_command_size[8]; #define ATA_12 0xa1 /* 12-byte pass-thru */ /* + * SCSI command lengths + */ + +#define SCSI_MAX_VARLEN_CDB_SIZE 260 + +/* defined in T10 SCSI Primary Commands-2 (SPC2) */ +struct scsi_varlen_cdb_hdr { + u8 opcode; /* opcode always == VARIABLE_LENGTH_CMD */ + u8 control; + u8 misc[5]; + u8 additional_cdb_length; /* total cdb length - 8 */ + __be16 service_action; + /* service specific data follows */ +}; + +static inline unsigned +scsi_varlen_cdb_length(const void *hdr) +{ + return ((struct scsi_varlen_cdb_hdr*)hdr)->additional_cdb_length + 8; +} + +extern const unsigned char scsi_command_size_tbl[8]; +#define COMMAND_SIZE(opcode) scsi_command_size_tbl[((opcode) >> 5) & 7] + +static inline unsigned +scsi_command_size(const unsigned char *cmnd) +{ + return (cmnd[0] == VARIABLE_LENGTH_CMD) ? scsi_varlen_cdb_length(cmnd) : + COMMAND_SIZE(cmnd[0]); +} + +/* * SCSI Architecture Model (SAM) Status codes. Taken from SAM-3 draft * T10/1561-D Revision 4 Draft dated 7th November 2002. */ diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h index 5c21ddc..c32d0da 100644 --- a/include/scsi/scsi_cmnd.h +++ b/include/scsi/scsi_cmnd.h @@ -66,7 +66,7 @@ struct scsi_cmnd { int allowed; int timeout_per_command; - unsigned char cmd_len; + unsigned short cmd_len; enum dma_data_direction sc_data_direction; /* These elements define the operation we are about to perform */ diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h index d1299e9..72e0f0d 100644 --- a/include/scsi/scsi_host.h +++ b/include/scsi/scsi_host.h @@ -568,13 +568,11 @@ struct Scsi_Host { /* * The maximum length of SCSI commands that this host can accept. * Probably 12 for most host adapters, but could be 16 for others. + * or 260 if the driver supports variable length cdbs. * For drivers that don't set this field, a value of 12 is - * assumed. I am leaving this as a number rather than a bit - * because you never know what subsequent SCSI standards might do - * (i.e. could there be a 20 byte or a 24-byte command a few years - * down the road?). + * assumed. */ - unsigned char max_cmd_len; + unsigned short max_cmd_len; int this_id; int can_queue; -- 1.5.3.3 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 3/3] scsi: varlen extended and vendor-specific cdbs 2008-02-10 19:12 ` [PATCH 3/3] scsi: varlen extended and vendor-specific cdbs Boaz Harrosh @ 2008-02-12 17:51 ` Christoph Hellwig 2008-02-12 18:17 ` Boaz Harrosh 0 siblings, 1 reply; 27+ messages in thread From: Christoph Hellwig @ 2008-02-12 17:51 UTC (permalink / raw) To: Boaz Harrosh Cc: Christoph Hellwig, Bartlomiej Zolnierkiewicz, Jens Axboe, James Bottomley, Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, linux-scsi On Sun, Feb 10, 2008 at 09:12:02PM +0200, Boaz Harrosh wrote: > @@ -525,6 +516,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd) > unsigned long flags = 0; > unsigned long timeout; > int rtn = 0; > + unsigned cmd_len; > > /* check if the device is still usable */ > if (unlikely(cmd->device->sdev_state == SDEV_DEL)) { > @@ -606,9 +598,17 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd) > * Before we queue this command, check if the command > * length exceeds what the host adapter can handle. > */ > - if (CDB_SIZE(cmd) > cmd->device->host->max_cmd_len) { > + cmd_len = cmd->cmd_len; > + if (!cmd_len) { > + BUG_ON(cmd->cmnd[0] == VARIABLE_LENGTH_CMD); > + cmd_len = COMMAND_SIZE((cmd)->cmnd[0]); > + } This looks odd to me. Shouldn't we make sure cmd_len is always initialized in a single place for either varlen or fixed length commands and not have things like this? ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/3] scsi: varlen extended and vendor-specific cdbs 2008-02-12 17:51 ` Christoph Hellwig @ 2008-02-12 18:17 ` Boaz Harrosh 0 siblings, 0 replies; 27+ messages in thread From: Boaz Harrosh @ 2008-02-12 18:17 UTC (permalink / raw) To: Christoph Hellwig Cc: Bartlomiej Zolnierkiewicz, Jens Axboe, James Bottomley, Sebastian Siewior, Tejun Heo, Sergei Shtylyov, linux-ide, linux-scsi On Tue, Feb 12 2008 at 19:51 +0200, Christoph Hellwig <hch@infradead.org> wrote: > On Sun, Feb 10, 2008 at 09:12:02PM +0200, Boaz Harrosh wrote: >> @@ -525,6 +516,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd) >> unsigned long flags = 0; >> unsigned long timeout; >> int rtn = 0; >> + unsigned cmd_len; >> >> /* check if the device is still usable */ >> if (unlikely(cmd->device->sdev_state == SDEV_DEL)) { >> @@ -606,9 +598,17 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd) >> * Before we queue this command, check if the command >> * length exceeds what the host adapter can handle. >> */ >> - if (CDB_SIZE(cmd) > cmd->device->host->max_cmd_len) { >> + cmd_len = cmd->cmd_len; >> + if (!cmd_len) { >> + BUG_ON(cmd->cmnd[0] == VARIABLE_LENGTH_CMD); >> + cmd_len = COMMAND_SIZE((cmd)->cmnd[0]); >> + } > > This looks odd to me. Shouldn't we make sure cmd_len is always > initialized in a single place for either varlen or fixed length > commands and not have things like this? > I used to have a BUG_ON(!cmd_len) here at around the 2.6.20 kernels And it would trigger. I'm not sure exactly how. Through a retransmit or something. Reinspecting all command submission paths, I agree with you that it should not happen anymore. I will look at it some more and run tests. Perhaps if this code will sit in -mm tree for a while I can put the BUG_ON back and see if it triggers again. What do you recommend? Boaz ^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2008-02-13 9:24 UTC | newest] Thread overview: 27+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-02-09 19:32 Current git --> kaboom [bisect] seems IDE related Sebastian Siewior 2008-02-09 20:28 ` Bartlomiej Zolnierkiewicz 2008-02-09 21:22 ` Sebastian Siewior 2008-02-09 23:06 ` Bartlomiej Zolnierkiewicz 2008-02-10 5:26 ` Christoph Hellwig 2008-02-10 13:38 ` Bartlomiej Zolnierkiewicz 2008-02-10 14:19 ` James Bottomley 2008-02-10 18:32 ` Bartlomiej Zolnierkiewicz 2008-02-10 19:51 ` Sebastian Siewior 2008-02-10 23:16 ` Bartlomiej Zolnierkiewicz 2008-02-11 16:30 ` Sergei Shtylyov 2008-02-11 19:41 ` Bartlomiej Zolnierkiewicz 2008-02-10 14:43 ` Christoph Hellwig 2008-02-10 15:07 ` Boaz Harrosh 2008-02-10 18:59 ` [PATCHSET 0/3] varlen extended and vendor-specific cdbs Boaz Harrosh 2008-02-10 19:05 ` Subject: [PATCH 1/3] Let scsi_cmnd->cmnd use request->cmd buffer Boaz Harrosh 2008-02-12 17:45 ` Christoph Hellwig 2008-02-12 18:10 ` Boaz Harrosh 2008-02-12 19:41 ` James Bottomley 2008-02-13 9:24 ` Boaz Harrosh 2008-02-10 19:09 ` [PATCH 2/3] block layer varlen-cdb Boaz Harrosh 2008-02-12 17:48 ` Christoph Hellwig 2008-02-12 17:54 ` Boaz Harrosh 2008-02-12 18:07 ` Boaz Harrosh 2008-02-10 19:12 ` [PATCH 3/3] scsi: varlen extended and vendor-specific cdbs Boaz Harrosh 2008-02-12 17:51 ` Christoph Hellwig 2008-02-12 18:17 ` Boaz Harrosh
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).