* Linux Virtual SCSI HBAs and Virtual disks @ 2007-01-16 10:22 Aboo Valappil 2007-01-16 21:52 ` Erik Mouw 2007-01-17 1:50 ` Douglas Gilbert 0 siblings, 2 replies; 29+ messages in thread From: Aboo Valappil @ 2007-01-16 10:22 UTC (permalink / raw) To: linux-scsi Hi All, I have tried this before, but I guess I was unsuccessful in presenting it properly on the mailing list. I think it is really useful, especially for prototyping and for people who want to develop their own SCSI targets and transports. A few people have told me about the SCSI target and initiator implementation in Xen, but I do not think it is this simple, and it might take a while to port it to the mainline Linux kernel. At the moment, there is nothing like this available in such a simple form. Please visit this site: http://vscsihba.aboo.org. I have put up a complete description of the project and the source code. I would appreciate it if you could go through it and share your thoughts. This is my final attempt on this mailing list before I throw away all of my work. Thanks Aboo ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-16 10:22 Linux Virtual SCSI HBAs and Virtual disks Aboo Valappil @ 2007-01-16 21:52 ` Erik Mouw 2007-01-16 23:01 ` aboo 2007-01-17 1:50 ` Douglas Gilbert 1 sibling, 1 reply; 29+ messages in thread From: Erik Mouw @ 2007-01-16 21:52 UTC (permalink / raw) To: Aboo Valappil; +Cc: linux-scsi On Tue, Jan 16, 2007 at 09:22:29PM +1100, Aboo Valappil wrote: [...] > Please visit this site http://vscsihba.aboo.org. I put a complete > description of the project and the source code. I appreciate if you > could go through it and put your thoughts.... You make it hard to review, the hostname doesn't resolve: host -a vscsihba.aboo.org vscsihba.aboo.org does not exist, try again Erik -- +-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 -- | Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-16 21:52 ` Erik Mouw @ 2007-01-16 23:01 ` aboo 0 siblings, 0 replies; 29+ messages in thread From: aboo @ 2007-01-16 23:01 UTC (permalink / raw) To: linux-scsi; +Cc: Erik Mouw Sorry about the hostname resolution failure earlier. The link should be working now; please check it out. Aboo On Tue, 16 Jan 2007 22:52:24 +0100, Erik Mouw <erik@harddisk-recovery.com> wrote: > You make it hard to review, the hostname doesn't resolve: > > host -a vscsihba.aboo.org > vscsihba.aboo.org does not exist, try again > > Erik [...] ------------------------------------- Aboo.Org - Compliments From A & J :) ------------------------------------- ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-16 10:22 Linux Virtual SCSI HBAs and Virtual disks Aboo Valappil 2007-01-16 21:52 ` Erik Mouw @ 2007-01-17 1:50 ` Douglas Gilbert 2007-01-17 8:36 ` Stefan Richter 1 sibling, 1 reply; 29+ messages in thread From: Douglas Gilbert @ 2007-01-17 1:50 UTC (permalink / raw) To: Aboo Valappil; +Cc: linux-scsi Aboo Valappil wrote: [...] > Please visit this site http://vscsihba.aboo.org. I put a complete > description of the project and the source code. I appreciate if you > could go through it and put your thoughts.... This is my final attempt > in this mailing list before I throw away whole of my work. Throwing it away sounds a bit drastic. It took me a while to find the tarball on your site. Perhaps you could put it in a table under a "Downloads" section. The table would be for different versions, as it looks like you may need a new one for bleeding edge kernels. 
I didn't get far trying to build the kernel module against lk 2.6.20-rc5: # make make -C /lib/modules/2.6.20-rc5/build M=/home/upgrades/apps/vscsihba1/vscsihba1/kernel modules make[1]: Entering directory `/usr/src/linux-2.6.19' CC [M] /home/upgrades/apps/vscsihba1/vscsihba1/kernel/hba.o /home/upgrades/apps/vscsihba1/vscsihba1/kernel/hba.c:26: warning: ‘kmem_cache_t’ is deprecated CC [M] /home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.o /home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.c:263:51: error: macro "INIT_WORK" passed 3 arguments, but takes just 2 /home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.c: In function ‘scsitap_ctl_ioctl’: /home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.c:263: error: ‘INIT_WORK’ undeclared (first use in this function) /home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.c:263: error: (Each undeclared identifier is reported only once /home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.c:263: error: for each function it appears in.) make[2]: *** [/home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.o] Error 1 make[1]: *** [_module_/home/upgrades/apps/vscsihba1/vscsihba1/kernel] Error 2 make[1]: Leaving directory `/usr/src/linux-2.6.19' make: *** [modules] Error 2 Doug Gilbert - To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-17 1:50 ` Douglas Gilbert @ 2007-01-17 8:36 ` Stefan Richter 2007-01-17 10:24 ` Aboo Valappil 0 siblings, 1 reply; 29+ messages in thread From: Stefan Richter @ 2007-01-17 8:36 UTC (permalink / raw) To: dougg; +Cc: Aboo Valappil, linux-scsi Douglas Gilbert wrote: > The table would be for different versions as it looks > like you may need a new one for bleeding edge kernels. > > I didn't get far trying to build the kernel module > against lk 2.6.20-rc5: > > # make > make -C /lib/modules/2.6.20-rc5/build M=/home/upgrades/apps/vscsihba1/vscsihba1/kernel modules > make[1]: Entering directory `/usr/src/linux-2.6.19' > CC [M] /home/upgrades/apps/vscsihba1/vscsihba1/kernel/hba.o > /home/upgrades/apps/vscsihba1/vscsihba1/kernel/hba.c:26: warning: ‘kmem_cache_t’ is deprecated > CC [M] /home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.o > /home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.c:263:51: error: macro "INIT_WORK" passed 3 arguments, but takes just 2 [...] Aboo, the workqueue API changes after 2.6.19 are explained, for example, here: http://lwn.net/Articles/213149/ There are a lot of workqueue API conversion patches in 2.6.20-rc1 which can be taken as examples. The first step when converting to the new API is to determine whether the work sometimes has to be delayed or can always be queued as immediate work. In the latter case, a slimmed-down variant of delayed work is used. The conversion away from kmem_cache_t is trivial. There are also some patches in 2.6.20-rc1 or later to use as examples. -- Stefan Richter -=====-=-=== ---= =---= http://arcgraph.de/sr/ - To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-17 8:36 ` Stefan Richter @ 2007-01-17 10:24 ` Aboo Valappil 2007-01-17 22:20 ` Douglas Gilbert 0 siblings, 1 reply; 29+ messages in thread From: Aboo Valappil @ 2007-01-17 10:24 UTC (permalink / raw) To: Stefan Richter; +Cc: dougg, linux-scsi Hi All, Thanks, everyone, for taking a look at this. I think I have modified it to support the latest kernel. Unfortunately I could not test it with the 2.6.20 kernel due to some issues between my laptop and the 2.6.20 kernel, but it should work with 2.6.20 after this modification. The modified version is available through http://vscsihba.aboo.org/vscsihbav202.tgz. 1. I fixed the kmem_cache issue for sure. 2. I think I worked around the INIT_WORK change. Here is the diff: [root@goobu kernel]# diff device.c ../../vscsihba2/kernel/device.c 4,8c4 < < struct scsitap_work_t { < struct scsitap_session *session; < struct work_struct work; < } scsitap_work; --- > struct work_struct scsitap_work; 230c226 < void scsitap_scan_hba (void *work) --- > void scsitap_scan_hba (void *ptr) 232,233c228,229 < struct scsitap_work_t *sw=container_of(work,struct scsitap_work_t, work) ; < struct scsitap_session *session=sw->session; --- > > struct scsitap_session *session=(struct scsitap_session *)ptr; 239,240c235 < < --- > 267,269c262,264 < scsitap_work.session=session; < SCSITAP_INIT_WORK(&scsitap_work.work,scsitap_scan_hba); < schedule_work(&scsitap_work.work); --- > > INIT_WORK(&scsitap_work,scsitap_scan_hba,session); > schedule_work(&scsitap_work); [root@goobu kernel]# diff ../include/vscsihba.h ../../vscsihba2/include/vscsihba.h 16d15 < #include <linux/version.h> 51,56d49 < #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,20) < #define SCSITAP_INIT_WORK(work, func) INIT_WORK((work), (func), (void *)(work)); < #else < #define SCSITAP_INIT_WORK(work, func) INIT_WORK((work), ( void (*) (struct work_struct *))(func)); < #endif < Aboo Stefan Richter wrote: > Douglas Gilbert wrote: 
> >> I didn't get far trying to build the kernel module >> against lk 2.6.20-rc5: [...] > Aboo, > the workqueue API changes after 2.6.19 are for example explained here: > http://lwn.net/Articles/213149/ > [...] - To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-17 10:24 ` Aboo Valappil @ 2007-01-17 22:20 ` Douglas Gilbert 2007-01-17 21:59 ` aboo 2007-01-21 9:48 ` Aboo Valappil 0 siblings, 2 replies; 29+ messages in thread From: Douglas Gilbert @ 2007-01-17 22:20 UTC (permalink / raw) To: Aboo Valappil; +Cc: Stefan Richter, linux-scsi Aboo Valappil wrote: > Hi All, > > Thanks everyone to have a look at this. > > I think i modified to have the latest kernel support. Unfortunately I > could not test it with 2.6.20 kernel due to some issues in my laptop and > 2.6.20 kernel. But it should work with 2.6.20 with this modification. > > The modified version is available through > http://vscsihba.aboo.org/vscsihbav202.tgz. > > 1. I fixed the kmem_cache issue for sure. > 2. I think i got around with INIT_WORK ... Made the following > modifications ... Perhaps you could get some of my scsi tools (e.g. sdparm and sg3_utils) and make sure that vscsihba can handle everything they can throw at it. If the user space doesn't support a SCSI command then your driver should fail gracefully (i.e. CHECK CONDITION, etc). Here is a worrying example: sdparm sends an INQUIRY and a couple of MODE SENSE(10) commands to a device. 
/dev/sda was created by your script: $ ./start_target.sh id=3 -files zz_lun0 $ sdparm /dev/sda /dev/sda: VirtualH VHD 0 <long wait> $ However dmesg showed this: vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device sd 0:0:0:0: SCSI error: return code = 0x00000002 end_request: I/O error, dev sda, sector 10240 Buffer I/O error on device sda, logical block 10240 vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device sd 0:0:0:0: SCSI error: return code = 0x00000002 end_request: I/O error, dev sda, sector 10240 Buffer I/O error on device sda, logical block 10240 vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device sd 0:0:0:0: SCSI error: return code = 0x00000002 end_request: I/O error, dev sda, sector 10240 Buffer I/O error on device sda, logical block 10240 vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device sd 0:0:0:0: SCSI error: return code = 0x00000002 end_request: I/O error, dev sda, sector 10240 Buffer I/O error on device sda, logical block 10240 vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device sd 0:0:0:0: SCSI error: return code = 0x00000002 end_request: I/O error, dev sda, sector 10240 Buffer I/O error on device sda, logical block 10240 vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device sd 0:0:0:0: SCSI error: return code = 0x00000002 end_request: I/O error, dev sda, sector 10240 
Buffer I/O error on device sda, logical block 10240 BUG: at kernel/sched.c:3388 sub_preempt_count() [<e1bf029c>] scsitap_eh_abort+0x1c/0x90 [vscsihba] [<c024fe22>] scsi_error_handler+0x3e2/0xbe0 [<c02d74f1>] __sched_text_start+0x2f1/0x660 [<c024fa40>] scsi_error_handler+0x0/0xbe0 [<c0131679>] kthread+0xa9/0xe0 [<c01315d0>] kthread+0x0/0xe0 [<c0103d0f>] kernel_thread_helper+0x7/0x18 ======================= vscsihba:3: Abortng command serial number : 94 BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 [<c02d7684>] __sched_text_start+0x484/0x660 [<c013183b>] autoremove_wake_function+0x1b/0x50 [<c01264a8>] lock_timer_base+0x28/0x70 [<c01265f2>] __mod_timer+0x92/0xd0 [<c02d826b>] schedule_timeout+0x4b/0xd0 [<c01269c0>] process_timeout+0x0/0x10 [<c02d7bbc>] wait_for_completion_timeout+0x9c/0x130 [<c0119ee0>] default_wake_function+0x0/0x10 [<c024f3c9>] scsi_send_eh_cmnd+0x1b9/0x390 [<c011df3e>] vprintk+0x1fe/0x3a0 [<c024f805>] scsi_delete_timer+0x15/0x60 [<c024f624>] scsi_eh_tur+0x34/0xa0 [<c024fe69>] scsi_error_handler+0x429/0xbe0 [<c02d74f1>] __sched_text_start+0x2f1/0x660 [<c024fa40>] scsi_error_handler+0x0/0xbe0 [<c0131679>] kthread+0xa9/0xe0 [<c01315d0>] kthread+0x0/0xe0 [<c0103d0f>] kernel_thread_helper+0x7/0x18 ======================= vscsihba:3: Abortng command serial number : 95 vscsihba:3: In Reset Device BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 [<c02d7684>] __sched_text_start+0x484/0x660 [<c011df3e>] vprintk+0x1fe/0x3a0 [<c01264a8>] lock_timer_base+0x28/0x70 [<c01265f2>] __mod_timer+0x92/0xd0 [<c02d826b>] schedule_timeout+0x4b/0xd0 [<c01269c0>] process_timeout+0x0/0x10 [<c02d7bbc>] wait_for_completion_timeout+0x9c/0x130 [<c0119ee0>] default_wake_function+0x0/0x10 [<c024f3c9>] scsi_send_eh_cmnd+0x1b9/0x390 [<c024f805>] scsi_delete_timer+0x15/0x60 [<c024f624>] scsi_eh_tur+0x34/0xa0 [<e1bf00cd>] scsitap_eh_device_reset+0x1d/0x30 [vscsihba] [<c02503a8>] scsi_error_handler+0x968/0xbe0 [<c02d74f1>] __sched_text_start+0x2f1/0x660 [<c024fa40>] 
scsi_error_handler+0x0/0xbe0 [<c0131679>] kthread+0xa9/0xe0 [<c01315d0>] kthread+0x0/0xe0 [<c0103d0f>] kernel_thread_helper+0x7/0x18 ======================= vscsihba:3: Abortng command serial number : 96 vscsihba:3: In Reset Host BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 [<c02d7684>] __sched_text_start+0x484/0x660 [<c01264a8>] lock_timer_base+0x28/0x70 [<c01265f2>] __mod_timer+0x92/0xd0 [<c02d826b>] schedule_timeout+0x4b/0xd0 [<c01269c0>] process_timeout+0x0/0x10 [<c01273d5>] msleep+0x25/0x30 [<c024efb1>] scsi_try_host_reset+0xa1/0xd0 [<c0250150>] scsi_error_handler+0x710/0xbe0 [<c02d74f1>] __sched_text_start+0x2f1/0x660 [<c024fa40>] scsi_error_handler+0x0/0xbe0 [<c0131679>] kthread+0xa9/0xe0 [<c01315d0>] kthread+0x0/0xe0 [<c0103d0f>] kernel_thread_helper+0x7/0x18 ======================= BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 [<c02d7684>] __sched_text_start+0x484/0x660 [<c01264a8>] lock_timer_base+0x28/0x70 [<c01265f2>] __mod_timer+0x92/0xd0 [<c02d826b>] schedule_timeout+0x4b/0xd0 [<c01269c0>] process_timeout+0x0/0x10 [<c02d7bbc>] wait_for_completion_timeout+0x9c/0x130 [<c0119ee0>] default_wake_function+0x0/0x10 [<c024f3c9>] scsi_send_eh_cmnd+0x1b9/0x390 [<c02d74f1>] __sched_text_start+0x2f1/0x660 [<c01265f2>] __mod_timer+0x92/0xd0 [<c02d8272>] schedule_timeout+0x52/0xd0 [<c024f624>] scsi_eh_tur+0x34/0xa0 [<c02501a0>] scsi_error_handler+0x760/0xbe0 [<c02d74f1>] __sched_text_start+0x2f1/0x660 [<c024fa40>] scsi_error_handler+0x0/0xbe0 [<c0131679>] kthread+0xa9/0xe0 [<c01315d0>] kthread+0x0/0xe0 [<c0103d0f>] kernel_thread_helper+0x7/0x18 ======================= vscsihba:3: Abortng command serial number : 97 sd 0:0:0:0: scsi: Device offlined - not ready after error recovery BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 [<c02d7684>] __sched_text_start+0x484/0x660 [<c013aa11>] module_put+0x31/0x60 [<c024bd6e>] scsi_device_put+0x3e/0x40 [<c024be5f>] __scsi_iterate_devices+0x6f/0x90 [<c024fa86>] scsi_error_handler+0x46/0xbe0 
[<c024fa40>] scsi_error_handler+0x0/0xbe0 [<c0131679>] kthread+0xa9/0xe0 [<c01315d0>] kthread+0x0/0xe0 [<c0103d0f>] kernel_thread_helper+0x7/0x18 ======================= Doug Gilbert ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-17 22:20 ` Douglas Gilbert @ 2007-01-17 21:59 ` aboo 2007-01-18 0:38 ` Stefan Richter 2007-01-21 9:48 ` Aboo Valappil 1 sibling, 1 reply; 29+ messages in thread From: aboo @ 2007-01-17 21:59 UTC (permalink / raw) To: dougg; +Cc: Stefan Richter, linux-scsi Thanks, Doug Gilbert, for looking into this one. I will fix it; this is really useful input. At the moment, the kernel driver cannot deal with sense codes. If a SCSI command fails, it returns a non-zero return code to the kernel SCSI driver, but it does not return any sense data (it just zeroes everything, making the SCSI driver think there is no sense information). I will implement sense codes inside the driver so that the kernel part stays stable even if user space does not support a command. I just wanted to pitch this idea in front of everyone; I will make further modifications to implement sense codes. Also, I am facing another challenge which I wanted to ask about. If the kernel or user space opens a SCSI device, there is no clean way of identifying whether that device is open. struct scsi_device does not have such a field. I can see an openers field in the scsi_disk structure inside sd.c, and I can see it incrementing in sd_open(). What is the best way to identify whether a scsi_device is open by the kernel (mount or volume manager) or by a user application (fsck or dd)? When I destroy a scsi_host, it kills all the scsi_devices associated with it, irrespective of whether those devices are open. Any thoughts? I looked at other SCSI drivers like BusLogic, QLogic, etc., but they create only one scsi_host per module (created during module load and destroyed during module unload). Since the module usage count goes up when someone opens a scsi_device underneath, the module cannot be unloaded, and hence the scsi_host cannot be destroyed. But I am creating the scsi_host dynamically and need to check whether anyone has opened a scsi_device underneath before destroying the host. 
Any thoughts? Aboo On Wed, 17 Jan 2007 17:20:08 -0500, Douglas Gilbert <dougg@torque.net> wrote: > Perhaps you could get some of my scsi tools (e.g. > sdparm and sg3_utils) and make sure that vscsihba > can handle everything they can throw at it. > If the user space doesn't support a SCSI command then > your driver should fail gracefully (i.e. CHECK CONDITION, > etc). [...] ------------------------------------- Aboo.Org - Compliments From A & J :) ------------------------------------- ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-17 21:59 ` aboo @ 2007-01-18 0:38 ` Stefan Richter 0 siblings, 0 replies; 29+ messages in thread From: Stefan Richter @ 2007-01-18 0:38 UTC (permalink / raw) To: aboo; +Cc: dougg, linux-scsi aboo wrote: > Also, I am facing another challenge which I wanted to ask. If the kernel or > user space opens a scsi device, there is no healthy way of identifying if a > scsi device is open or not. The struct scsi_device does not have a field. I > can see a openers field in the scsi_disk structure inside sd.c. I can see it > incrimenting in sd_open(). What is the best to way to identify if a > scsi_device id open by kernel (mount or volumes manager) or use application > (fsck or dd)? When a destroy a scsi_host, it kills all the scsi_devices > associated with it irrespective of those devices are open or not. Any > thoughts? I thought about this myself too while working on the sbp2 driver. This is a low-level driver from the SCSI perspective and a high-level driver from the FireWire perspective. It was possible to unload a FireWire low-level driver which provided the interconnect to a SBP-2 LU while that LU's device file was in use --- with the obvious result of sudden loss of connection. As a simple solution, sbp2 now blocks unloading of the FireWire low-level driver as long as it is logged in into a SBP-2 target (i.e. independent of actual usage of the device; longer than actually necessary). My idea for a more fine-grained solution was: - Add something like .slave_get(sdev) and .slave_put(sdev) to the scsi_host_template. - Let scsi_device_get() call shost->hostt->slave_get() if it exists. Let scsi_device_put() call shost->hostt->slave_put() if it exists. - In the two new hooks, an LLD can get and put whatever transport- or interconnect-related resources it needs, per LU. However I never found this matter pressing enough to post a respective patch here. 
From my perspective, it would have only been justified if other LLDs besides sbp2 would have profited from it, and I was too busy elsewhere to research that. -- Stefan Richter -=====-=-=== ---= =--=- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-17 22:20 ` Douglas Gilbert 2007-01-17 21:59 ` aboo @ 2007-01-21 9:48 ` Aboo Valappil 2007-01-21 9:53 ` Aboo Valappil 1 sibling, 1 reply; 29+ messages in thread From: Aboo Valappil @ 2007-01-21 9:48 UTC (permalink / raw) To: dougg; +Cc: Stefan Richter, linux-scsi Hi Doug Gilbert, sorry for the late reply. I am in the process of implementing sense codes and I will make them available. I tried sdparm, and it failed, but not due to the lack of sense codes and status. What happened was that the user space SCSI target died due to an unsupported SCSI command (a bug in the user space target). When it crashed, the SCSI disk served by that user space target was open in sdparm. The driver removed the scsi_host which was attached to the user space target, thinking that the last registered user space process had died. I think those stack traces are due to the EH thread trying to perform some sort of recovery on a SCSI command after the scsi_host had been removed! To prevent this, I implemented some checks. When the last user space application attached to the scsi_host dies, the driver checks to make sure that there are no open SCSI devices on that scsi_host. If some devices are open, it will not remove the scsi_host. This kind of approach should be OK because my design supports re-attaching a new user space target to an existing scsi_host. The design also allows starting multiple user space targets to serve one scsi_host in the kernel, and there is no issue at all even if one dies. Whoever gets the SCSI command will serve it, and sleep if nothing is available. Thanks Aboo Douglas Gilbert wrote: > Aboo Valappil wrote: > >> Hi All, >> >> Thanks everyone to have a look at this. >> >> I think i modified to have the latest kernel support. Unfortunately I >> could not test it with 2.6.20 kernel due to some issues in my laptop and >> 2.6.20 kernel. But it should work with 2.6.20 with this modification. 
>> >> The modified version is available through >> http://vscsihba.aboo.org/vscsihbav202.tgz. >> >> 1. I fixed the kmem_cache issue for sure. >> 2. I think i got around with INIT_WORK ... Made the following >> modifications ... >> > > Perhaps you could get some of my scsi tools (e.g. > sdparm and sg3_utils) and make sure that vscsihba > can handle everything they can throw at it. > If the user space doesn't support a SCSI command then > your driver should fail gracefully (i.e. CHECK CONDITION, > etc). > > Here is a worrying example: sdparm sends an INQUIRY > and a couple of MODE SENSE(10) commands to a device. > /dev/sda was created by your script: > $ ./start_target.sh id=3 -files zz_lun0 > > $ sdparm /dev/sda > /dev/sda: VirtualH VHD 0 > <long wait> > $ > > > However dmesg showed this: > > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > sd 0:0:0:0: SCSI error: return code = 0x00000002 > end_request: I/O error, dev sda, sector 10240 > Buffer I/O error on device sda, logical block 10240 > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > sd 0:0:0:0: SCSI error: return code = 0x00000002 > end_request: I/O error, dev sda, sector 10240 > Buffer I/O error on device sda, logical block 10240 > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > sd 0:0:0:0: SCSI error: return code = 0x00000002 > end_request: I/O error, dev sda, sector 10240 > Buffer I/O error on device sda, logical block 10240 > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > sd 0:0:0:0: SCSI 
error: return code = 0x00000002 > end_request: I/O error, dev sda, sector 10240 > Buffer I/O error on device sda, logical block 10240 > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > sd 0:0:0:0: SCSI error: return code = 0x00000002 > end_request: I/O error, dev sda, sector 10240 > Buffer I/O error on device sda, logical block 10240 > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > vscsihba:3: In Reset Device > sd 0:0:0:0: SCSI error: return code = 0x00000002 > end_request: I/O error, dev sda, sector 10240 > Buffer I/O error on device sda, logical block 10240 > BUG: at kernel/sched.c:3388 sub_preempt_count() > [<e1bf029c>] scsitap_eh_abort+0x1c/0x90 [vscsihba] > [<c024fe22>] scsi_error_handler+0x3e2/0xbe0 > [<c02d74f1>] __sched_text_start+0x2f1/0x660 > [<c024fa40>] scsi_error_handler+0x0/0xbe0 > [<c0131679>] kthread+0xa9/0xe0 > [<c01315d0>] kthread+0x0/0xe0 > [<c0103d0f>] kernel_thread_helper+0x7/0x18 > ======================= > vscsihba:3: Abortng command serial number : 94 > BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 > [<c02d7684>] __sched_text_start+0x484/0x660 > [<c013183b>] autoremove_wake_function+0x1b/0x50 > [<c01264a8>] lock_timer_base+0x28/0x70 > [<c01265f2>] __mod_timer+0x92/0xd0 > [<c02d826b>] schedule_timeout+0x4b/0xd0 > [<c01269c0>] process_timeout+0x0/0x10 > [<c02d7bbc>] wait_for_completion_timeout+0x9c/0x130 > [<c0119ee0>] default_wake_function+0x0/0x10 > [<c024f3c9>] scsi_send_eh_cmnd+0x1b9/0x390 > [<c011df3e>] vprintk+0x1fe/0x3a0 > [<c024f805>] scsi_delete_timer+0x15/0x60 > [<c024f624>] scsi_eh_tur+0x34/0xa0 > [<c024fe69>] scsi_error_handler+0x429/0xbe0 > [<c02d74f1>] __sched_text_start+0x2f1/0x660 > [<c024fa40>] scsi_error_handler+0x0/0xbe0 > [<c0131679>] kthread+0xa9/0xe0 > [<c01315d0>] kthread+0x0/0xe0 
> [<c0103d0f>] kernel_thread_helper+0x7/0x18 > ======================= > vscsihba:3: Abortng command serial number : 95 > vscsihba:3: In Reset Device > BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 > [<c02d7684>] __sched_text_start+0x484/0x660 > [<c011df3e>] vprintk+0x1fe/0x3a0 > [<c01264a8>] lock_timer_base+0x28/0x70 > [<c01265f2>] __mod_timer+0x92/0xd0 > [<c02d826b>] schedule_timeout+0x4b/0xd0 > [<c01269c0>] process_timeout+0x0/0x10 > [<c02d7bbc>] wait_for_completion_timeout+0x9c/0x130 > [<c0119ee0>] default_wake_function+0x0/0x10 > [<c024f3c9>] scsi_send_eh_cmnd+0x1b9/0x390 > [<c024f805>] scsi_delete_timer+0x15/0x60 > [<c024f624>] scsi_eh_tur+0x34/0xa0 > [<e1bf00cd>] scsitap_eh_device_reset+0x1d/0x30 [vscsihba] > [<c02503a8>] scsi_error_handler+0x968/0xbe0 > [<c02d74f1>] __sched_text_start+0x2f1/0x660 > [<c024fa40>] scsi_error_handler+0x0/0xbe0 > [<c0131679>] kthread+0xa9/0xe0 > [<c01315d0>] kthread+0x0/0xe0 > [<c0103d0f>] kernel_thread_helper+0x7/0x18 > ======================= > vscsihba:3: Abortng command serial number : 96 > vscsihba:3: In Reset Host > BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 > [<c02d7684>] __sched_text_start+0x484/0x660 > [<c01264a8>] lock_timer_base+0x28/0x70 > [<c01265f2>] __mod_timer+0x92/0xd0 > [<c02d826b>] schedule_timeout+0x4b/0xd0 > [<c01269c0>] process_timeout+0x0/0x10 > [<c01273d5>] msleep+0x25/0x30 > [<c024efb1>] scsi_try_host_reset+0xa1/0xd0 > [<c0250150>] scsi_error_handler+0x710/0xbe0 > [<c02d74f1>] __sched_text_start+0x2f1/0x660 > [<c024fa40>] scsi_error_handler+0x0/0xbe0 > [<c0131679>] kthread+0xa9/0xe0 > [<c01315d0>] kthread+0x0/0xe0 > [<c0103d0f>] kernel_thread_helper+0x7/0x18 > ======================= > BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 > [<c02d7684>] __sched_text_start+0x484/0x660 > [<c01264a8>] lock_timer_base+0x28/0x70 > [<c01265f2>] __mod_timer+0x92/0xd0 > [<c02d826b>] schedule_timeout+0x4b/0xd0 > [<c01269c0>] process_timeout+0x0/0x10 > [<c02d7bbc>] 
wait_for_completion_timeout+0x9c/0x130 > [<c0119ee0>] default_wake_function+0x0/0x10 > [<c024f3c9>] scsi_send_eh_cmnd+0x1b9/0x390 > [<c02d74f1>] __sched_text_start+0x2f1/0x660 > [<c01265f2>] __mod_timer+0x92/0xd0 > [<c02d8272>] schedule_timeout+0x52/0xd0 > [<c024f624>] scsi_eh_tur+0x34/0xa0 > [<c02501a0>] scsi_error_handler+0x760/0xbe0 > [<c02d74f1>] __sched_text_start+0x2f1/0x660 > [<c024fa40>] scsi_error_handler+0x0/0xbe0 > [<c0131679>] kthread+0xa9/0xe0 > [<c01315d0>] kthread+0x0/0xe0 > [<c0103d0f>] kernel_thread_helper+0x7/0x18 > ======================= > vscsihba:3: Abortng command serial number : 97 > sd 0:0:0:0: scsi: Device offlined - not ready after error recovery > BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 > [<c02d7684>] __sched_text_start+0x484/0x660 > [<c013aa11>] module_put+0x31/0x60 > [<c024bd6e>] scsi_device_put+0x3e/0x40 > [<c024be5f>] __scsi_iterate_devices+0x6f/0x90 > [<c024fa86>] scsi_error_handler+0x46/0xbe0 > [<c024fa40>] scsi_error_handler+0x0/0xbe0 > [<c0131679>] kthread+0xa9/0xe0 > [<c01315d0>] kthread+0x0/0xe0 > [<c0103d0f>] kernel_thread_helper+0x7/0x18 > ======================= > > > Doug Gilbert > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-21 9:48 ` Aboo Valappil @ 2007-01-21 9:53 ` Aboo Valappil 2007-01-21 11:24 ` Stefan Richter 0 siblings, 1 reply; 29+ messages in thread From: Aboo Valappil @ 2007-01-21 9:53 UTC (permalink / raw) To: Aboo Valappil; +Cc: dougg, Stefan Richter, linux-scsi Also, the modified version is available at http://vscsihba.aboo.org/vscsihbav202.tgz. I actually use the "openers" field in scsi_disk to find out whether anyone has the scsi_device open or not. Aboo Aboo Valappil wrote: > Hi Doug Gilbert, > > sorry for the late reply. I am in the process of implementing sense > code and I will make it available. > > I tried the sdparms and it failed not due to lack of sense code and > status. What happened was that the user space SCSI target died due to > a unsupported SCSI command (bug in user space target). When it > crashed, the SCSI disk served by that user space target was opened by > sdparms. The driver removed the scsi_host which was attached to user > space target, thinking that the last registered user space part died. > I think those stack traces are due to the EH thread trying perform > some sort of recovery on SCSI command, but the scsi host has been > removed! > > To prevent this, I implemented some checks. When the last user space > application attached to the scsi_host, the driver will check to make > sure that there is no open SCSI devices on that scsi_host. If there is > some devices open, it will not remove the scsi_host. This kind of > approach should be ok because my design supports re-attaching a new > user space target with an existing scsi_host. The design also allows > to start multiple user space target to serve one scsi_host in the > kernel and there is no issue at all even if one dies. Whoever gets the > SCSI command will serve it and sleep if nothing available. > > Thanks > > Aboo > > > Douglas Gilbert wrote: >> Aboo Valappil wrote: >> >>> Hi All, >>> >>> Thanks everyone to have a look at this. 
>>> >>> I think i modified to have the latest kernel support. Unfortunately I >>> could not test it with 2.6.20 kernel due to some issues in my laptop >>> and >>> 2.6.20 kernel. But it should work with 2.6.20 with this modification. >>> >>> The modified version is available through >>> http://vscsihba.aboo.org/vscsihbav202.tgz. >>> >>> 1. I fixed the kmem_cache issue for sure. >>> 2. I think i got around with INIT_WORK ... Made the following >>> modifications ... >>> >> >> Perhaps you could get some of my scsi tools (e.g. >> sdparm and sg3_utils) and make sure that vscsihba >> can handle everything they can throw at it. >> If the user space doesn't support a SCSI command then >> your driver should fail gracefully (i.e. CHECK CONDITION, >> etc). >> >> Here is a worrying example: sdparm sends an INQUIRY >> and a couple of MODE SENSE(10) commands to a device. >> /dev/sda was created by your script: >> $ ./start_target.sh id=3 -files zz_lun0 >> >> $ sdparm /dev/sda >> /dev/sda: VirtualH VHD 0 >> <long wait> >> $ >> >> >> However dmesg showed this: >> >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> sd 0:0:0:0: SCSI error: return code = 0x00000002 >> end_request: I/O error, dev sda, sector 10240 >> Buffer I/O error on device sda, logical block 10240 >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> sd 0:0:0:0: SCSI error: return code = 0x00000002 >> end_request: I/O error, dev sda, sector 10240 >> Buffer I/O error on device sda, logical block 10240 >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> sd 0:0:0:0: SCSI error: return code = 0x00000002 >> 
end_request: I/O error, dev sda, sector 10240 >> Buffer I/O error on device sda, logical block 10240 >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> sd 0:0:0:0: SCSI error: return code = 0x00000002 >> end_request: I/O error, dev sda, sector 10240 >> Buffer I/O error on device sda, logical block 10240 >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> sd 0:0:0:0: SCSI error: return code = 0x00000002 >> end_request: I/O error, dev sda, sector 10240 >> Buffer I/O error on device sda, logical block 10240 >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> vscsihba:3: In Reset Device >> sd 0:0:0:0: SCSI error: return code = 0x00000002 >> end_request: I/O error, dev sda, sector 10240 >> Buffer I/O error on device sda, logical block 10240 >> BUG: at kernel/sched.c:3388 sub_preempt_count() >> [<e1bf029c>] scsitap_eh_abort+0x1c/0x90 [vscsihba] >> [<c024fe22>] scsi_error_handler+0x3e2/0xbe0 >> [<c02d74f1>] __sched_text_start+0x2f1/0x660 >> [<c024fa40>] scsi_error_handler+0x0/0xbe0 >> [<c0131679>] kthread+0xa9/0xe0 >> [<c01315d0>] kthread+0x0/0xe0 >> [<c0103d0f>] kernel_thread_helper+0x7/0x18 >> ======================= >> vscsihba:3: Abortng command serial number : 94 >> BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 >> [<c02d7684>] __sched_text_start+0x484/0x660 >> [<c013183b>] autoremove_wake_function+0x1b/0x50 >> [<c01264a8>] lock_timer_base+0x28/0x70 >> [<c01265f2>] __mod_timer+0x92/0xd0 >> [<c02d826b>] schedule_timeout+0x4b/0xd0 >> [<c01269c0>] process_timeout+0x0/0x10 >> [<c02d7bbc>] wait_for_completion_timeout+0x9c/0x130 >> [<c0119ee0>] default_wake_function+0x0/0x10 >> [<c024f3c9>] 
scsi_send_eh_cmnd+0x1b9/0x390 >> [<c011df3e>] vprintk+0x1fe/0x3a0 >> [<c024f805>] scsi_delete_timer+0x15/0x60 >> [<c024f624>] scsi_eh_tur+0x34/0xa0 >> [<c024fe69>] scsi_error_handler+0x429/0xbe0 >> [<c02d74f1>] __sched_text_start+0x2f1/0x660 >> [<c024fa40>] scsi_error_handler+0x0/0xbe0 >> [<c0131679>] kthread+0xa9/0xe0 >> [<c01315d0>] kthread+0x0/0xe0 >> [<c0103d0f>] kernel_thread_helper+0x7/0x18 >> ======================= >> vscsihba:3: Abortng command serial number : 95 >> vscsihba:3: In Reset Device >> BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 >> [<c02d7684>] __sched_text_start+0x484/0x660 >> [<c011df3e>] vprintk+0x1fe/0x3a0 >> [<c01264a8>] lock_timer_base+0x28/0x70 >> [<c01265f2>] __mod_timer+0x92/0xd0 >> [<c02d826b>] schedule_timeout+0x4b/0xd0 >> [<c01269c0>] process_timeout+0x0/0x10 >> [<c02d7bbc>] wait_for_completion_timeout+0x9c/0x130 >> [<c0119ee0>] default_wake_function+0x0/0x10 >> [<c024f3c9>] scsi_send_eh_cmnd+0x1b9/0x390 >> [<c024f805>] scsi_delete_timer+0x15/0x60 >> [<c024f624>] scsi_eh_tur+0x34/0xa0 >> [<e1bf00cd>] scsitap_eh_device_reset+0x1d/0x30 [vscsihba] >> [<c02503a8>] scsi_error_handler+0x968/0xbe0 >> [<c02d74f1>] __sched_text_start+0x2f1/0x660 >> [<c024fa40>] scsi_error_handler+0x0/0xbe0 >> [<c0131679>] kthread+0xa9/0xe0 >> [<c01315d0>] kthread+0x0/0xe0 >> [<c0103d0f>] kernel_thread_helper+0x7/0x18 >> ======================= >> vscsihba:3: Abortng command serial number : 96 >> vscsihba:3: In Reset Host >> BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 >> [<c02d7684>] __sched_text_start+0x484/0x660 >> [<c01264a8>] lock_timer_base+0x28/0x70 >> [<c01265f2>] __mod_timer+0x92/0xd0 >> [<c02d826b>] schedule_timeout+0x4b/0xd0 >> [<c01269c0>] process_timeout+0x0/0x10 >> [<c01273d5>] msleep+0x25/0x30 >> [<c024efb1>] scsi_try_host_reset+0xa1/0xd0 >> [<c0250150>] scsi_error_handler+0x710/0xbe0 >> [<c02d74f1>] __sched_text_start+0x2f1/0x660 >> [<c024fa40>] scsi_error_handler+0x0/0xbe0 >> [<c0131679>] kthread+0xa9/0xe0 >> 
[<c01315d0>] kthread+0x0/0xe0 >> [<c0103d0f>] kernel_thread_helper+0x7/0x18 >> ======================= >> BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 >> [<c02d7684>] __sched_text_start+0x484/0x660 >> [<c01264a8>] lock_timer_base+0x28/0x70 >> [<c01265f2>] __mod_timer+0x92/0xd0 >> [<c02d826b>] schedule_timeout+0x4b/0xd0 >> [<c01269c0>] process_timeout+0x0/0x10 >> [<c02d7bbc>] wait_for_completion_timeout+0x9c/0x130 >> [<c0119ee0>] default_wake_function+0x0/0x10 >> [<c024f3c9>] scsi_send_eh_cmnd+0x1b9/0x390 >> [<c02d74f1>] __sched_text_start+0x2f1/0x660 >> [<c01265f2>] __mod_timer+0x92/0xd0 >> [<c02d8272>] schedule_timeout+0x52/0xd0 >> [<c024f624>] scsi_eh_tur+0x34/0xa0 >> [<c02501a0>] scsi_error_handler+0x760/0xbe0 >> [<c02d74f1>] __sched_text_start+0x2f1/0x660 >> [<c024fa40>] scsi_error_handler+0x0/0xbe0 >> [<c0131679>] kthread+0xa9/0xe0 >> [<c01315d0>] kthread+0x0/0xe0 >> [<c0103d0f>] kernel_thread_helper+0x7/0x18 >> ======================= >> vscsihba:3: Abortng command serial number : 97 >> sd 0:0:0:0: scsi: Device offlined - not ready after error recovery >> BUG: scheduling while atomic: scsi_eh_0/0x00000001/4749 >> [<c02d7684>] __sched_text_start+0x484/0x660 >> [<c013aa11>] module_put+0x31/0x60 >> [<c024bd6e>] scsi_device_put+0x3e/0x40 >> [<c024be5f>] __scsi_iterate_devices+0x6f/0x90 >> [<c024fa86>] scsi_error_handler+0x46/0xbe0 >> [<c024fa40>] scsi_error_handler+0x0/0xbe0 >> [<c0131679>] kthread+0xa9/0xe0 >> [<c01315d0>] kthread+0x0/0xe0 >> [<c0103d0f>] kernel_thread_helper+0x7/0x18 >> ======================= >> >> >> Doug Gilbert >> > > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-21 9:53 ` Aboo Valappil @ 2007-01-21 11:24 ` Stefan Richter 2007-01-22 0:43 ` aboo 0 siblings, 1 reply; 29+ messages in thread From: Stefan Richter @ 2007-01-21 11:24 UTC (permalink / raw) To: Aboo Valappil; +Cc: dougg, linux-scsi Aboo Valappil wrote: > I actually uses the "openers" field in scsi_disk to find out if anyone > has the scsi_device open or not. There are several issues with this approach. - It will fail eventually because some day there may be other users of a LU than sd. How would sg, sr, st be accommodated? - The type struct scsi_disk is defined locally in sd.c (not somewhere in linux/include/scsi/) and you have to copy the struct definition in your hba.c. That's because struct scsi_disk is not part of any in-kernel API and you shouldn't use it in an LLD. If you really need to extend the LLD API, then do it explicitly by patching the SCSI core and its linux/include/scsi/ files, and do it as cleanly as possible. - You copied the comment "protected by BKL for now, yuck" on struct scsi_disk.openers, but you forgot to access openers under actual protection by BKL. I bet though that there are several more concurrency issues when poking in dev_get_drvdata(&sdev->sdev_gendev). (Is it actually still true that the BKL is taken when device files are opened and closed?) BTW, the comment on shost_for_each_device() in linux/include/scsi/scsi_device.h says "This loop takes a reference on each device and releases it at the end. If you break out of the loop, you must call scsi_device_put(sdev)." You forgot to do so. > Aboo Valappil wrote: >> I tried the sdparms and it failed not due to lack of sense code and >> status. What happened was that the user space SCSI target died due >> to a unsupported SCSI command (bug in user space target). When it >> crashed, the SCSI disk served by that user space target was opened >> by sdparms. 
The driver removed the scsi_host which was attached to >> user space target, thinking that the last registered user space part >> died. When the userspace server vanished, it is as if hot-pluggable hardware was removed. Your queuecommand hook, and probably the eh hooks and the shutdown paths too, should be aware of such hot removals and act accordingly. I haven't checked your code in detail but it seems you already take some precautions. More may be necessary or at least convenient, e.g. dequeuing and finishing all outstanding commands when a hot removal was detected. I can tell you that it is not exactly trivial to make Linux SCSI LLDs handle hot removal correctly. You probably should look at some other LLDs which have to deal with hot removal but (a) I don't guarantee you find elegant solutions and (b) each type of transport or interconnect has its own special requirements. On the bright side, if you get the hot removal handling right, you may be able to *completely avoid LLD API extensions* of the kind discussed above. Another more general note: You mentioned earlier that you suggest vscsihba for inclusion into mainline. Read the following texts in linux/Documentation: CodingStyle, SubmittingDrivers, SubmitChecklist. BTW, you could also write a minimalist version of the userspace counterpart to vscsihba and submit it as a file in linux/Documentation/scsi/ as a programming example along with the patch which adds vscsihba. -- Stefan Richter -=====-=-=== ---= =-=-= http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 29+ messages in thread
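The shost_for_each_device() contract Stefan cites can be demonstrated with a small self-contained refcount model. Everything below is stubbed for illustration (the real shost_for_each_device() and __scsi_iterate_devices() live in the SCSI core); the point is only that breaking out of the loop leaks the iterator's reference unless the caller does an explicit put.

```c
#include <assert.h>
#include <stddef.h>

#define NDEV 3
struct scsi_device { int refcount; int open; };
struct Scsi_Host   { struct scsi_device dev[NDEV]; };

/* Mimics __scsi_iterate_devices(): drops prev's reference, takes the next one. */
static struct scsi_device *next_device(struct Scsi_Host *h, struct scsi_device *prev)
{
    struct scsi_device *next;
    if (prev) {
        prev->refcount--;
        next = (prev - h->dev + 1 < NDEV) ? prev + 1 : NULL;
    } else {
        next = &h->dev[0];
    }
    if (next)
        next->refcount++;
    return next;
}

#define shost_for_each_device(sdev, shost) \
    for ((sdev) = next_device((shost), NULL); (sdev); (sdev) = next_device((shost), (sdev)))

static void scsi_device_put(struct scsi_device *sdev) { sdev->refcount--; }

/* Returns 1 if any device on the host is open. On an early break the
 * iterator still holds a reference on the current device, so it must
 * be released explicitly (the step the vscsihba driver was missing,
 * which is why the module could no longer be unloaded). */
static int host_has_open_device(struct Scsi_Host *shost)
{
    struct scsi_device *sdev;
    shost_for_each_device(sdev, shost) {
        if (sdev->open) {
            scsi_device_put(sdev);
            return 1;
        }
    }
    return 0;
}
```

After either path, every device's refcount is back to zero, which is the invariant the comment in scsi_device.h is describing.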
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-21 11:24 ` Stefan Richter @ 2007-01-22 0:43 ` aboo 2007-01-22 2:23 ` aboo 0 siblings, 1 reply; 29+ messages in thread From: aboo @ 2007-01-22 0:43 UTC (permalink / raw) To: Stefan Richter; +Cc: dougg, linux-scsi Hi Stefan, I understand, using scsi_disk is really ugly; in fact I knew that before. Are there no options without patching the kernel SCSI subsystem? In your last email, you explained such an approach. I really do not want to write a patch; I wanted to implement this within the existing SCSI infrastructure. I am also not knowledgeable enough to modify the SCSI subsystem with a patch, but I would love to do that with the guidance of people like you. You just pointed out one of the problems I had when it broke out of the loop (the module could not be unloaded and I was wondering why!). I really did not read those comments, but used the macro because of the comments in the Scsi_Host structure. Sorry, I just ignored the BKL without researching further :( Aboo On Sun, 21 Jan 2007 12:24:21 +0100, Stefan Richter <stefanr@s5r6.in-berlin.de> wrote: > Aboo Valappil wrote: >> I actually uses the "openers" field in scsi_disk to find out if anyone >> has the scsi_device open or not. > > There are several issues with this approach. > - It will fail eventually because some day there may be other users of > a LU than sd. How would sg, sr, st be accommodated? > - The type struct scsi_disk is defined locally in sd.c (not somewhere > in linux/include/scsi/) and you have to copy the struct definition in > your hba.c. That's because struct scsi_disk is not part of any in-kernel > API and you shouldn't use it in an LLD. If you really need to extend the > LLD API, then do it explicitly by patching the SCSI core and its > linux/include/scsi/ files, and do it as cleanly as possible. > - You copied the comment "protected by BKL for now, yuck" on struct > scsi_disk.openers, but you forgot to access openers under actual > protection by BKL. 
I bet though that there are several more concurrency > issues when poking in dev_get_drvdata(&sdev->sdev_gendev). > > (Is it actually still true that the BKL is taken when device files are > opened and closed?) > > BTW, the comment on shost_for_each_device() in > linux/include/scsi/scsi_device.h says "This loop takes a reference on > each device and releases it at the end. If you break out of the loop, > you must call scsi_device_put(sdev)." You forgot to do so. > >> Aboo Valappil wrote: >>> I tried the sdparms and it failed not due to lack of sense code and >>> status. What happened was that the user space SCSI target died due >>> to a unsupported SCSI command (bug in user space target). When it >>> crashed, the SCSI disk served by that user space target was opened >>> by sdparms. The driver removed the scsi_host which was attached to >>> user space target, thinking that the last registered user space part >>> died. > > When the userspace server vanished, it is as if hot-pluggable hardware > was removed. Your queuecommand hook, and probably the eh hooks and the > shutdown paths too, should be aware of such hot removals and act > accordingly. I haven't checked your code in detail but it seems you > already take some precautions. More may be necessary or at least > convenient, e.g. dequeuing and finishing all outstanding commands when a > hot removal was detected. > > I can tell you that it is not exactly trivial to make Linux SCSI LLDs > handle hot removal correctly. You probably should look at some other > LLDs which have to deal with hot removal but (a) I don't guarantee you > find elegant solutions and (b) each type of transport or interconnect > has its own special requirements. > > On the bright side, if you get the hot removal handling right, you may > be able to *completely avoid LLD API extensions* of the kind discussed > above. > > Another more general note: You mentioned earlier that you suggest > vscsihba for inclusion into mainline. 
Read the following texts in > linux/Documentation: CodingStyle, SubmittingDrivers, SubmitChecklist. > BTW, you could also write a minimalist version of the userspace > counterpart to vscsihba and submit it as a file in > linux/Documentation/scsi/ as a programming example along with the patch > which adds vscsihba. > -- > Stefan Richter > -=====-=-=== ---= =-=-= > http://arcgraph.de/sr/ ------------------------------------- Aboo.Org - Compliments From A & J :) ------------------------------------- ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-22 0:43 ` aboo @ 2007-01-22 2:23 ` aboo 2007-01-22 16:47 ` Stefan Richter 0 siblings, 1 reply; 29+ messages in thread From: aboo @ 2007-01-22 2:23 UTC (permalink / raw) To: Stefan Richter; +Cc: dougg, linux-scsi Hi Stefan Richter, Can I safely use the following method to know whether a scsi_device is open or not? if ( atomic_read(&sdev->sdev_gendev.kobj.kref.refcount) > 14 ) { //sdev is in use } As soon as the scsi_device is created and has passed through the 'sd' driver, it has 14 references (without anyone opening it). I need to go through the SCSI subsystem code in detail to find out who is taking all these references. I do not know how many references it is going to get when it passes through the st driver. Any ideas? Aboo On Mon, 22 Jan 2007 11:43:16 +1100, aboo <aboo@aboo.org> wrote: > Hi Stefan, > > I understand, using the scsi_disk is really ugrly, Infact I knew it > before. There are no options without patching the kernel SCSI sub system? > From your last email, you explained such an approach. I really do not want > to write a patch. I wanted to impliment this in existing SCSI > infrastruture). I am also not knowledgle enough to modify the SCSI > subsystem with a patch. But I love to do that with guidance of people like > you. > > You just pointed out one of the problems I had when it broke out of the > loop (The module could not be unloaded and I was wondering why!). I really > did not read those comments, but used the macro because of the comments in > Scsi_Host structure. > > Sorry, I just ignored BKL without researching further :( > > Aboo > > > On Sun, 21 Jan 2007 12:24:21 +0100, Stefan Richter > <stefanr@s5r6.in-berlin.de> wrote: >> Aboo Valappil wrote: >>> I actually uses the "openers" field in scsi_disk to find out if anyone >>> has the scsi_device open or not. >> >> There are several issues with this approach. 
>> - It will fail eventually because some day there may be other users of >> a LU than sd. How would sg, sr, st be accommodated? >> - The type struct scsi_disk is defined locally in sd.c (not somewhere >> in linux/include/scsi/) and you have to copy the struct definition in >> your hba.c. That's because struct scsi_disk is not part of any in-kernel >> API and you shouldn't use it in an LLD. If you really need to extend the >> LLD API, then do it explicitly by patching the SCSI core and its >> linux/include/scsi/ files, and do it as cleanly as possible. >> - You copied the comment "protected by BKL for now, yuck" on struct >> scsi_disk.openers, but you forgot to access openers under actual >> protection by BKL. I bet though that there are several more concurrency >> issues when poking in dev_get_drvdata(&sdev->sdev_gendev). >> >> (Is it actually still true that the BKL is taken when device files are >> opened and closed?) >> >> BTW, the comment on shost_for_each_device() in >> linux/include/scsi/scsi_device.h says "This loop takes a reference on >> each device and releases it at the end. If you break out of the loop, >> you must call scsi_device_put(sdev)." You forgot to do so. >> >>> Aboo Valappil wrote: >>>> I tried the sdparms and it failed not due to lack of sense code and >>>> status. What happened was that the user space SCSI target died due >>>> to a unsupported SCSI command (bug in user space target). When it >>>> crashed, the SCSI disk served by that user space target was opened >>>> by sdparms. The driver removed the scsi_host which was attached to >>>> user space target, thinking that the last registered user space part >>>> died. >> >> When the userspace server vanished, it is as if hot-pluggable hardware >> was removed. Your queuecommand hook, and probably the eh hooks and the >> shutdown paths too, should be aware of such hot removals and act >> accordingly. I haven't checked your code in detail but it seems you >> already take some precautions. 
More may be necessary or at least >> convenient, e.g. dequeuing and finishing all outstanding commands when a >> hot removal was detected. >> >> I can tell you that it is not exactly trivial to make Linux SCSI LLDs >> handle hot removal correctly. You probably should look at some other >> LLDs which have to deal with hot removal but (a) I don't guarantee you >> find elegant solutions and (b) each type of transport or interconnect >> has its own special requirements. >> >> On the bright side, if you get the hot removal handling right, you may >> be able to *completely avoid LLD API extensions* of the kind discussed >> above. >> >> Another more general note: You mentioned earlier that you suggest >> vscsihba for inclusion into mainline. Read the following texts in >> linux/Documentation: CodingStyle, SubmittingDrivers, SubmitChecklist. >> BTW, you could also write a minimalist version of the userspace >> counterpart to vscsihba and submit it as a file in >> linux/Documentation/scsi/ as a programming example along with the patch >> which adds vscsihba. >> -- >> Stefan Richter >> -=====-=-=== ---= =-=-= >> http://arcgraph.de/sr/ > > ------------------------------------- > Aboo.Org - Compliments From A & J :) > ------------------------------------- ------------------------------------- Aboo.Org - Compliments From A & J :) ------------------------------------- ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-22 2:23 ` aboo @ 2007-01-22 16:47 ` Stefan Richter 2007-01-22 16:58 ` Stefan Richter ` (2 more replies) 0 siblings, 3 replies; 29+ messages in thread From: Stefan Richter @ 2007-01-22 16:47 UTC (permalink / raw) To: aboo; +Cc: dougg, linux-scsi aboo wrote: > Can I use the following method safely to know if a scsi_device is > open or not? > > if ( atomic_read(&sdev->sdev_gendev.kobj.kref.refcount) > 14 ) { > //sdev is in use > } No, this too relies far too much on implementation details of upper layers. (Besides, what if the device is opened right after that? The atomic refcount is not enough, something mutex-like would be necessary to do anything useful with the information "open"/"not open".) Ideally, your LLD sticks with what the Linux SCSI mid-low API has to offer. Thus your LLD is only aware of this API, but *not* of implementation details of the SCSI core, let alone SCSI high-level drivers or block I/O subsystem or whatever other upper layer. And in the end, why should vscsihba care whether a scsi_device is in use or not? If a userspace device server quits or got killed or crashed, "simply" let vscsihba request the removal of the scsi_device (or the entire host if there is only one device per host). Whoever opened the device cannot do anything useful with it anymore anyway when there is no device server. Of course it is not entirely as "simple" as it sounds. As mentioned, if vscsihba becomes aware that a device server quit or crashed, let your queuecommand hook finish all newly incoming commands immediately instead of enqueueing them. Dequeue and finish all outstanding commands. Make sure the eh hooks don't wait for something that can't happen anymore. Note that when the removal of a device is requested, shutdown methods of high-level drivers like sd become active and may try to issue new commands (such as to synchronize disk caches). 
Therein lies potential for deadlocks or, less critically, for minutes and minutes spent in futile error recovery attempts. So, I said you should ignore the in-use state of a scsi_device. Of course that way you cannot give the userspace device server a status notification from vscsihba which says "keep running for now, somebody is using your device", or vice versa: "your last user went away, you can safely quit now if you feel like it". But in my opinion you don't really need such status notification in foreseeable future. vscsihba would primarily or exclusively be used in controlled setups where the administrator knows very well when it is safe to terminate a userspace device server. Besides, you have to take into account anyway that a userspace device server is killed or crashed when its device was in use. As I wrote before, deal with it like with hot-unplug. A kernel driver cannot prevent the user from pulling a cable. -- Stefan Richter -=====-=-=== ---= =-==- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-22 16:47 ` Stefan Richter @ 2007-01-22 16:58 ` Stefan Richter 2007-01-22 18:07 ` James Bottomley 2007-01-23 13:11 ` Aboo Valappil 2 siblings, 0 replies; 29+ messages in thread From: Stefan Richter @ 2007-01-22 16:58 UTC (permalink / raw) To: aboo; +Cc: dougg, linux-scsi > A kernel driver cannot prevent the user from pulling a cable. At least I hope so. It's ultimately impossible to be sure about that. -- Stefan Richter -=====-=-=== ---= =-==- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-22 16:47 ` Stefan Richter 2007-01-22 16:58 ` Stefan Richter @ 2007-01-22 18:07 ` James Bottomley 2007-01-23 13:11 ` Aboo Valappil 2 siblings, 0 replies; 29+ messages in thread From: James Bottomley @ 2007-01-22 18:07 UTC (permalink / raw) To: Stefan Richter; +Cc: aboo, dougg, linux-scsi On Mon, 2007-01-22 at 17:47 +0100, Stefan Richter wrote: > As I wrote before, deal with it like with hot-unplug. A kernel driver > cannot prevent the user from pulling a cable. This is exactly the correct advice. The hotplug model requires that you own a reference on the resources you want to access. When you're done, you release the reference and the last person to drop the reference causes the object to be freed. For scsi_hosts, every open device keeps a reference on the host, so the way to destroy one is to do a scsi_remove_host() which releases your interest in the host object, but keeps it around long enough to tidy up after it. For physical drivers, your queuecommand() routine may still be required to handle the odd stray command after the removal notice has been issued, however, by and large SCSI begins rejecting I/O to the devices higher up. When the last device is closed, it will release the last reference to the host and trigger final cleanup. James ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-22 16:47 ` Stefan Richter 2007-01-22 16:58 ` Stefan Richter 2007-01-22 18:07 ` James Bottomley @ 2007-01-23 13:11 ` Aboo Valappil 2007-01-23 16:36 ` Randy Dunlap ` (2 more replies) 2 siblings, 3 replies; 29+ messages in thread From: Aboo Valappil @ 2007-01-23 13:11 UTC (permalink / raw) To: Stefan Richter; +Cc: dougg, linux-scsi Hi Stefan Richter, Thanks everyone for their advice on this. As per your advice, when the last user space target serving the scsi_host quits, the queue command will do the following on the new commands coming through. sc->result = DID_NO_CONNECT << 16; sc->resid = sc->request_bufflen; set_sensedata_commfailure(sc); --------------------- This sets the sense buffer with Device Not ready/Logical Unit Communication failure. done(sc); The scsi_host will remain in the kernel. Let the EH thread handle the queued commands (if any). If the user target wants to reconnect to the same scsi_host, it can do so (just re-run the user space target again with the same command line parameters). This connection from the newly started target will make the HBA healthy again and start serving IO. I implemented a new IOCTL to remove this scsi_host if the user process really needs to. This removal will first finish all the SCSI commands (with the above status results) queued on the scsi_host (if at all) and then remove the scsi_host. Also the module unload will delete all the scsi_hosts created after finishing all the commands queued with the above status and sense information. I also implemented passing of sense code information from user space to sense_buffer. A little more work needs to be done on this. Also, I need to make sure that all the locking used inside is correctly implemented to prevent deadlocks and improve efficiency. 
The new version is available http://vscsihba.aboo.org/vscsihbav204.gz Aboo Stefan Richter wrote: > aboo wrote: > >> Can I use the following method safely to know if a scsi_device is >> open or not? >> >> if ( atomic_read(&sdev->sdev_gendev.kobj.kref.refcount) > 14 ) { >> //sdev is in use >> } >> > > No, this too relies far too much on implementation details of upper > layers. (Besides, what if the device is opened right after that? The > atomic refcount is not enough, something mutex-like would be necessary > to do anything useful with the information "open"/"not open".) Ideally, > your LLD sticks with what the Linux SCSI mid-low API has to offer. Thus > your LLD is only aware of this API, but *not* of implementation details > of the SCSI core, let alone SCSI high-level drivers or block I/O > subsystem or whatever other upper layer. > > And in the end, why should vscsihba care whether a scsi_device is in use > or not? If a userspace device server quits or got killed or crashed, > "simply" let vscsihba request the removal of the scsi_device (or the > entire host if there is only one device per host). Whoever opened the > device cannot do anything useful with it anymore anyway when there is no > device server. > > Of course it is not entirely as "simple" as it sounds. As mentioned, if > vscsihba becomes aware that a device server quit or crashed, let your > queuecommand hook finish all newly incoming commands immediately instead > of enqueueing them. Dequeue and finish all outstanding commands. Make > sure the eh hooks don't wait for something that can't happen anymore. > Note that when the removal of a device is requested, shutdown methods of > high-level drivers like sd become active and may try to issue new > commands (such as to synchronize disk caches). Therein lies potential > for deadlocks or, less critically, for minutes and minutes spent in > futile error recovery attempts. > > So, I said you should ignore the in-use state of a scsi_device. 
Of > course that way you cannot give the userspace device server a status > notification from vscsihba which says "keep running for now, somebody is > using your device", or vice versa: "your last user went away, you can > safely quit now if you feel like it". But in my opinion you don't really > need such status notification in foreseeable future. vscsihba would > primarily or exclusively be used in controlled setups where the > administrator knows very well when it is safe to terminate a userspace > device server. Besides, you have to take into account anyway that a > userspace device server is killed or crashed when its device was in use. > > As I wrote before, deal with it like with hot-unplug. A kernel driver > cannot prevent the user from pulling a cable. > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-23 13:11 ` Aboo Valappil @ 2007-01-23 16:36 ` Randy Dunlap 2007-01-23 17:22 ` Stefan Richter 2007-01-23 17:16 ` Stefan Richter 2007-01-24 3:24 ` Douglas Gilbert 2 siblings, 1 reply; 29+ messages in thread From: Randy Dunlap @ 2007-01-23 16:36 UTC (permalink / raw) To: Aboo Valappil; +Cc: Stefan Richter, dougg, linux-scsi On Wed, 24 Jan 2007 00:11:47 +1100 Aboo Valappil wrote: > Hi Stefan Richter, > > Thanks everyone for their advice on this. As per your advice, I did the > following when the last user space target serving the scsi_host quits, > the queue command will do the following on the new commands coming through. > > sc->result = DID_NO_CONNECT << 16; > sc->resid = sc->request_bufflen; > set_sensedata_commfailure(sc); --------------------- > This sets the sense buffer with Device Not ready/Logical Unit > Commincation failure. > done(sc); > > The scsi_host will remain in the kernel. Let the EH thread handle the > queued commands (If any). If the user target wants to reconnects to the > same scsi_host, it can do so (Just re-run the user space target again > with same command line paramters). This connection from newly started > target will make the HBA healthy again and start serving IO. > > I implemented a new IOCTL to remove this scsi_host if the user > process really needs to. This removal will first finish all the SCSI > commands (With the above status results) queued on the scsi_host (If at > all) and then remove the scsi_host. Also the module unload will delete > all the scsi_hosts created after finishing all the commands queued with > the above status and sense information. > > I also implemented passing of sense code information from user space to > sense_buffer. A little more work needs to be done on this. > Also, I need to make sure that all the locking used inside is correctly > implemented to prevent dead locks and improve efficiency. 
> > The new version is available http://vscsihba.aboo.org/vscsihbav204.gz 404: NOT FOUND --- ~Randy ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-23 16:36 ` Randy Dunlap @ 2007-01-23 17:22 ` Stefan Richter 2007-01-24 9:47 ` Aboo Valappil 2007-01-25 22:02 ` Aboo Valappil 0 siblings, 2 replies; 29+ messages in thread From: Stefan Richter @ 2007-01-23 17:22 UTC (permalink / raw) To: Randy Dunlap; +Cc: Aboo Valappil, dougg, linux-scsi Randy Dunlap wrote: > Aboo Valappil wrote: >> The new version is available http://vscsihba.aboo.org/vscsihbav204.gz > > 404: NOT FOUND .gz -> .tgz Besides the tarball, a browsable source tree would be nice for people who just want to take a quick look. -- Stefan Richter -=====-=-=== ---= =-=== http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-23 17:22 ` Stefan Richter @ 2007-01-24 9:47 ` Aboo Valappil 2007-01-25 22:02 ` Aboo Valappil 1 sibling, 0 replies; 29+ messages in thread From: Aboo Valappil @ 2007-01-24 9:47 UTC (permalink / raw) To: Stefan Richter; +Cc: Randy Dunlap, dougg, linux-scsi Stefan Richter wrote: > Randy Dunlap wrote: > >> Aboo Valappil wrote: >> >>> The new version is available http://vscsihba.aboo.org/vscsihbav204.gz >>> >> 404: NOT FOUND >> > > .gz -> .tgz > > Besides the tarball, a browsable source tree would be nice for people > who just want to take a quick look. > The source tree is available through the following URL. http://vscsihba.aboo.org/vscsihbav204/ Aboo ^ permalink raw reply [flat|nested] 29+ messages in thread
* Linux Virtual SCSI HBAs and Virtual disks 2007-01-23 17:22 ` Stefan Richter 2007-01-24 9:47 ` Aboo Valappil @ 2007-01-25 22:02 ` Aboo Valappil 1 sibling, 0 replies; 29+ messages in thread From: Aboo Valappil @ 2007-01-25 22:02 UTC (permalink / raw) To: Stefan Richter; +Cc: Randy Dunlap, dougg, linux-scsi Hi All, Thanks for all your encouragement and help on this project. I would like to take this project one step ahead. I hope you can help me on this. As you may have noticed, I am doing a copy of buffers (request_buffer) between user space and kernel space. What are my options these days to make a user buffer access inside the kernel space and vice versa? Please also note that request_buffer is not a linear buffer, it is an sg list :) One option I researched was mmap. I am facing an issue here. I will try to explain. Everything seems to be working, but a big page table manipulation issue is found. When a SCSI read comes in for a device (for eg: dd if=/dev/sda of=/dev/null count=1), a read is always accompanied by a write (I do not do this, but the kernel does it!). I think the pages of "request_buffer" will get mapped to virtual memory of the "dd" process. When I memmap request_buffer (through the mmap_nopage method) to the "user space SCSI target", the same page of the "dd" process gets mapped to the "user space SCSI target". Since the operation is a read, the "user space target" modifies the page, making it dirty as far as the "page cache" is concerned. Of course, "dd" gets what it wants from the read. But as soon as "dd" closes "/dev/sda", the page cache knows that one of the pages belonging to "/dev/sda" is dirty and needs to be flushed to disk! Now, the page cache issues a SCSI write for the same page through the SCSI driver. This makes all the SCSI reads accompanied by a SCSI write. This is the closest explanation I can come up with for the behaviour. Whether this explanation is correct or not, it is happening (a SCSI read followed by a SCSI write). 
If this has to work properly, I need to find a way for the "memmap/user space target" to modify the pages silently without letting the page cache know about it. There could be some ugly page table manipulation. But I am wondering if this can be achieved easily or through some other way. One of my friends also told me about the splice option.... Any thoughts on this? Aboo ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-23 13:11 ` Aboo Valappil 2007-01-23 16:36 ` Randy Dunlap @ 2007-01-23 17:16 ` Stefan Richter 2007-01-23 22:12 ` Aboo Valappil 2007-01-24 3:24 ` Douglas Gilbert 2 siblings, 1 reply; 29+ messages in thread From: Stefan Richter @ 2007-01-23 17:16 UTC (permalink / raw) To: Aboo Valappil; +Cc: dougg, linux-scsi Aboo Valappil wrote: > I implemented a new IOCTL to remove this scsi_host if the user > process really needs to. This removal will first finish all the SCSI > commands (With the above status results) queued on the scsi_host (If at > all) and then remove the scsi_host. Also the module unload will delete > all the scsi_hosts created after finishing all the commands queued with > the above status and sense information. This is a valid approach, but probably more useful would be something like: - userspace device server or "modprobe -r" or procfs/sysfs magic or whatever else requests removal of a Scsi_Host (or merely of a single scsi_device), - vscsihba enters scsi_remove_host() or scsi_remove_device(), - SCSI core and upper layers do whatever it takes to withdraw from the respective I-T(-L) nexus gracefully (e.g. synchronize cache, unlock drive door...), - userspace device server handles any previously remaining commands and the new shutdown commands like intended, - SCSI core and upper layers are finished with their business, the respective Scsi_Host or scsi_device does not exist anymore now, vscsihba leaves scsi_remove_host() or scsi_remove_device(), - vscsihba tells userspace device server somehow that there will be no further requests, ever. That way, your "virtual" device server is exposed to everything which a "real" device server would be too when a Linux initiator shuts the connection down. Could be interesting to testbed device servers as well as to userspace bridges to "real" device servers. 
When implemented, the "graceful shutdown" path should look almost exactly like the opposite of the start-up path. The "hot-unplug" path looks a little different because vscsihba has to go through that path without assistance of the userspace server. Ideally, the "hot-unplug" path would actually be the "graceful shutdown" path plus a few little extra measures to account for premature absence of the device server. [Of course, I'm saying all this without ever having designed a Linux SCSI LLD myself, only from the background of maintaining an LLD written by other people...] -- Stefan Richter -=====-=-=== ---= =-=== http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-23 17:16 ` Stefan Richter @ 2007-01-23 22:12 ` Aboo Valappil 2007-01-24 0:09 ` Stefan Richter 0 siblings, 1 reply; 29+ messages in thread From: Aboo Valappil @ 2007-01-23 22:12 UTC (permalink / raw) To: Stefan Richter; +Cc: dougg, linux-scsi Stefan Richter wrote: > Aboo Valappil wrote: > >> I implemented a new IOCTL to remove this scsi_host if the user >> process really needs to. This removal will first finish all the SCSI >> commands (With the above status results) queued on the scsi_host (If at >> all) and then remove the scsi_host. Also the module unload will delete >> all the scsi_hosts created after finishing all the commands queued with >> the above status and sense information. >> > > This is a valid approach, but probably more useful would be something like: > - userspace device server or "modprobe -r" or procfs/sysfs magic or > whatever else requests removal of a Scsi_Host (or merely of a single > scsi_device), > - vscsihba enters scsi_remove_host() or scsi_remove_device(), > - SCSI core and upper layers do whatever it takes to withdraw from > the respective I-T(-L) nexus gracefully (e.g. synchronize cache, > unlock drive door...), > Does this happen automatically when the scsi_remove_host() is called, or do I have to explicitly tell the upper layers to start shutting down gracefully? ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-23 22:12 ` Aboo Valappil @ 2007-01-24 0:09 ` Stefan Richter 0 siblings, 0 replies; 29+ messages in thread From: Stefan Richter @ 2007-01-24 0:09 UTC (permalink / raw) To: Aboo Valappil; +Cc: dougg, linux-scsi Aboo Valappil wrote: > Stefan Richter wrote: ... >> - vscsihba enters scsi_remove_host() or scsi_remove_device(), >> - SCSI core and upper layers do whatever it takes to withdraw from >> the respective I-T(-L) nexus gracefully (e.g. synchronize cache, >> unlock drive door...), >> > Does this happen automatically when the scsi_remove_host() is called, Yes. The SCSI core passes the remove/shutdown request up to the higher layers like sd_mod for orderly shutdown. There is no distinction between regular shutdown and hot-unplug in SCSI core or above; these layers treat everything as regular shutdown. Only the LLD can (and has to) make this distinction. -- Stefan Richter -=====-=-=== ---= ==--- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-23 13:11 ` Aboo Valappil 2007-01-23 16:36 ` Randy Dunlap 2007-01-23 17:16 ` Stefan Richter @ 2007-01-24 3:24 ` Douglas Gilbert 2007-01-24 9:40 ` Aboo Valappil 2007-01-25 21:41 ` Aboo Valappil 2 siblings, 2 replies; 29+ messages in thread From: Douglas Gilbert @ 2007-01-24 3:24 UTC (permalink / raw) To: Aboo Valappil; +Cc: Stefan Richter, linux-scsi Aboo Valappil wrote: > Hi Stefan Richter, > > Thanks everyone for their advice on this. As per your advice, I did the > following when the last user space target serving the scsi_host quits, > the queue command will do the following on the new commands coming through. > > sc->result = DID_NO_CONNECT << 16; > sc->resid = sc->request_bufflen; > set_sensedata_commfailure(sc); --------------------- > This sets the sense buffer with Device Not ready/Logical Unit > Commincation failure. > done(sc); > > The scsi_host will remain in the kernel. Let the EH thread handle the > queued commands (If any). If the user target wants to reconnects to the > same scsi_host, it can do so (Just re-run the user space target again > with same command line paramters). This connection from newly started > target will make the HBA healthy again and start serving IO. > > I implemented a new IOCTL to remove this scsi_host if the user > process really needs to. This removal will first finish all the SCSI > commands (With the above status results) queued on the scsi_host (If at > all) and then remove the scsi_host. Also the module unload will delete > all the scsi_hosts created after finishing all the commands queued with > the above status and sense information. > > I also implemented passing of sense code information from user space to > sense_buffer. A little more work needs to be done on this. > Also, I need to make sure that all the locking used inside is correctly > implemented to prevent dead locks and improve efficiency. 
> > The new version is available http://vscsihba.aboo.org/vscsihbav204.gz A few observations from testing this version: # ./start_target.sh id=3 -files ../../zz_lun0 -v # lsscsi [0:0:0:0] disk Linux scsi_debug 0004 /dev/sda [1:0:0:0] disk VirtualH VHD 0 /dev/sdb So "id=3" doesn't look the target identifier. If not, what is it? Here is an attempt to fetch the Read Write Error Recovery mode page: # sdparm -p rw -vv /dev/sg1 inquiry cdb: 12 00 00 00 24 00 /dev/sg1: VirtualH VHD 0 mode sense (10) cdb: 5a 00 01 00 00 00 00 00 08 00 mode sense (10): Probably uninitialized data. Try to view as SCSI-1 non-extended sense: AdValid=0 Error class=0 Error code=0 >> Read write error recovery mode page [0x1] failed That implies a sense buffer full of zeroes. The debug output from start_target.sh associated with that attempt: SCSI cmd Lun=00 id=2D CDB=12 00 00 00 24 00 00 00 08 00 00 00 00 00 00 00 SCSI cmd Lun=00 id=2D completed, status=0 SCSI cmd Lun=00 id=2E CDB=5A 00 01 00 00 00 00 00 08 00 00 00 00 00 00 00 SCSI cmd Lun=00 id=2E completed, status=2 SCSI cmd Lun=00 id=2F CDB=03 00 00 00 FC 00 00 00 08 00 00 00 00 00 00 00 SCSI cmd Lun=00 id=2F completed, status=0 SCSI cmd Lun=00 id=30 CDB=00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 SCSI cmd Lun=00 id=30 completed, status=0 So that is an INQUIRY [expected], MODE SENSE(10) [expected], REQUEST SENSE [what, no autosense??] and TEST UNIT READY [ah oh, error recovery??] sequence. Perhaps you could examine the way scsi_debug (or most other LLDs) does autosense. This modern technique (used for about the last 12 years) relieves the scsi midlevel of having to send a follow up REQUEST SENSE. It would be easier to read those SCSI commands in the debug output if they were trimmed to their actual lengths (e.g. the INQUIRY is 12 00 00 00 24 00). Doug Gilbert ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-24 3:24 ` Douglas Gilbert @ 2007-01-24 9:40 ` Aboo Valappil 2007-01-25 21:41 ` Aboo Valappil 1 sibling, 0 replies; 29+ messages in thread From: Aboo Valappil @ 2007-01-24 9:40 UTC (permalink / raw) To: dougg; +Cc: Stefan Richter, linux-scsi Douglas Gilbert wrote: > Aboo Valappil wrote: > >> Hi Stefan Richter, >> >> Thanks everyone for their advice on this. As per your advice, I did the >> following when the last user space target serving the scsi_host quits, >> the queue command will do the following on the new commands coming through. >> >> sc->result = DID_NO_CONNECT << 16; >> sc->resid = sc->request_bufflen; >> set_sensedata_commfailure(sc); --------------------- >> This sets the sense buffer with Device Not ready/Logical Unit >> Commincation failure. >> done(sc); >> >> The scsi_host will remain in the kernel. Let the EH thread handle the >> queued commands (If any). If the user target wants to reconnects to the >> same scsi_host, it can do so (Just re-run the user space target again >> with same command line paramters). This connection from newly started >> target will make the HBA healthy again and start serving IO. >> >> I implemented a new IOCTL to remove this scsi_host if the user >> process really needs to. This removal will first finish all the SCSI >> commands (With the above status results) queued on the scsi_host (If at >> all) and then remove the scsi_host. Also the module unload will delete >> all the scsi_hosts created after finishing all the commands queued with >> the above status and sense information. >> >> I also implemented passing of sense code information from user space to >> sense_buffer. A little more work needs to be done on this. >> Also, I need to make sure that all the locking used inside is correctly >> implemented to prevent dead locks and improve efficiency. 
>> >> The new version is available http://vscsihba.aboo.org/vscsihbav204.gz >> > > A few observations from testing this version: > > # ./start_target.sh id=3 -files ../../zz_lun0 -v > # lsscsi > [0:0:0:0] disk Linux scsi_debug 0004 /dev/sda > [1:0:0:0] disk VirtualH VHD 0 /dev/sdb > > So "id=3" doesn't look the target identifier. If not, what > is it? > This is just an identification for the scsi_host created inside the kernel. If you re-run the same command again with same id, the new process would attach to the same scsi_host. This can be seen as two user process serving the same virtual host bus adapter. If use a different id with the same lun file, It will create a new scsi_host and it would appear as the same LUN on a different host bus adapter. . It is not the target. > Here is an attempt to fetch the Read Write Error Recovery > mode page: > # sdparm -p rw -vv /dev/sg1 > inquiry cdb: 12 00 00 00 24 00 > /dev/sg1: VirtualH VHD 0 > mode sense (10) cdb: 5a 00 01 00 00 00 00 00 08 00 > mode sense (10): Probably uninitialized data. > Try to view as SCSI-1 non-extended sense: > AdValid=0 Error class=0 Error code=0 > > >>> Read write error recovery mode page [0x1] failed >>> > > > That implies a sense buffer full of zeroes. The debug > output from start_target.sh associated with that attempt: > > SCSI cmd Lun=00 id=2D CDB=12 00 00 00 24 00 00 00 08 00 00 00 00 00 00 00 > SCSI cmd Lun=00 id=2D completed, status=0 > SCSI cmd Lun=00 id=2E CDB=5A 00 01 00 00 00 00 00 08 00 00 00 00 00 00 00 > SCSI cmd Lun=00 id=2E completed, status=2 > SCSI cmd Lun=00 id=2F CDB=03 00 00 00 FC 00 00 00 08 00 00 00 00 00 00 00 > SCSI cmd Lun=00 id=2F completed, status=0 > SCSI cmd Lun=00 id=30 CDB=00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 > SCSI cmd Lun=00 id=30 completed, status=0 > > So that is an INQUIRY [expected], MODE SENSE(10) [expected], > REQUEST SENSE [what, no autosense??] and TEST UNIT READY > [ah oh, error recovery??] sequence. 
> > Perhaps you could examine the way scsi_debug (or most > other LLDs) does autosense. This modern technique (used > for about the last 12 years) relieves the scsi midlevel > of having to send a follow up REQUEST SENSE. > I shall look in scsi_debug and see how it handles the autosense. What if the scsi target does not give any sense information with a non-zero SCSI response? Can I just make one up? I have modified it to give a sense of SK=05 (Illegal request), ASC=20 (Invalid command Op code). Maybe looking at scsi_debug will give an idea. > It would be easier to read those SCSI commands in the > debug output if they were trimmed to their actual lengths > (e.g. the INQUIRY is 12 00 00 00 24 00). > I will make it look better. > Doug Gilbert > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-24 3:24 ` Douglas Gilbert 2007-01-24 9:40 ` Aboo Valappil @ 2007-01-25 21:41 ` Aboo Valappil 2007-01-25 22:01 ` Stefan Richter 1 sibling, 1 reply; 29+ messages in thread From: Aboo Valappil @ 2007-01-25 21:41 UTC (permalink / raw) To: dougg; +Cc: Stefan Richter, linux-scsi Hi Doug Gilbert, I am not sure if my previous email was received or not. > # ./start_target.sh id=3 -files ../../zz_lun0 -v > # lsscsi > [0:0:0:0] disk Linux scsi_debug 0004 /dev/sda > [1:0:0:0] disk VirtualH VHD 0 /dev/sdb > > So "id=3" doesn't look the target identifier. If not, what > is it? > > Actually it is not the target number, it can have only one target. This is just an identification for the scsi_host created inside the kernel. If you re-run the same command again with the same id, the new process would attach to the same scsi_host. This can be seen as two user processes serving the same virtual host bus adapter. If you use a different id with the same lun file, it will create a new scsi_host and it would appear as the same LUN on a different host bus adapter. > Perhaps you could examine the way scsi_debug (or most > other LLDs) does autosense. This modern technique (used > for about the last 12 years) relieves the scsi midlevel > of having to send a follow up REQUEST SENSE. > What would I do if the SCSI target does not return any sense code? Do I make something up or just return no sense? In the new version, I made up SK="Illegal request", ASC="Invalid Opcode" if I do not receive any sense from the target. But I think I should return no sense, i.e. SK=0, ASC=0, ASCQ=0. Any thoughts? > It would be easier to read those SCSI commands in the > debug output if they were trimmed to their actual lengths > (e.g. the INQUIRY is 12 00 00 00 24 00). > I will fix this. Aboo ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Linux Virtual SCSI HBAs and Virtual disks 2007-01-25 21:41 ` Aboo Valappil @ 2007-01-25 22:01 ` Stefan Richter 0 siblings, 0 replies; 29+ messages in thread From: Stefan Richter @ 2007-01-25 22:01 UTC (permalink / raw) To: Aboo Valappil; +Cc: dougg, linux-scsi Aboo Valappil wrote: > What would i do if the SCSI target does not return any sense code? I > make some thing up or just return no-sense? I guess it depends on whether the in-kernel part of your software stack actually knows enough details to generate correct sense codes. > In the new version, I made up SK="Illegal request", ASC="Invalid > Opcode" if I do not receive any sense from target. But I think I > should return no sense, ie SK=0, ASC=0, ASCQ=0. I think ASC/ASCQ = 0/0 = "no additional sense information" would indeed be correct on such occasions. -- Stefan Richter -=====-=-=== ---= ==--= http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2007-01-25 22:03 UTC | newest] Thread overview: 29+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-01-16 10:22 Linux Virtual SCSI HBAs and Virtual disks Aboo Valappil 2007-01-16 21:52 ` Erik Mouw 2007-01-16 23:01 ` aboo 2007-01-17 1:50 ` Douglas Gilbert 2007-01-17 8:36 ` Stefan Richter 2007-01-17 10:24 ` Aboo Valappil 2007-01-17 22:20 ` Douglas Gilbert 2007-01-17 21:59 ` aboo 2007-01-18 0:38 ` Stefan Richter 2007-01-21 9:48 ` Aboo Valappil 2007-01-21 9:53 ` Aboo Valappil 2007-01-21 11:24 ` Stefan Richter 2007-01-22 0:43 ` aboo 2007-01-22 2:23 ` aboo 2007-01-22 16:47 ` Stefan Richter 2007-01-22 16:58 ` Stefan Richter 2007-01-22 18:07 ` James Bottomley 2007-01-23 13:11 ` Aboo Valappil 2007-01-23 16:36 ` Randy Dunlap 2007-01-23 17:22 ` Stefan Richter 2007-01-24 9:47 ` Aboo Valappil 2007-01-25 22:02 ` Aboo Valappil 2007-01-23 17:16 ` Stefan Richter 2007-01-23 22:12 ` Aboo Valappil 2007-01-24 0:09 ` Stefan Richter 2007-01-24 3:24 ` Douglas Gilbert 2007-01-24 9:40 ` Aboo Valappil 2007-01-25 21:41 ` Aboo Valappil 2007-01-25 22:01 ` Stefan Richter