Linux SCSI subsystem development


* [PATCH] [SCSI] Introduce scsi_req_abort_cmd (REPOST)
From: Luben Tuikov @ 2006-04-19 18:59 UTC (permalink / raw)
  To: linux-scsi, linux-ide; +Cc: Tejun Heo

Introduce scsi_req_abort_cmd(struct scsi_cmnd *).
This function requests that SCSI Core start recovery for the
command by deleting the timer and adding the command to the eh
queue.  It can be called by either LLDDs or SCSI Core.  LLDDs who
implement their own error recovery MAY ignore the timeout event if
they generated scsi_req_abort_cmd.

First post:
http://marc.theaimsgroup.com/?l=linux-scsi&m=113833937421677&w=2

Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>

---

 drivers/scsi/scsi.c      |   18 ++++++++++++++++++
 include/scsi/scsi_cmnd.h |    1 +
 2 files changed, 19 insertions(+), 0 deletions(-)

51df19a1669bd502b536178d6c294e68be25ce79
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 245ca99..1af9795 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -721,6 +721,24 @@ void scsi_init_cmd_from_req(struct scsi_
 static DEFINE_PER_CPU(struct list_head, scsi_done_q);
 
 /**
+ * scsi_req_abort_cmd - Request command recovery for the specified command
+ * @cmd: pointer to the SCSI command of interest
+ *
+ * This function requests that SCSI Core start recovery for the
+ * command by deleting the timer and adding the command to the eh
+ * queue.  It can be called by either LLDDs or SCSI Core.  LLDDs who
+ * implement their own error recovery MAY ignore the timeout event if
+ * they generated scsi_req_abort_cmd.
+ */
+void scsi_req_abort_cmd(struct scsi_cmnd *cmd)
+{
+	if (!scsi_delete_timer(cmd))
+		return;
+	scsi_times_out(cmd);
+}
+EXPORT_SYMBOL(scsi_req_abort_cmd);
+
+/**
  * scsi_done - Enqueue the finished SCSI command into the done queue.
  * @cmd: The SCSI Command for which a low-level device driver (LLDD) gives
  * ownership back to SCSI Core -- i.e. the LLDD has finished with it.
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index 7529f43..8b9ad8c 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -151,5 +151,6 @@ extern struct scsi_cmnd *scsi_get_comman
 extern void scsi_put_command(struct scsi_cmnd *);
 extern void scsi_io_completion(struct scsi_cmnd *, unsigned int, unsigned int);
 extern void scsi_finish_command(struct scsi_cmnd *cmd);
+extern void scsi_req_abort_cmd(struct scsi_cmnd *cmd);
 
 #endif /* _SCSI_SCSI_CMND_H */
-- 
1.3.0.ga809
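
As a rough illustration of the guard in the patch above, the following userspace model mimics its shape: recovery is requested only if the command's timer could still be deleted, so a second request for the same command does nothing. All names here (model_cmd, model_delete_timer, model_times_out, model_req_abort_cmd) are hypothetical stand-ins for illustration, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scsi_cmnd; only what the model needs. */
struct model_cmd {
	bool timer_pending;	/* true while the timeout timer is armed */
	int eh_queued;		/* how many times the command reached the eh queue */
};

/* Models scsi_delete_timer(): nonzero only if a pending timer was removed. */
static int model_delete_timer(struct model_cmd *cmd)
{
	if (!cmd->timer_pending)
		return 0;
	cmd->timer_pending = false;
	return 1;
}

/* Models scsi_times_out(): hand the command over to error handling. */
static void model_times_out(struct model_cmd *cmd)
{
	cmd->eh_queued++;
}

/* Same shape as the function in the patch: delete the timer, then queue. */
void model_req_abort_cmd(struct model_cmd *cmd)
{
	assert(cmd != NULL);
	if (!model_delete_timer(cmd))
		return;
	model_times_out(cmd);
}

/* Requests an abort twice and reports how often the command was queued. */
int model_double_abort(void)
{
	struct model_cmd cmd = { .timer_pending = true, .eh_queued = 0 };

	model_req_abort_cmd(&cmd);
	model_req_abort_cmd(&cmd);	/* timer already gone: nothing happens */
	return cmd.eh_queued;
}
```

In this model the command is queued for recovery exactly once, which is the property the timer check is there to provide.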



* Re: [PATCH 1/2] SCSI: implement scsi_eh_schedule_cmd()
From: Luben Tuikov @ 2006-04-19 18:49 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Patrick Mansfield, Jeff Garzik, hch, James.Bottomley, alan,
	albertcc, arjan, linux-ide, linux-scsi
In-Reply-To: <443F8F41.1060002@gmail.com>

Hi Tejun,

--- Tejun Heo <htejun@gmail.com> wrote:
> So, what's your suggestion here?  Do you think libata should do such 
> things with its own mechanism?

Your instinct is correct.  Anything between I and T, as in I_T, is SDS
domain.  That is, SAM doesn't have a _context_ for anything between the I and T.
(In SATA's case I_T_L, of course.)

So this is why you shouldn't call SCSI's ER without context, as in "Hey,
neither command, nor device is broken, but do some ER for me."  Such work belongs
in your SATA Layer, unless either a device or command is involved, as we discussed.

Handle protocol ER in your SATA layer.  Note, however, that most of your ER could
be done on behalf of a command or device, since a device or command(s) will
end up being affected anyway.  Basically, the "do as little as possible, but no less" technique.

Good luck!
     Luben


* RE: aacraid on Poweredge 2650 ()
From: Salyzyn, Mark @ 2006-04-19 18:25 UTC (permalink / raw)
  To: Adrian von Bidder, linux-scsi

There is an extensive CHANGELOG, but it is Adaptec centric and not
kernel.org version centric.

The main fix that aided stability for the percraid adapters was waiting
up to 60 seconds for the adapter to catch up and complete commands when
they became reticent (overloaded). This wait was added in the SCSI error
recovery path (HBA reset). By waiting, the adapter could (though not always)
recover before it was taken offline and all hell broke loose in the
Linux filesystem drivers.
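
The bounded wait described above can be sketched roughly as follows. This is a userspace model under stated assumptions, not aacraid code: the adapter is reduced to a counter of outstanding commands, one loop iteration stands for one second, and every name (model_adapter, model_tick, model_wait_for_drain, MODEL_MAX_WAIT_SECS) is hypothetical.

```c
#include <assert.h>

#define MODEL_MAX_WAIT_SECS 60	/* the 60-second bound mentioned above */

/* Hypothetical adapter model: completes `rate` commands per simulated second. */
struct model_adapter {
	int outstanding;	/* commands the adapter has not completed yet */
	int rate;		/* completions per simulated second */
};

/* One simulated second of waiting. */
static void model_tick(struct model_adapter *a)
{
	a->outstanding -= a->rate;
	if (a->outstanding < 0)
		a->outstanding = 0;
}

/*
 * Wait up to MODEL_MAX_WAIT_SECS for the adapter to drain.
 * Returns the seconds waited on success, or -1 if commands are still
 * outstanding at the deadline (where a driver would go on to reset).
 */
int model_wait_for_drain(struct model_adapter *a)
{
	int secs;

	assert(a->rate >= 0);
	for (secs = 0; secs < MODEL_MAX_WAIT_SECS; secs++) {
		if (a->outstanding == 0)
			return secs;
		model_tick(a);
	}
	return a->outstanding == 0 ? secs : -1;
}
```

The point of the sketch is only the control flow: give the adapter a fixed grace period to catch up, and escalate only when that period expires with work still pending.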

Other issues surfaced as we experimented to improve the performance of
the driver by touching various limits of the adapters (percraid had 34 SG
elements, 65KB maximum stripe sizes, etc.); the later block and SCSI
subsystems aided our ability to enforce these limits long before they hit
the driver. These experiments were primarily performed in our labs, but
we did have some minor fixes that were rushed out post-delivery in the
2.6.13 timeframe (2.6.8 was prior to these experiments).

The firmware updates, hardware and drives are primarily responsible for
the stability of the storage system.

There are no changes I know of since 2.6.14 in 1.1-4 that affect
percraid adapters. The new interfaces, performance and feature
enhancements all require modern Adaptec Firmware & Hardware.

Sincerely -- Mark Salyzyn

> -----Original Message-----
> From: linux-scsi-owner@vger.kernel.org 
> [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Adrian 
> von Bidder
> Sent: Wednesday, April 19, 2006 10:55 AM
> To: linux-scsi@vger.kernel.org
> Subject: aacraid on Poweredge 2650 ()
> 
> 
> Yo!
> 
> Running a Dell Poweredge 2650, I run into a stability problem 
> triggered by 
> lots of disk activity (sometimes just tarring the whole 
> filesystem for 
> backup or creating a new chroot would suffice, sometimes 
> hundreds of GB of 
> filetransfers were necessary)
> 
> This was with Debian stable (2.6.8 kernel), and with what 
> appeared to be a 
> half-buggy disk and a (ecc-correctable) faulty memory.   Now, 
> with replaced 
> hardware and with the 2.6.15 kernel (aac 1.1-4), things seem 
> to be better.  
> (replacing the hardware alone didn't help, so it seemed to be 
> a driver 
> problem.) 
> 
> A quick test, moving a few 100G and a few 1000 files, 
> couldn't reproduce the 
> issue, but still: are there known stability problems with that driver 
> version?  Is there a changelog of the aacraid driver somewhere?
> 
> thanks in advance
> -- vbi
> 
> relevant kernel messages afaict:
> ===
> Adaptec aacraid driver (1.1-4 Mar  7 2006 02:24:50)
> AAC0: kernel 2.7-1[3170]
> AAC0: monitor 2.7-1[3170]
> AAC0: bios 2.7-1[3170]
> AAC0: serial d15810d3
> scsi0 : percraid
>   Vendor: DELL      Model: 3 discs and HS    Rev: V1.0
>   Type:   Direct-Access                      ANSI SCSI revision: 02
> SCSI device sda: 142183296 512-byte hdwr sectors (72798 MB)
> sda: Write Protect is off
> sda: Mode Sense: 03 00 00 00
> sda: got wrong page
> sda: assuming drive cache: write through
> SCSI device sda: 142183296 512-byte hdwr sectors (72798 MB)
> sda: Write Protect is off
> sda: Mode Sense: 03 00 00 00
> sda: got wrong page
> sda: assuming drive cache: write through
>  sda: sda1 sda2 sda3 < sda5 sda6 >
> sd 0:0:0:0: Attached scsi removable disk sda
> ===
> 
> lspci output
> ===
> 0000:04:08.1 RAID bus controller: Dell PowerEdge Expandable 
> RAID Controller 
> 3/Di (rev 01)
>         Subsystem: Dell: Unknown device 0121
>         Flags: bus master, 66MHz, slow devsel, latency 32, IRQ 185
>         Memory at f0000000 (32-bit, prefetchable) [size=128M]
>         Expansion ROM at fcb00000 [disabled] [size=64K]
>         Capabilities: [80] Power Management version 2
> 
> 0000:05:06.0 SCSI storage controller: Adaptec RAID subsystem 
> HBA (rev 01)
>         Subsystem: Dell PowerEdge 2400,2500,2550,4400
>         Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 7
>         BIST result: 00
>         I/O ports at cc00 [size=256]
>         Memory at fccff000 (64-bit, non-prefetchable) [size=4K]
>         Expansion ROM at fcd00000 [disabled] [size=128K]
>         Capabilities: [dc] Power Management version 2
> 
> 0000:05:06.1 SCSI storage controller: Adaptec RAID subsystem 
> HBA (rev 01)
>         Subsystem: Dell PowerEdge 2400,2500,2550,4400
>         Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 11
>         BIST result: 00
>         I/O ports at c800 [size=256]
>         Memory at fccfe000 (64-bit, non-prefetchable) [size=4K]
>         Expansion ROM at f8100000 [disabled] [size=128K]
>         Capabilities: [dc] Power Management version 2
> ===
> 
> -- 
> Life is fraught with opportunities to keep your mouth shut.
> 


* Re: [RFC] Netlink and user-space buffer pointers
From: Patrick McHardy @ 2006-04-19 17:16 UTC (permalink / raw)
  To: James.Smart; +Cc: linux-scsi, netdev, linux-kernel
In-Reply-To: <44466EA7.3030206@emulex.com>

James Smart wrote:
> 
> 
> Patrick McHardy wrote:
> 
>> This might be problematic: since there is a shared receive queue in
>> the kernel, a netlink message might get processed in the context of
>> a different process. I didn't find any spots where iSCSI passes
>> pointers over netlink; can you point me to one?
> 
> 
> Please explain... Would the pid be set erroneously as well ?  Ignoring
> the kernel-user space pointer issue, we're going to have a tight
> pid + request_id relationship being maintained across multiple messages.
> We'll also be depending on the pid events for clean up if an app dies.
> So I hope pid is consistent.

The PID contained in the netlink message itself is correct, current->pid
might not be.


* Re: [RFC] Netlink and user-space buffer pointers
From: James Smart @ 2006-04-19 17:08 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: linux-scsi, netdev, linux-kernel
In-Reply-To: <444663A9.9020502@trash.net>

Patrick McHardy wrote:
> This might be problematic: since there is a shared receive queue in
> the kernel, a netlink message might get processed in the context of
> a different process. I didn't find any spots where iSCSI passes
> pointers over netlink; can you point me to one?

Please explain... Would the pid be set erroneously as well ?  Ignoring
the kernel-user space pointer issue, we're going to have a tight
pid + request_id relationship being maintained across multiple messages.
We'll also be depending on the pid events for clean up if an app dies.
So I hope pid is consistent.

-- james s


* Re: [RFC] Netlink and user-space buffer pointers
From: James Smart @ 2006-04-19 17:05 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: linux-scsi, netdev, linux-kernel
In-Reply-To: <20060419092645.29cb0420@localhost.localdomain>



Stephen Hemminger wrote:
> On Wed, 19 Apr 2006 08:57:25 -0400
> James Smart <James.Smart@Emulex.Com> wrote:
> 
>> Folks,
>>
>> To take netlink to where we want to use it within the SCSI subsystem (as
>> the mechanism of choice to replace ioctls), we're going to need to pass
>> user-space buffer pointers.
> 
> This changes the design of netlink. It is desirable that netlink
> can be used remotely over the network, as well as queued.
> The current design is message based, not RPC based. By including a
> user-space pointer, you are making the message dependent on the
> context in which it is processed.
> 
> Please rethink your design.

I assume that the message receiver has some way to determine where the
message originated (via the sk_buff), and thus could reject it if it
didn't meet the right criteria.  True ?  You just have to be cognizant
that it is usable from a remote entity - which is a very good thing.


* Vendor specific cdrom error messages
From: Orion Poplawski @ 2006-04-19 16:26 UTC (permalink / raw)
  To: linux-scsi

I got the following error trying to burn a DVD on an IBM USB2 DVD-R burner:

Nov  7 11:40:36 makani kernel: cdrom: This disc doesn't have any tracks 
I recognize!
Nov  7 11:41:59 makani kernel: sr 0:0:0:0: SCSI error: return code = 
0x8000002
Nov  7 11:41:59 makani kernel: sr0: Current: sense key: Data Protect
Nov  7 11:41:59 makani kernel:     ASC=0x27 <<vendor>> ASCQ=0xff

This is on Fedora Core 4 and kernel 2.6.13-1.1532_FC4.

It turns out that this is because the drive did not have enough power 
(it was not plugged into the separate ac adapter).  I was able to 
determine this by booting into Windows, which gave me a nice descriptive 
error message.

I was wondering if it makes sense for the kernel drivers to know these 
error messages and report better errors, or whether it should be left up 
to user space tools.

- Orion


* Re: [RFC] Netlink and user-space buffer pointers
From: Stephen Hemminger @ 2006-04-19 16:26 UTC (permalink / raw)
  To: James.Smart; +Cc: linux-scsi, netdev, linux-kernel
In-Reply-To: <444633B5.5030208@emulex.com>

On Wed, 19 Apr 2006 08:57:25 -0400
James Smart <James.Smart@Emulex.Com> wrote:

> Folks,
> 
> To take netlink to where we want to use it within the SCSI subsystem (as
> the mechanism of choice to replace ioctls), we're going to need to pass
> user-space buffer pointers.

This changes the design of netlink. It is desirable that netlink
can be used remotely over the network, as well as queued.
The current design is message based, not RPC based. By including a
user-space pointer, you are making the message dependent on the
context in which it is processed.

Please rethink your design.

> What is the best, portable manner to pass a pointer between user and kernel
> space within a netlink message ?  The example I've seen is in the iscsi
> target code - and it's passed between user-kernel space as a u64, then
> typecast to a void *, and later within the bio_map_xxx functions, as an
> unsigned long. I assume we are going to continue with this method ?
> 
> -- james s


* Re: [RFC] Netlink and user-space buffer pointers
From: Patrick McHardy @ 2006-04-19 16:22 UTC (permalink / raw)
  To: James.Smart; +Cc: linux-scsi, netdev, linux-kernel
In-Reply-To: <444633B5.5030208@emulex.com>

James Smart wrote:
> To take netlink to where we want to use it within the SCSI subsystem (as
> the mechanism of choice to replace ioctls), we're going to need to pass
> user-space buffer pointers.
> 
> What is the best, portable manner to pass a pointer between user and kernel
> space within a netlink message ?  The example I've seen is in the iscsi
> target code - and it's passed between user-kernel space as a u64, then
> typecast to a void *, and later within the bio_map_xxx functions, as an
> unsigned long. I assume we are going to continue with this method ?

This might be problematic: since there is a shared receive queue in
the kernel, a netlink message might get processed in the context of
a different process. I didn't find any spots where iSCSI passes
pointers over netlink; can you point me to one?

Besides that, netlink protocols should use fixed-size,
architecture-independent types, so u64 would be the best choice for pointers.
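
To make the fixed-size advice concrete, here is a minimal userspace sketch of carrying a buffer address as a 64-bit integer inside a message payload. The struct layout and every name in it (model_nl_buf_req, model_ptr_to_u64, model_u64_to_ptr, model_make_req) are assumptions made for illustration; they do not come from any existing netlink protocol. Note the caveat raised above still applies: the value is only meaningful if it is interpreted in the address space of the process that sent it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical message payload. Only fixed-width fields are used, so
 * 32-bit and 64-bit user space produce the same 16-byte layout.
 */
struct model_nl_buf_req {
	uint64_t user_buf;	/* user-space address carried as a 64-bit integer */
	uint32_t buf_len;	/* length of the buffer in bytes */
	uint32_t request_id;	/* caller-chosen tag; also fills the struct to 16 bytes */
};

/* Widen a pointer to the representation carried in the message. */
uint64_t model_ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Narrow it back; valid only in the address space that produced it. */
void *model_u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}

/* Fill a request that describes `buf`. */
struct model_nl_buf_req model_make_req(void *buf, uint32_t len, uint32_t id)
{
	struct model_nl_buf_req req;

	assert(buf != NULL || len == 0);
	memset(&req, 0, sizeof(req));
	req.user_buf = model_ptr_to_u64(buf);
	req.buf_len = len;
	req.request_id = id;
	return req;
}
```

Going through uintptr_t avoids compiler warnings about casting between pointers and integers of a different size on 32-bit builds.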


* Re: [RFC] FC Transport : Async Events via netlink interface
From: James Smart @ 2006-04-19 16:11 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-scsi
In-Reply-To: <20060419145931.GL24104@parisc-linux.org>



Matthew Wilcox wrote:
> enable NET in order to force netlink to be built.  So something like
> this would be enough:
> 
>  config SCSI_FC_ATTRS
>  	tristate "FiberChannel Transport Attributes"
>  	depends on SCSI
> +	select NET
>  	help
>  	  If you wish to export transport-specific information about
>  	  each attached FiberChannel device to sysfs, say Y.

Thanks....

>> +#define get_list_head_entry(pos, head, member) 		\
>> +	pos = list_entry((head)->next, typeof(*pos), member)
>> +
> 
> This one sets alarm bells ringing ...

Please explain why. I've always wondered why the list macros never
let you look at the head of the list without dequeuing it. They will let
you look, as long as you use the functions that make them think
you're going to walk the list.
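
For reference, peeking at the first element without unlinking it comes down to applying the list_entry() arithmetic to head->next, guarded by an emptiness check. The sketch below re-creates just enough of a kernel-style circular list in userspace to show that; the type and function names (model_list_head, model_list_entry, model_nl_user, model_peek_first) are hypothetical. Later kernels provide a list_first_entry() helper in include/linux/list.h for the same purpose.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace re-creation of a kernel-style circular list. */
struct model_list_head {
	struct model_list_head *next, *prev;
};

/* Recover the containing structure from a pointer to its list member. */
#define model_list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

void model_list_init(struct model_list_head *head)
{
	head->next = head;
	head->prev = head;
}

void model_list_add_tail(struct model_list_head *item,
			 struct model_list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Hypothetical element type, loosely named after the one in the thread. */
struct model_nl_user {
	int pid;
	struct model_list_head ulist;
};

/* Peek at the first element without unlinking it; NULL if the list is empty. */
struct model_nl_user *model_peek_first(struct model_list_head *head)
{
	assert(head != NULL);
	if (head->next == head)
		return NULL;
	return model_list_entry(head->next, struct model_nl_user, ulist);
}

/* Build a two-element list and report the pid found at its head. */
int model_peek_demo(void)
{
	struct model_list_head head;
	struct model_nl_user a = { .pid = 11 }, b = { .pid = 22 };

	model_list_init(&head);
	model_list_add_tail(&a.ulist, &head);
	model_list_add_tail(&b.ulist, &head);
	return model_peek_first(&head)->pid;	/* both elements stay linked */
}
```

The emptiness check matters: on an empty list head->next points back at the head itself, and the entry arithmetic would yield a bogus element pointer.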


> I would write this as:
> 
> 	struct fc_nl_user *nluser, *tmp;
> 	list_for_each_entry_safe(nluser, tmp, &fc_nl_user_list, ulist) {
> 		kfree(nluser);
> 	}

ok.. Subtle though, as you have to know you are consuming the entire list.

-- james




* Re: 2.6.14 regression: aic7xxx hangs on boot
From: James Bottomley @ 2006-04-19 16:02 UTC (permalink / raw)
  To: Daniel Drake; +Cc: gibbs, linux-scsi
In-Reply-To: <444658BE.5080309@gentoo.org>

On Wed, 2006-04-19 at 16:35 +0100, Daniel Drake wrote:
> James Bottomley wrote:
> > Really, 2.6.13-16 are the kernels around which the aic7xxx driver
> > changed rapidly.  What happens with 2.6.17-rc1?
> 
> Exactly the same.

OK ... so more details are needed, like the dmesg up to the problem (primarily to
diagnose the transport parameters and what's on the bus).  Plus, what
were the transport parameters in the case where it worked
(from /proc/scsi/aic7xxx/*)?

Also, if it's a multi-device bus, is the problem caused by a single
device (as in, do a binary search pulling things off the bus)?

James


* Re: [Comments Needed] scan vs remove_target deadlock
From: Michael Reed @ 2006-04-19 15:34 UTC (permalink / raw)
  To: James.Smart; +Cc: Stefan Richter, linux-scsi
In-Reply-To: <44455BAA.6080509@emulex.com>



James Smart wrote:
> Michael Reed wrote:
>> The remove is not for the target which holds the scsi host's scan mutex.
>> Hence, the unblock doesn't kick the [right] queue.
> 
> Certainly could be true.

I don't think it would deadlock if it wasn't.  The scan mutex is a rather
gross lock.

> 
>> I think this means that transport cannot call scsi_remove_target() for any
>> target if a scan is running.  So, transport has to wait until it can assure
>> that no scan is running, perhaps a new mutex, and has to have a way of kicking
>> a blocked target which is being scanned, either when the LLDD unblocks
>> the target or the delete work for that target fires.
> 
> Well - that's one way. Very difficult for the transport to know when this is
> true (not all scans occur from the transport). It should be a midlayer thing
> to ensure the proper things happen. It also highlights just how gross that
> scan_lock is - which is where the real fix should be, although this will be
> a rat's nest.

There's fc_user_scan() which I believe handles scans initiated
via the sysfs/proc variables.  There's fc_scsi_scan_rport() run via the scan work.
It appears that the routines that perform a scan, in a fibre channel context,
are all entered via the transport.

What am I missing?

Mike

> 
> -- james s
> 


* Re: 2.6.14 regression: aic7xxx hangs on boot
From: Daniel Drake @ 2006-04-19 15:35 UTC (permalink / raw)
  To: James Bottomley; +Cc: gibbs, linux-scsi
In-Reply-To: <1145454585.3465.5.camel@mulgrave.il.steeleye.com>

James Bottomley wrote:
> Really, 2.6.13-16 are the kernels around which the aic7xxx driver
> changed rapidly.  What happens with 2.6.17-rc1?

Exactly the same.

Thanks,
Daniel


* Re: [RFC] FC Transport : Async Events via netlink interface
From: Matthew Wilcox @ 2006-04-19 14:59 UTC (permalink / raw)
  To: James Smart; +Cc: linux-scsi
In-Reply-To: <1145306661.4151.0.camel@localhost.localdomain>

On Mon, Apr 17, 2006 at 04:44:21PM -0400, James Smart wrote:
> PS: Comments on Kconfig change appreciated. I don't have much experience on
>   changing the kernel config and build process.
> 
> diff -upNr a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
> --- a/drivers/scsi/Kconfig	2006-03-29 11:53:24.000000000 -0500
> +++ b/drivers/scsi/Kconfig	2006-04-17 12:03:31.000000000 -0400
> @@ -221,7 +221,7 @@ config SCSI_SPI_ATTRS
>  
>  config SCSI_FC_ATTRS
>  	tristate "FiberChannel Transport Attributes"
> -	depends on SCSI
> +	depends on SCSI && NET && NETFILTER && NETFILTER_NETLINK
>  	help
>  	  If you wish to export transport-specific information about
>  	  each attached FiberChannel device to sysfs, say Y.

I would use a select here rather than a depends.  So enabling a driver
that uses FC_ATTRS will force netlink to be added.  However, I think you
have the wrong symbols here.  NETFILTER_NETLINK is the netlink interface
for netfilter -- entirely unrelated.  It looks like you only need to
enable NET in order to force netlink to be built.  So something like
this would be enough:

 config SCSI_FC_ATTRS
 	tristate "FiberChannel Transport Attributes"
 	depends on SCSI
+	select NET
 	help
 	  If you wish to export transport-specific information about
 	  each attached FiberChannel device to sysfs, say Y.


> +#define get_list_head_entry(pos, head, member) 		\
> +	pos = list_entry((head)->next, typeof(*pos), member)
> +

This one sets alarm bells ringing ...

>  static void __exit fc_transport_exit(void)
>  {
> +	struct fc_nl_user *nluser;
> +
> +	sock_release(fc_nl_sock->sk_socket);
> +	netlink_unregister_notifier(&fc_netlink_notifier);
> +	while (!list_empty(&fc_nl_user_list)) {
> +		get_list_head_entry(nluser, &fc_nl_user_list, ulist);
> +		list_del(&nluser->ulist);
> +		kfree(nluser);
> +	}
>  	transport_class_unregister(&fc_transport_class);
>  	transport_class_unregister(&fc_rport_class);
>  	transport_class_unregister(&fc_host_class);

I would write this as:

	struct fc_nl_user *nluser, *tmp;
	list_for_each_entry_safe(nluser, tmp, &fc_nl_user_list, ulist) {
		kfree(nluser);
	}
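
The reason the "_safe" iterator is needed when freeing can be shown with a small userspace sketch: the next pointer has to be read before the current element is released. The list type and all names below (sketch_list_head, sketch_user, sketch_add_user, sketch_drain) are hypothetical re-creations, not the kernel's definitions. Unlike the snippet above, this sketch also re-initialises the head after draining, on the assumption that the head might be looked at again; in a module exit path that step may be unnecessary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_list_head {
	struct sketch_list_head *next, *prev;
};

/* Hypothetical element type standing in for struct fc_nl_user. */
struct sketch_user {
	int pid;
	struct sketch_list_head ulist;
};

/* Recover the sketch_user that contains a given list member. */
#define sketch_entry(ptr) \
	((struct sketch_user *)((char *)(ptr) - offsetof(struct sketch_user, ulist)))

void sketch_init(struct sketch_list_head *head)
{
	head->next = head->prev = head;
}

/* Allocate a user and append it; returns 0 on success, -1 if malloc fails. */
int sketch_add_user(struct sketch_list_head *head, int pid)
{
	struct sketch_user *u = malloc(sizeof(*u));

	if (!u)
		return -1;
	u->pid = pid;
	u->ulist.prev = head->prev;
	u->ulist.next = head;
	head->prev->next = &u->ulist;
	head->prev = &u->ulist;
	return 0;
}

/*
 * Free every element and return how many were freed. The next pointer
 * is saved before the current element is freed, which is the job the
 * "_safe" iterators do. The head is left as a valid empty list.
 */
int sketch_drain(struct sketch_list_head *head)
{
	struct sketch_list_head *pos, *tmp;
	int freed = 0;

	assert(head != NULL);
	for (pos = head->next; pos != head; pos = tmp) {
		tmp = pos->next;	/* must be read before free() */
		free(sketch_entry(pos));
		freed++;
	}
	sketch_init(head);
	return freed;
}
```

A plain (non-safe) iteration would advance by reading pos->next after the element was freed, which is a use-after-free.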



* aacraid on Poweredge 2650 ()
From: Adrian von Bidder @ 2006-04-19 14:55 UTC (permalink / raw)
  To: linux-scsi


Yo!

Running a Dell Poweredge 2650, I run into a stability problem triggered by 
lots of disk activity (sometimes just tarring the whole filesystem for 
backup or creating a new chroot would suffice, sometimes hundreds of GB of 
filetransfers were necessary)

This was with Debian stable (2.6.8 kernel), and with what appeared to be a 
half-buggy disk and a (ecc-correctable) faulty memory.   Now, with replaced 
hardware and with the 2.6.15 kernel (aac 1.1-4), things seem to be better.  
(replacing the hardware alone didn't help, so it seemed to be a driver 
problem.) 

A quick test, moving a few 100G and a few 1000 files, couldn't reproduce the 
issue, but still: are there known stability problems with that driver 
version?  Is there a changelog of the aacraid driver somewhere?

thanks in advance
-- vbi

relevant kernel messages afaict:
===
Adaptec aacraid driver (1.1-4 Mar  7 2006 02:24:50)
AAC0: kernel 2.7-1[3170]
AAC0: monitor 2.7-1[3170]
AAC0: bios 2.7-1[3170]
AAC0: serial d15810d3
scsi0 : percraid
  Vendor: DELL      Model: 3 discs and HS    Rev: V1.0
  Type:   Direct-Access                      ANSI SCSI revision: 02
SCSI device sda: 142183296 512-byte hdwr sectors (72798 MB)
sda: Write Protect is off
sda: Mode Sense: 03 00 00 00
sda: got wrong page
sda: assuming drive cache: write through
SCSI device sda: 142183296 512-byte hdwr sectors (72798 MB)
sda: Write Protect is off
sda: Mode Sense: 03 00 00 00
sda: got wrong page
sda: assuming drive cache: write through
 sda: sda1 sda2 sda3 < sda5 sda6 >
sd 0:0:0:0: Attached scsi removable disk sda
===

lspci output
===
0000:04:08.1 RAID bus controller: Dell PowerEdge Expandable RAID Controller 
3/Di (rev 01)
        Subsystem: Dell: Unknown device 0121
        Flags: bus master, 66MHz, slow devsel, latency 32, IRQ 185
        Memory at f0000000 (32-bit, prefetchable) [size=128M]
        Expansion ROM at fcb00000 [disabled] [size=64K]
        Capabilities: [80] Power Management version 2

0000:05:06.0 SCSI storage controller: Adaptec RAID subsystem HBA (rev 01)
        Subsystem: Dell PowerEdge 2400,2500,2550,4400
        Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 7
        BIST result: 00
        I/O ports at cc00 [size=256]
        Memory at fccff000 (64-bit, non-prefetchable) [size=4K]
        Expansion ROM at fcd00000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 2

0000:05:06.1 SCSI storage controller: Adaptec RAID subsystem HBA (rev 01)
        Subsystem: Dell PowerEdge 2400,2500,2550,4400
        Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 11
        BIST result: 00
        I/O ports at c800 [size=256]
        Memory at fccfe000 (64-bit, non-prefetchable) [size=4K]
        Expansion ROM at f8100000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 2
===

-- 
Life is fraught with opportunities to keep your mouth shut.



* Re: 2.6.14 regression: aic7xxx hangs on boot
From: James Bottomley @ 2006-04-19 13:49 UTC (permalink / raw)
  To: Daniel Drake; +Cc: gibbs, linux-scsi
In-Reply-To: <44461FB5.8060705@gentoo.org>

On Wed, 2006-04-19 at 12:32 +0100, Daniel Drake wrote:
> It's a bit of an odd one. 2.6.13 works (even in the present day), 2.6.14 
> does not. On 2.6.14, these messages appear during early boot:
> 
> 	<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
> 	Recovery code sleeping
> 	Recovery code awake
> 	aic7xxx_abort returns 0x2003
> 	aic7xxx_dev_reset returns 0x2003
> 	Recovery SCB completes
> 
> These kinds of messages repeat quickly in a seemingly infinite loop, and 
> the boot does not complete.

Really, 2.6.13-16 are the kernels around which the aic7xxx driver
changed rapidly.  What happens with 2.6.17-rc1?

James




* [RFC] Netlink and user-space buffer pointers
From: James Smart @ 2006-04-19 12:57 UTC (permalink / raw)
  To: linux-scsi, netdev, linux-kernel
In-Reply-To: <20060418160121.GA2707@us.ibm.com>

Folks,

To take netlink to where we want to use it within the SCSI subsystem (as
the mechanism of choice to replace ioctls), we're going to need to pass
user-space buffer pointers.

What is the best, portable manner to pass a pointer between user and kernel
space within a netlink message ?  The example I've seen is in the iscsi
target code - and it's passed between user-kernel space as a u64, then
typecast to a void *, and later within the bio_map_xxx functions, as an
unsigned long. I assume we are going to continue with this method ?

-- james s


* Re: [RFC] FC Transport : Async Events via netlink interface
From: James Smart @ 2006-04-19 12:52 UTC (permalink / raw)
  To: Mike Anderson; +Cc: linux-scsi
In-Reply-To: <20060418160121.GA2707@us.ibm.com>

Mike Anderson wrote:
> Is there some reason that you are not using nlmsg_multicast? The caller of
> this function is somewhat simulating the function of multicast.

Only that I haven't looked into using groups yet. It certainly makes sense.

> In the send_fail case it looks like you leak skbs. Do you need to add a
> call to nlmsg_free or kfree_skb?

Yep.

I'll include these comments in the revised post. I'll wait a little longer
for any further comments.

-- james



* 2.6.14 regression: aic7xxx hangs on boot
From: Daniel Drake @ 2006-04-19 11:32 UTC (permalink / raw)
  To: gibbs; +Cc: linux-scsi, James.Bottomley

Hi,

A Gentoo user reported an aic7xxx regression at:
http://bugs.gentoo.org/127991

It's a bit of an odd one. 2.6.13 works (even in the present day), 2.6.14 
does not. On 2.6.14, these messages appear during early boot:

	<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
	Recovery code sleeping
	Recovery code awake
	aic7xxx_abort returns 0x2003
	aic7xxx_dev_reset returns 0x2003
	Recovery SCB completes

These kinds of messages repeat quickly in a seemingly infinite loop, and 
the boot does not complete.

git-bisect tracked it down to this commit:

Author: James Bottomley <James.Bottomley@steeleye.com> 

Date:   Sun Aug 14 17:09:01 2005 -0500 


     [SCSI] correct transport class abstraction to work outside SCSI 


http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=d0a7e574007fd547d72ec693bfa35778623d0738;hp=10c1b88987d618f4f89c10e11e574c76de73b5e7

However, applying that patch to 2.6.13 does not cause the break to 
happen. Incidentally, these messages appear instead:

scheduling while atomic: swapper/0x00000001/1
  [<c0399573>] schedule+0x9c3/0xc97
  [<c0118b49>] __wake_up_common+0x3f/0x5e
  [<c03999a4>] wait_for_completion+0x85/0xca
  [<c0118af8>] default_wake_function+0x0/0x12
  [<c0118af8>] default_wake_function+0x0/0x12
  [<c012cfd5>] queue_work+0x79/0x7b
  [<c012cee3>] call_usermodehelper_keys+0xd6/0xe3
  [<c012cdac>] __call_usermodehelper+0x0/0x61
  [<c025625d>] kobject_hotplug+0x27f/0x2ec
  [<c0195549>] sysfs_create_link+0x44/0x64
  [<c029ecc6>] class_device_add+0x117/0x1d4
  [<c02a0b73>] attribute_container_add_class_device+0x10/0x26
  [<c02a0deb>] transport_add_class_device+0x10/0x40
  [<c02a0ab5>] attribute_container_device_trigger+0x97/0x9d
  [<c02a0e32>] transport_add_device+0x17/0x1b
  [<c02a0ddb>] transport_add_class_device+0x0/0x40
  [<c02c1038>] scsi_alloc_target+0x1ed/0x27a
  [<c02a0f57>] transport_destroy_device+0x17/0x1c
  [<c02c1fff>] scsi_scan_target+0x62/0x174
  [<c02c21c1>] scsi_scan_channel+0xb0/0xce
  [<c02c2259>] scsi_scan_host_selected+0x7a/0xd9
  [<c02c22e7>] scsi_scan_host+0x2f/0x33
  [<c02dcdfe>] ahc_linux_register_host+0x1b3/0x1bd
  [<c02e0aa0>] ahc_pci_map_int+0x38/0x60
  [<c02ddf5b>] ahc_linux_isr+0x0/0x27e
  [<c02d5eab>] ahc_pci_config+0x717/0x9cf
  [<c025d276>] pci_set_master+0x42/0x84
  [<c02e06b7>] ahc_linux_pci_dev_probe+0x10b/0x14b
  [<c012cdac>] __call_usermodehelper+0x0/0x61
  [<c025ecc9>] pci_match_device+0x2a/0xdd
  [<c025edd5>] __pci_device_probe+0x59/0x67
  [<c025ee12>] pci_device_probe+0x2f/0x59
  [<c029dcf0>] driver_probe_device+0x3b/0xc5
  [<c029dde5>] __driver_attach+0x0/0x45
  [<c029de28>] __driver_attach+0x43/0x45
  [<c029d327>] bus_for_each_dev+0x58/0x78
  [<c029de50>] driver_attach+0x26/0x2a
  [<c029dde5>] __driver_attach+0x0/0x45
  [<c029d846>] bus_add_driver+0x83/0xec
  [<c025f085>] pci_register_driver+0x7e/0x94
  [<c02e0706>] ahc_linux_pci_init+0xf/0x1b
  [<c049564a>] ahc_linux_init+0x7f/0xa6
  [<c0480978>] do_initcalls+0x53/0xb5
  [<c0100386>] init+0x7c/0x19e
  [<c010030a>] init+0x0/0x19e
  [<c0101101>] kernel_thread_helper+0x5/0xb

Any ideas/suggestions?

Thanks,
Daniel


* qla2xxx lock ordering question
From: Arjan van de Ven @ 2006-04-19  9:22 UTC (permalink / raw)
  To: Andrew Vasquez; +Cc: mingo, linux-scsi

Hi,

a question about qla2xxx lock ordering, since it trips up Ingo's
lock dependency tool:

in qla2x00_mailbox_command() the code first grabs the mbx_reg_lock lock,
then the hardware_lock. So far so good. But then...
it drops the mbx_reg_lock, does stuff, and regrabs the mbx_reg_lock
lock, while keeping the hardware_lock held!

This appears to be an AB-BA deadlock risk since for the second part you
are taking the locks in the wrong order... or am I missing something
here?
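
To show why a sequence like the one described gets flagged, here is a toy order checker in userspace C. It is only a model of the general idea (remember which lock was held when another was taken, and complain when the opposite order shows up); it is not a description of how the actual lock dependency tool works, and the lock ids and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define ORDER_MAX_LOCKS 4

/* Hypothetical ids for the two locks named in the question. */
enum { MBX_REG_LOCK = 0, HARDWARE_LOCK = 1 };

struct order_checker {
	bool held[ORDER_MAX_LOCKS];	/* locks currently held */
	/* before[a][b] is true if a was held at a time when b was taken */
	bool before[ORDER_MAX_LOCKS][ORDER_MAX_LOCKS];
};

void order_init(struct order_checker *c)
{
	memset(c, 0, sizeof(*c));
}

/*
 * Record taking lock `id`. Returns true if this acquisition contradicts
 * an order seen earlier, i.e. some lock held now was once taken while
 * `id` was held.
 */
bool order_acquire(struct order_checker *c, int id)
{
	bool inverted = false;
	int h;

	assert(id >= 0 && id < ORDER_MAX_LOCKS);
	for (h = 0; h < ORDER_MAX_LOCKS; h++) {
		if (!c->held[h] || h == id)
			continue;
		if (c->before[id][h])
			inverted = true;	/* saw id -> h before, now h -> id */
		c->before[h][id] = true;
	}
	c->held[id] = true;
	return inverted;
}

void order_release(struct order_checker *c, int id)
{
	assert(id >= 0 && id < ORDER_MAX_LOCKS);
	c->held[id] = false;
}

/* Replays the sequence from the question; true if an inversion is seen. */
bool order_replay_mailbox_sequence(void)
{
	struct order_checker c;
	bool bad = false;

	order_init(&c);
	bad |= order_acquire(&c, MBX_REG_LOCK);		/* take A */
	bad |= order_acquire(&c, HARDWARE_LOCK);	/* A held, take B */
	order_release(&c, MBX_REG_LOCK);		/* drop A, keep B */
	bad |= order_acquire(&c, MBX_REG_LOCK);		/* B held, take A */
	return bad;
}
```

The checker reports the inversion on the third acquisition. Whether that is a real deadlock in the driver depends on whether another path can hold the first lock while waiting for the second, which the model does not capture.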

Greetings,
   Arjan van de Ven


* [patch 11/17] remove drivers/scsi/constants.c:scsi_print_req_sense()
From: akpm @ 2006-04-19  4:09 UTC (permalink / raw)
  To: James.Bottomley; +Cc: linux-scsi, akpm, bunk


From: Adrian Bunk <bunk@stusta.de>

This function is no longer used anywhere.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 drivers/scsi/constants.c |   10 ----------
 include/scsi/scsi_dbg.h  |    1 -
 2 files changed, 11 deletions(-)

diff -puN drivers/scsi/constants.c~remove-drivers-scsi-constantscscsi_print_req_sense drivers/scsi/constants.c
--- devel/drivers/scsi/constants.c~remove-drivers-scsi-constantscscsi_print_req_sense	2006-04-05 21:27:56.000000000 -0700
+++ devel-akpm/drivers/scsi/constants.c	2006-04-05 21:27:56.000000000 -0700
@@ -1268,16 +1268,6 @@ void scsi_print_sense(const char *devcla
 }
 EXPORT_SYMBOL(scsi_print_sense);
 
-void scsi_print_req_sense(const char *devclass, struct scsi_request *sreq)
-{
-	const char *name = devclass;
-
-	if (sreq->sr_request->rq_disk)
-		name = sreq->sr_request->rq_disk->disk_name;
-	__scsi_print_sense(name, sreq->sr_sense_buffer, SCSI_SENSE_BUFFERSIZE);
-}
-EXPORT_SYMBOL(scsi_print_req_sense);
-
 void scsi_print_command(struct scsi_cmnd *cmd)
 {
 	/* Assume appended output (i.e. not at start of line) */
diff -puN include/scsi/scsi_dbg.h~remove-drivers-scsi-constantscscsi_print_req_sense include/scsi/scsi_dbg.h
--- devel/include/scsi/scsi_dbg.h~remove-drivers-scsi-constantscscsi_print_req_sense	2006-04-05 21:27:56.000000000 -0700
+++ devel-akpm/include/scsi/scsi_dbg.h	2006-04-05 21:27:56.000000000 -0700
@@ -9,7 +9,6 @@ extern void scsi_print_command(struct sc
 extern void scsi_print_sense_hdr(const char *, struct scsi_sense_hdr *);
 extern void __scsi_print_command(unsigned char *);
 extern void scsi_print_sense(const char *, struct scsi_cmnd *);
-extern void scsi_print_req_sense(const char *, struct scsi_request *);
 extern void __scsi_print_sense(const char *name,
 			       const unsigned char *sense_buffer,
 			       int sense_len);
_


* [patch 12/17] drivers/scsi/aic7xxx/aic79xx_core.c: make ahd_match_scb() static
From: akpm @ 2006-04-19  4:09 UTC (permalink / raw)
  To: James.Bottomley; +Cc: linux-scsi, akpm, bunk


From: Adrian Bunk <bunk@stusta.de>

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 drivers/scsi/aic7xxx/aic79xx.h      |    3 ---
 drivers/scsi/aic7xxx/aic79xx_core.c |    5 ++++-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff -puN drivers/scsi/aic7xxx/aic79xx_core.c~drivers-scsi-aic7xxx-aic79xx_corec-make-ahd_match_scb-static drivers/scsi/aic7xxx/aic79xx_core.c
--- devel/drivers/scsi/aic7xxx/aic79xx_core.c~drivers-scsi-aic7xxx-aic79xx_corec-make-ahd_match_scb-static	2006-04-14 23:41:42.000000000 -0700
+++ devel-akpm/drivers/scsi/aic7xxx/aic79xx_core.c	2006-04-14 23:41:42.000000000 -0700
@@ -262,6 +262,9 @@ static void		ahd_update_coalescing_value
 						     u_int mincmds);
 static int		ahd_verify_vpd_cksum(struct vpd_config *vpd);
 static int		ahd_wait_seeprom(struct ahd_softc *ahd);
+static int		ahd_match_scb(struct ahd_softc *ahd, struct scb *scb,
+				      int target, char channel, int lun,
+				      u_int tag, role_t role);
 
 /******************************** Private Inlines *****************************/
 
@@ -7213,7 +7216,7 @@ ahd_busy_tcl(struct ahd_softc *ahd, u_in
 }
 
 /************************** SCB and SCB queue management **********************/
-int
+static int
 ahd_match_scb(struct ahd_softc *ahd, struct scb *scb, int target,
 	      char channel, int lun, u_int tag, role_t role)
 {
diff -puN drivers/scsi/aic7xxx/aic79xx.h~drivers-scsi-aic7xxx-aic79xx_corec-make-ahd_match_scb-static drivers/scsi/aic7xxx/aic79xx.h
--- devel/drivers/scsi/aic7xxx/aic79xx.h~drivers-scsi-aic7xxx-aic79xx_corec-make-ahd_match_scb-static	2006-04-14 23:41:42.000000000 -0700
+++ devel-akpm/drivers/scsi/aic7xxx/aic79xx.h	2006-04-14 23:41:42.000000000 -0700
@@ -1347,9 +1347,6 @@ int	ahd_pci_test_register_access(struct 
 /************************** SCB and SCB queue management **********************/
 void		ahd_qinfifo_requeue_tail(struct ahd_softc *ahd,
 					 struct scb *scb);
-int		ahd_match_scb(struct ahd_softc *ahd, struct scb *scb,
-			      int target, char channel, int lun,
-			      u_int tag, role_t role);
 
 /****************************** Initialization ********************************/
 struct ahd_softc	*ahd_alloc(void *platform_arg, char *name);
_


* [patch 10/17] scsi/megaraid/megaraid_mm.c: fix a NULL pointer dereference
From: akpm @ 2006-04-19  4:09 UTC (permalink / raw)
  To: James.Bottomley; +Cc: linux-scsi, akpm, bunk


From: Adrian Bunk <bunk@stusta.de>

This patch fixes a NULL pointer dereference spotted by the Coverity
checker.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 drivers/scsi/megaraid/megaraid_mm.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff -puN drivers/scsi/megaraid/megaraid_mm.c~scsi-megaraid-megaraid_mmc-fix-a-null-pointer-dereference drivers/scsi/megaraid/megaraid_mm.c
--- devel/drivers/scsi/megaraid/megaraid_mm.c~scsi-megaraid-megaraid_mmc-fix-a-null-pointer-dereference	2006-03-21 23:05:42.000000000 -0800
+++ devel-akpm/drivers/scsi/megaraid/megaraid_mm.c	2006-03-21 23:05:42.000000000 -0800
@@ -898,10 +898,8 @@ mraid_mm_register_adp(mraid_mmadp_t *lld
 
 	adapter = kmalloc(sizeof(mraid_mmadp_t), GFP_KERNEL);
 
-	if (!adapter) {
-		rval = -ENOMEM;
-		goto memalloc_error;
-	}
+	if (!adapter)
+		return -ENOMEM;
 
 	memset(adapter, 0, sizeof(mraid_mmadp_t));
 
_


* [patch 07/17] drivers/scsi/qla2xxx/: make some functions static
From: akpm @ 2006-04-19  4:09 UTC (permalink / raw)
  To: James.Bottomley; +Cc: linux-scsi, akpm, bunk


From: Adrian Bunk <bunk@stusta.de>

This patch makes some needlessly global functions static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 drivers/scsi/qla2xxx/qla_os.c  |   12 ++++++------
 drivers/scsi/qla2xxx/qla_sup.c |    8 ++++----
 2 files changed, 10 insertions(+), 10 deletions(-)

diff -puN drivers/scsi/qla2xxx/qla_os.c~drivers-scsi-qla2xxx-make-some-functions-static drivers/scsi/qla2xxx/qla_os.c
--- devel/drivers/scsi/qla2xxx/qla_os.c~drivers-scsi-qla2xxx-make-some-functions-static	2006-03-21 23:05:42.000000000 -0800
+++ devel-akpm/drivers/scsi/qla2xxx/qla_os.c	2006-03-21 23:05:42.000000000 -0800
@@ -274,7 +274,7 @@ qla24xx_pci_info_str(struct scsi_qla_hos
 	return str;
 }
 
-char *
+static char *
 qla2x00_fw_version_str(struct scsi_qla_host *ha, char *str)
 {
 	char un_str[10];
@@ -312,7 +312,7 @@ qla2x00_fw_version_str(struct scsi_qla_h
 	return (str);
 }
 
-char *
+static char *
 qla24xx_fw_version_str(struct scsi_qla_host *ha, char *str)
 {
 	sprintf(str, "%d.%02d.%02d ", ha->fw_major_version,
@@ -600,7 +600,7 @@ qla2x00_wait_for_loop_ready(scsi_qla_hos
 *
 * Note:
 **************************************************************************/
-int
+static int
 qla2xxx_eh_abort(struct scsi_cmnd *cmd)
 {
 	scsi_qla_host_t *ha = to_qla_host(cmd->device->host);
@@ -734,7 +734,7 @@ qla2x00_eh_wait_for_pending_target_comma
 *    SUCCESS/FAILURE (defined as macro in scsi.h).
 *
 **************************************************************************/
-int
+static int
 qla2xxx_eh_device_reset(struct scsi_cmnd *cmd)
 {
 	scsi_qla_host_t *ha = to_qla_host(cmd->device->host);
@@ -865,7 +865,7 @@ qla2x00_eh_wait_for_pending_commands(scs
 *    SUCCESS/FAILURE (defined as macro in scsi.h).
 *
 **************************************************************************/
-int
+static int
 qla2xxx_eh_bus_reset(struct scsi_cmnd *cmd)
 {
 	scsi_qla_host_t *ha = to_qla_host(cmd->device->host);
@@ -926,7 +926,7 @@ eh_bus_reset_done:
 *
 * Note:
 **************************************************************************/
-int
+static int
 qla2xxx_eh_host_reset(struct scsi_cmnd *cmd)
 {
 	scsi_qla_host_t *ha = to_qla_host(cmd->device->host);
diff -puN drivers/scsi/qla2xxx/qla_sup.c~drivers-scsi-qla2xxx-make-some-functions-static drivers/scsi/qla2xxx/qla_sup.c
--- devel/drivers/scsi/qla2xxx/qla_sup.c~drivers-scsi-qla2xxx-make-some-functions-static	2006-03-21 23:05:42.000000000 -0800
+++ devel-akpm/drivers/scsi/qla2xxx/qla_sup.c	2006-03-21 23:05:42.000000000 -0800
@@ -428,7 +428,7 @@ nvram_data_to_access_addr(uint32_t naddr
 	return FARX_ACCESS_NVRAM_DATA | naddr;
 }
 
-uint32_t
+static uint32_t
 qla24xx_read_flash_dword(scsi_qla_host_t *ha, uint32_t addr)
 {
 	int rval;
@@ -469,7 +469,7 @@ qla24xx_read_flash_data(scsi_qla_host_t 
 	return dwptr;
 }
 
-int
+static int
 qla24xx_write_flash_dword(scsi_qla_host_t *ha, uint32_t addr, uint32_t data)
 {
 	int rval;
@@ -491,7 +491,7 @@ qla24xx_write_flash_dword(scsi_qla_host_
 	return rval;
 }
 
-void
+static void
 qla24xx_get_flash_manufacturer(scsi_qla_host_t *ha, uint8_t *man_id,
     uint8_t *flash_id)
 {
@@ -502,7 +502,7 @@ qla24xx_get_flash_manufacturer(scsi_qla_
 	*flash_id = MSB(ids);
 }
 
-int
+static int
 qla24xx_write_flash_data(scsi_qla_host_t *ha, uint32_t *dwptr, uint32_t faddr,
     uint32_t dwords)
 {
_


* [patch 08/17] drivers/scsi/aic7xxx/aic79xx_core.c: make ahd_done_with_status() static
From: akpm @ 2006-04-19  4:09 UTC (permalink / raw)
  To: James.Bottomley; +Cc: linux-scsi, akpm, bunk


From: Adrian Bunk <bunk@stusta.de>

This patch makes a needlessly global function static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 drivers/scsi/aic7xxx/aic79xx_core.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -puN drivers/scsi/aic7xxx/aic79xx_core.c~drivers-scsi-aic7xxx-aic79xx_corec-make-ahd_done_with_status-static drivers/scsi/aic7xxx/aic79xx_core.c
--- devel/drivers/scsi/aic7xxx/aic79xx_core.c~drivers-scsi-aic7xxx-aic79xx_corec-make-ahd_done_with_status-static	2006-04-14 23:41:41.000000000 -0700
+++ devel-akpm/drivers/scsi/aic7xxx/aic79xx_core.c	2006-04-14 23:41:41.000000000 -0700
@@ -7352,7 +7352,7 @@ ahd_reset_cmds_pending(struct ahd_softc 
 	ahd->flags &= ~AHD_UPDATE_PEND_CMDS;
 }
 
-void
+static void
 ahd_done_with_status(struct ahd_softc *ahd, struct scb *scb, uint32_t status)
 {
 	cam_status ostat;
_

