* Re: kexec and aacraid broken
[not found] <86802c440705291859y39a4ca27uf5ddb84810f33510@mail.gmail.com>
@ 2007-05-30 2:13 ` Andrew Morton
2007-05-30 11:44 ` Salyzyn, Mark
0 siblings, 1 reply; 28+ messages in thread
From: Andrew Morton @ 2007-05-30 2:13 UTC (permalink / raw)
To: Yinghai Lu
Cc: Vivek Goyal, Eric W. Biederman, aacraid,
Linux Kernel Mailing List, linux-scsi, Michal Piotrowski
On Tue, 29 May 2007 18:59:32 -0700 "Yinghai Lu" <yhlu.kernel@gmail.com> wrote:
> latest tree, can not use kexec to load 2.6.22-rc3 at least.
>
> got:
>
> AAC0: adapter kernel panic'd fffffffd
> AAC0: adapter kernel failed to start, init status=0
One of the two diffs below, I guess. Please do a `patch -R -p1' of this
email and retest?
>
> but can load 2.6.21.3
>
Michal, can you please add this to the regression list?
commit 9e4d4a5d71d673901d9c1df5146ce545c2cc0cc0
Author: Salyzyn, Mark <mark_salyzyn@adaptec.com>
Date: Tue May 1 11:43:06 2007 -0400
[SCSI] aacraid: superfluous adapter reset for IBM 8 series ServeRAID controllers
The kexec patch introduced a superfluous (and otherwise inert) reset of
some adapters. The register can have a hardware default value that has
zeros for the undefined interrupts. This patch refines the test of the
interrupt enable register to focus on only the interrupts that affect
the driver in order to detect if an incomplete shutdown of the Adapter
had occurred (kdump).
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
diff --git a/drivers/scsi/aacraid/rx.c b/drivers/scsi/aacraid/rx.c
index b6ee3c0..291cd14 100644
--- a/drivers/scsi/aacraid/rx.c
+++ b/drivers/scsi/aacraid/rx.c
@@ -542,7 +542,7 @@ int _aac_rx_init(struct aac_dev *dev)
dev->a_ops.adapter_sync_cmd = rx_sync_cmd;
dev->a_ops.adapter_enable_int = aac_rx_disable_interrupt;
dev->OIMR = status = rx_readb (dev, MUnit.OIMR);
- if ((((status & 0xff) != 0xff) || reset_devices) &&
+ if ((((status & 0x0c) != 0x0c) || reset_devices) &&
!aac_rx_restart_adapter(dev, 0))
++restart;
/*
commit a5694ec545a880f9d23463fddc894f5096cc68fa
Author: Salyzyn, Mark <mark_salyzyn@adaptec.com>
Date: Mon Apr 30 13:22:24 2007 -0400
[SCSI] aacraid: kexec fix (reset interrupt handler)
Another layer on this onion also discovered by Duane, the
interrupt enable handler also needed to be set ... The interrupt enable
was called from within the synchronous command handler.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
diff --git a/drivers/scsi/aacraid/rx.c b/drivers/scsi/aacraid/rx.c
index 0c71315..b6ee3c0 100644
--- a/drivers/scsi/aacraid/rx.c
+++ b/drivers/scsi/aacraid/rx.c
@@ -539,6 +539,8 @@ int _aac_rx_init(struct aac_dev *dev)
}
/* Failure to reset here is an option ... */
+ dev->a_ops.adapter_sync_cmd = rx_sync_cmd;
+ dev->a_ops.adapter_enable_int = aac_rx_disable_interrupt;
dev->OIMR = status = rx_readb (dev, MUnit.OIMR);
if ((((status & 0xff) != 0xff) || reset_devices) &&
!aac_rx_restart_adapter(dev, 0))
^ permalink raw reply related [flat|nested] 28+ messages in thread
* RE: kexec and aacraid broken
2007-05-30 2:13 ` kexec and aacraid broken Andrew Morton
@ 2007-05-30 11:44 ` Salyzyn, Mark
2007-05-30 13:24 ` Vivek Goyal
2007-05-30 21:22 ` Yinghai Lu
0 siblings, 2 replies; 28+ messages in thread
From: Salyzyn, Mark @ 2007-05-30 11:44 UTC (permalink / raw)
To: Andrew Morton, Yinghai Lu
Cc: Vivek Goyal, Eric W. Biederman, Linux Kernel Mailing List,
linux-scsi, Michal Piotrowski
[-- Attachment #1: Type: text/plain, Size: 4219 bytes --]
I believe this issue is a result of the aacraid_commit_reset patch (as
posted for scsi-misc-2.6, enclosed to permit testing) not yet propagated
to the 2.6.22-rc3 tree.
This is the adapter taking longer than 3 minutes to start after a reset.
I seriously doubt either of the patches suggested below will have an
effect. And if they do, they are not the root cause: one reduces the chances
that the card will be reset during initialization (thus, applied, it would
likely mitigate this problem); the other prevents a panic when the
Adapter is reset (removed, it would result in dogs and cats sleeping with
each other).
Please use kernel parameter aacraid.startup_timeout=540 (merely larger
than the default 180 seconds) when spawning the kexec or see if the
aacraid_commit_reset.patch resolves the issue to confirm my hunch.
Sincerely -- Mark Salyzyn
[-- Attachment #2: aacraid_commit_reset.patch --]
[-- Type: application/octet-stream, Size: 3499 bytes --]
diff -ru a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c
--- a/drivers/scsi/aacraid/aachba.c 2007-05-16 10:29:25.697735367 -0400
+++ b/drivers/scsi/aacraid/aachba.c 2007-05-16 10:37:33.537128485 -0400
@@ -146,7 +146,7 @@
static int nondasd = -1;
static int dacmode = -1;
-static int commit = -1;
+int aac_commit = -1;
int startup_timeout = 180;
int aif_timeout = 120;
@@ -154,7 +154,7 @@
MODULE_PARM_DESC(nondasd, "Control scanning of hba for nondasd devices. 0=off, 1=on");
module_param(dacmode, int, S_IRUGO|S_IWUSR);
MODULE_PARM_DESC(dacmode, "Control whether dma addressing is using 64 bit DAC. 0=off, 1=on");
-module_param(commit, int, S_IRUGO|S_IWUSR);
+module_param_named(commit, aac_commit, int, S_IRUGO|S_IWUSR);
MODULE_PARM_DESC(commit, "Control whether a COMMIT_CONFIG is issued to the adapter for foreign arrays.\nThis is typically needed in systems that do not have a BIOS. 0=off, 1=on");
module_param(startup_timeout, int, S_IRUGO|S_IWUSR);
MODULE_PARM_DESC(startup_timeout, "The duration of time in seconds to wait for adapter to have it's kernel up and\nrunning. This is typically adjusted for large systems that do not have a BIOS.");
@@ -173,6 +173,9 @@
module_param(expose_physicals, int, S_IRUGO|S_IWUSR);
MODULE_PARM_DESC(expose_physicals, "Expose physical components of the arrays. -1=protect 0=off, 1=on");
+int aac_reset_devices = 0;
+module_param_named(reset_devices, aac_reset_devices, int, S_IRUGO|S_IWUSR);
+MODULE_PARM_DESC(reset_devices, "Force an adapter reset at initialization.");
static inline int aac_valid_context(struct scsi_cmnd *scsicmd,
struct fib *fibptr) {
@@ -246,7 +249,7 @@
aac_fib_complete(fibptr);
/* Send a CT_COMMIT_CONFIG to enable discovery of devices */
if (status >= 0) {
- if ((commit == 1) || commit_flag) {
+ if ((aac_commit == 1) || commit_flag) {
struct aac_commit_config * dinfo;
aac_fib_init(fibptr);
dinfo = (struct aac_commit_config *) fib_data(fibptr);
@@ -261,7 +264,7 @@
1, 1,
NULL, NULL);
aac_fib_complete(fibptr);
- } else if (commit == 0) {
+ } else if (aac_commit == 0) {
printk(KERN_WARNING
"aac_get_config_status: Foreign device configurations are being ignored\n");
}
diff -ru a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
--- a/drivers/scsi/aacraid/aacraid.h 2007-05-16 10:29:25.697735367 -0400
+++ b/drivers/scsi/aacraid/aacraid.h 2007-05-16 10:37:33.538128354 -0400
@@ -1829,3 +1829,5 @@
extern int startup_timeout;
extern int aif_timeout;
extern int expose_physicals;
+extern int aac_reset_devices;
+extern int aac_commit;
diff -ru a/drivers/scsi/aacraid/rx.c b/drivers/scsi/aacraid/rx.c
--- a/drivers/scsi/aacraid/rx.c 2007-05-16 10:29:25.699735113 -0400
+++ b/drivers/scsi/aacraid/rx.c 2007-05-16 10:37:33.539128223 -0400
@@ -488,6 +488,8 @@
return -EINVAL;
if (rx_readl(dev, MUnit.OMRx[0]) & KERNEL_PANIC)
return -ENODEV;
+ if (startup_timeout < 300)
+ startup_timeout = 300;
return 0;
}
@@ -542,7 +544,7 @@
dev->a_ops.adapter_sync_cmd = rx_sync_cmd;
dev->a_ops.adapter_enable_int = aac_rx_disable_interrupt;
dev->OIMR = status = rx_readb (dev, MUnit.OIMR);
- if ((((status & 0x0c) != 0x0c) || reset_devices) &&
+ if ((((status & 0x0c) != 0x0c) || aac_reset_devices || reset_devices) &&
!aac_rx_restart_adapter(dev, 0))
++restart;
/*
@@ -594,6 +596,8 @@
}
msleep(1);
}
+ if (restart)
+ aac_commit = 1;
/*
* Fill in the common function dispatch table.
*/
* Re: kexec and aacraid broken
2007-05-30 11:44 ` Salyzyn, Mark
@ 2007-05-30 13:24 ` Vivek Goyal
2007-05-30 13:57 ` Salyzyn, Mark
2007-05-30 21:22 ` Yinghai Lu
1 sibling, 1 reply; 28+ messages in thread
From: Vivek Goyal @ 2007-05-30 13:24 UTC (permalink / raw)
To: Salyzyn, Mark
Cc: Andrew Morton, Yinghai Lu, Eric W. Biederman,
Linux Kernel Mailing List, linux-scsi, Michal Piotrowski
On Wed, May 30, 2007 at 07:44:02AM -0400, Salyzyn, Mark wrote:
> I believe this issue is a result of the aacraid_commit_reset patch (as
> posted for scsi-misc-2.6, enclosed to permit testing) not yet propagated
> to the 2.6.22-rc3 tree.
>
> This is the adapter taking longer than 3 minutes to start after a reset.
> I seriously doubt either of these patches suggested below will have an
> effect. And if they do, they are not root cause, one reduces the chances
> that the card will be reset during initialization (thus applied would
> likely mitigate this problem), the other prevents a panic when the
> Adapter is reset (removed, would result in dogs and cats sleeping with
> each other).
>
> Please use kernel parameter aacraid.startup_timeout=540 (merely larger
> than the default 180 seconds) when spawning the kexec or see if the
> aacraid_commit_reset.patch resolves the issue to confirm my hunch.
>
Hi Mark,
During a normal kexec (not kdump), an adapter reset should not have taken
place at all. The device_shutdown() routines should have taken care to
bring the device to a known sane state in the first kernel so that the second
kernel can initialize it without doing a reset.
With the reset patch, a reset now triggers on every kexec. Previously
that was not the case with kexec, and the adapter used to come up. I think
this needs to be looked into.
Thanks
Vivek
* RE: kexec and aacraid broken
2007-05-30 13:24 ` Vivek Goyal
@ 2007-05-30 13:57 ` Salyzyn, Mark
2007-05-30 14:17 ` Vivek Goyal
0 siblings, 1 reply; 28+ messages in thread
From: Salyzyn, Mark @ 2007-05-30 13:57 UTC (permalink / raw)
To: vgoyal
Cc: Andrew Morton, Yinghai Lu, Eric W. Biederman,
Linux Kernel Mailing List, linux-scsi, Michal Piotrowski
This is clouding the issue, Vivek.
There should be no harm, except to time, resetting the adapter. I do
want to optimize for boot time, but do not view this as a 'bug' if the
Adapter should reset during the initialization procedure. We need
instead to harden the driver to deal with Adapters that behave in an
untimely manner as a result of the reset since this generically deals
with all possible transitions (boot w/o BIOS, w/BIOS, kexec and kdump).
I will look into a possibility the driver is not performing the clean
shutdown as a result of a kexec, but that is a refinement and should not
be considered a fix for *this* reported problem; it merely moves the
problem to a kdump. The driver only disables the interrupts when the
driver is .remove'd (aac_remove_one) and not for .shutdown
(aac_shutdown). The latter merely tells the firmware to stop performing
builds if in progress, flush the cache, and perform all subsequent writes
in write-through mode; it does not clear out the driver
resources and leaves that to the .remove function only. The failure to
call .remove may be a result of this being a boot driver?
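The difference between the two paths can be sketched in a small userspace model (not driver code). The struct, the function names, and the assumption that masking means writing 0xff to the interrupt mask register are hypothetical stand-ins for illustration; only the 0x0c test is taken from the rx.c hunk in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of adapter state that survives into the next kernel. */
struct model_adapter {
	uint8_t oimr;        /* interrupt mask register; a set bit means masked */
	bool cache_flushed;  /* firmware told to flush and go write-through */
};

/* Models .shutdown as described: firmware quiesced, interrupts untouched. */
static void model_shutdown(struct model_adapter *a)
{
	a->cache_flushed = true;
}

/* Models .remove: same quiesce, plus the interrupts are masked.
 * Writing 0xff here is an assumption made for this sketch. */
static void model_remove(struct model_adapter *a)
{
	a->cache_flushed = true;
	a->oimr = 0xff;
}

/* The condition the next kernel evaluates at init (mask 0x0c as in rx.c). */
static bool model_next_kernel_resets(const struct model_adapter *a)
{
	return (a->oimr & 0x0c) != 0x0c;
}
```

In this model, an adapter that only went through model_shutdown() still has its interrupt bits unmasked, so the next kernel would decide to reset it; one that went through model_remove() would not.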
Also, the code:
dev->OIMR = status = rx_readb (dev, MUnit.OIMR);
if ((((status & 0x0c) != 0x0c) . . .
detects if the adapter's interrupts were disabled, as would happen on a
clean shutdown. Some of the Adapters can NOT disable their interrupts,
and some have a default state with the interrupts enabled. If the
Adapter still has active interrupts, then there is no telling what
transpired before and it is considered a safety measure to reset the
Adapter in these cases. I'd prefer to err on the side of resetting the
Adapter superfluously than deal with a condition where the Adapter could
be in an unknown state with a possibility of sustaining an outstanding
command and associated interrupt (which was the whole reason this code
was introduced).
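As a rough illustration of the test quoted above, the following standalone sketch compares the older 0xff mask with the 0x0c mask. It models only the condition that decides whether a restart is attempted, not the restart itself; the helper name and the macro names are hypothetical, and the sample register values are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define OIMR_ALL_BITS    0xffu  /* mask used before commit 9e4d4a5d */
#define OIMR_DRIVER_BITS 0x0cu  /* mask used after it */

/* True when the init path would attempt an adapter restart:
 * either some bit selected by 'mask' is not set in the register
 * (that interrupt is still enabled), or a reset was requested. */
static bool needs_reset(uint8_t oimr, uint8_t mask, bool reset_devices)
{
	return ((oimr & mask) != mask) || reset_devices;
}
```

With a register value such as 0x0c (the two tested interrupts masked, undefined bits reading as zero), the 0xff mask asks for a reset while the 0x0c mask does not, which is the superfluous reset the first commit message describes.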
In time, I am sure, I will refine this code to incorporate quirks for
adapters that have unusual conditions for the above stated interrupt and
remove the possible superfluous reset.
Yinghai, can you please provide the Adapter designation, just in case it
could be the first in this refined list? I will NOT consider this
refinement a bugfix, for the same reasons stated above.
Sincerely -- Mark Salyzyn
* Re: kexec and aacraid broken
2007-05-30 13:57 ` Salyzyn, Mark
@ 2007-05-30 14:17 ` Vivek Goyal
2007-05-30 14:30 ` Salyzyn, Mark
0 siblings, 1 reply; 28+ messages in thread
From: Vivek Goyal @ 2007-05-30 14:17 UTC (permalink / raw)
To: Salyzyn, Mark
Cc: Andrew Morton, Yinghai Lu, Eric W. Biederman,
Linux Kernel Mailing List, linux-scsi, Michal Piotrowski
On Wed, May 30, 2007 at 09:57:08AM -0400, Salyzyn, Mark wrote:
> This is clouding the issue, Vivek.
>
> There should be no harm, except to time, resetting the adapter. I do
> want to optimize for boot time, but do not view this as a 'bug' if the
> Adapter should reset during the initialization procedure. We need
> instead to harden the driver to deal with Adapters that behave in an
> untimely manner as a result of the reset since this generically deals
> with all possible transitions (boot w/o BIOS, w/BIOS, kexec and kdump).
>
Hi Mark,
I agree. We should make sure that we are able to do a software
reset of adapters.
> I will look into a possibility the driver is not performing the clean
> shutdown as a result of a kexec, but that is a refinement and should not
> be considered a fix for *this* reported problem; it merely moves the
> problem to a kdump.
Agreed. I just wanted to bring out this point that right now we are
triggering software reset on every kexec and probably that is not
required. One can avoid it to save boot time. That was the whole
purpose of kexec (fastboot) project.
But this is not a fix for this problem. We should be able to reset the
device in any case, and should find the root cause of this.
> The driver only disables the interrupts when the
> driver is .remove'd (aac_remove_one) and not for .shutdown
> (aac_shutdown). The latter merely tells the firmware to stop performing
> builds if in progress, flush the cache, and all subsequent writes are
> performed in write-through mode; it does not clear out the driver
> resources and leaves that to the .remove function only. The failure of
> .remove being called may be a result of this being a boot driver?
>
> Also, the code:
>
> dev->OIMR = status = rx_readb (dev, MUnit.OIMR);
> if ((((status & 0x0c) != 0x0c) . . .
>
> detects if the adapter's interrupts were disabled, as would happen on a
> clean shutdown. Some of the Adapters can NOT disable their interrupts,
> and some have a default state with the interrupts enabled. If the
> Adapter still has active interrupts, then there is no telling what
> transpired before and it is considered a safety measure to reset the
> Adapter in these cases. I'd prefer to err on the side of resetting the
> Adapter superfluously than deal with a condition where the Adapter could
> be in an unknown state with a possibility of sustaining an outstanding
> command and associated interrupt (which was the whole reason this code
> was introduced).
>
So most likely, if we start disabling the interrupts in the .shutdown routine,
we might skip resetting the adapter on every kexec without any side effects?
Thanks
Vivek
* RE: kexec and aacraid broken
2007-05-30 14:17 ` Vivek Goyal
@ 2007-05-30 14:30 ` Salyzyn, Mark
2007-05-30 15:59 ` [PATCH] aacraid: fix shutdown handler to also disable interrupts Salyzyn, Mark
2007-05-30 21:19 ` kexec and aacraid broken Yinghai Lu
0 siblings, 2 replies; 28+ messages in thread
From: Salyzyn, Mark @ 2007-05-30 14:30 UTC (permalink / raw)
To: vgoyal
Cc: Andrew Morton, Yinghai Lu, Eric W. Biederman,
Linux Kernel Mailing List, linux-scsi, Michal Piotrowski
Vivek Goyal [mailto:vgoyal@in.ibm.com] writes:
> So most likely if we start disabling the interrupts
> in .shutdown routine we might skip resetting adapter
> on every kexec without any side effects?
Not that simple. The .shutdown would need to perform more of the resource
cleanup done by the .remove call to prevent side effects. I need to move
some of the .remove activity into the .shutdown handler to make sure the
adapter is quiesced.
I will hold off on submitting any of these changes until they are
evaluated and tested; I am waiting for feedback from Yinghai on the
other mitigations that I feel are closer to the root cause.
Sincerely -- Mark Salyzyn
* [PATCH] aacraid: fix shutdown handler to also disable interrupts.
2007-05-30 14:30 ` Salyzyn, Mark
@ 2007-05-30 15:59 ` Salyzyn, Mark
2007-05-30 17:36 ` Yinghai Lu
` (2 more replies)
2007-05-30 21:19 ` kexec and aacraid broken Yinghai Lu
1 sibling, 3 replies; 28+ messages in thread
From: Salyzyn, Mark @ 2007-05-30 15:59 UTC (permalink / raw)
To: linux-scsi
Cc: vgoyal, Andrew Morton, Yinghai Lu, Eric W. Biederman,
Michal Piotrowski, Linux Kernel Mailing List
[-- Attachment #1: Type: text/plain, Size: 2022 bytes --]
Moves quiesce, thread and interrupt shutdown into the aacraid driver's
.shutdown handler. This fix to the aac_shutdown handler will remove the
superfluous reset of the adapter during a (clean) kexec.
This fix may mitigate the active investigation 'kexec and aacraid
broken' but it is unlikely to affect the root cause (issue likely
present in both kexec and kdump). This patch reduces the chance the
problem will occur with a kexec. The fix for root cause is currently
expected to be the minimum value check to the aacraid.startup_timeout
driver variable after an adapter reset within aacraid_commit_reset.patch
submitted on 05/22/2007 and awaiting testing by Yinghai to confirm.
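The minimum value check mentioned above amounts to a clamp, mirroring the two lines added to rx.c in aacraid_commit_reset.patch (`if (startup_timeout < 300) startup_timeout = 300;`). A minimal standalone sketch follows; the function and macro names are hypothetical and are not part of the driver.

```c
/* Floor applied to the startup timeout (seconds) once the adapter
 * has been reset; 300 is the value used in the patch. */
#define MIN_TIMEOUT_AFTER_RESET 300

/* Returns the timeout to wait for the adapter kernel after a reset:
 * the configured value, raised to the floor if it is smaller. */
static int timeout_after_reset(int startup_timeout)
{
	if (startup_timeout < MIN_TIMEOUT_AFTER_RESET)
		startup_timeout = MIN_TIMEOUT_AFTER_RESET;
	return startup_timeout;
}
```

Under this sketch the default of 180 seconds becomes 300 after a reset, while an explicit aacraid.startup_timeout=540 is left unchanged.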
The attached patch is against current scsi-misc-2.6.
ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patch attachments.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Sincerely -- Mark Salyzyn
[-- Attachment #2: aacraid_shutdown.patch --]
[-- Type: application/octet-stream, Size: 1524 bytes --]
diff -ru a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
--- a/drivers/scsi/aacraid/linit.c 2007-05-30 11:00:36.619831521 -0400
+++ b/drivers/scsi/aacraid/linit.c 2007-05-30 11:04:35.325867212 -0400
@@ -859,6 +859,14 @@
.emulated = 1,
};
+static void __aac_shutdown(struct aac_dev * aac)
+{
+ kthread_stop(aac->thread);
+ aac_send_shutdown(aac);
+ aac_adapter_disable_int(aac);
+ free_irq(aac->pdev->irq, aac);
+}
+
static int __devinit aac_probe_one(struct pci_dev *pdev,
const struct pci_device_id *id)
{
@@ -1011,10 +1019,7 @@
return 0;
out_deinit:
- kthread_stop(aac->thread);
- aac_send_shutdown(aac);
- aac_adapter_disable_int(aac);
- free_irq(pdev->irq, aac);
+ __aac_shutdown(aac);
out_unmap:
aac_fib_map_free(aac);
pci_free_consistent(aac->pdev, aac->comm_size, aac->comm_addr, aac->comm_phys);
@@ -1034,7 +1039,8 @@
{
struct Scsi_Host *shost = pci_get_drvdata(dev);
struct aac_dev *aac = (struct aac_dev *)shost->hostdata;
- aac_send_shutdown(aac);
+ scsi_block_requests(shost);
+ __aac_shutdown(aac);
}
static void __devexit aac_remove_one(struct pci_dev *pdev)
@@ -1044,16 +1050,12 @@
scsi_remove_host(shost);
- kthread_stop(aac->thread);
-
- aac_send_shutdown(aac);
- aac_adapter_disable_int(aac);
+ __aac_shutdown(aac);
aac_fib_map_free(aac);
pci_free_consistent(aac->pdev, aac->comm_size, aac->comm_addr,
aac->comm_phys);
kfree(aac->queues);
- free_irq(pdev->irq, aac);
aac_adapter_ioremap(aac, 0);
kfree(aac->fibs);
* Re: [PATCH] aacraid: fix shutdown handler to also disable interrupts.
2007-05-30 15:59 ` [PATCH] aacraid: fix shutdown handler to also disable interrupts Salyzyn, Mark
@ 2007-05-30 17:36 ` Yinghai Lu
2007-06-01 11:08 ` Vivek Goyal
2007-06-07 17:21 ` [PATCH] aacraid: add SCSI SYNCHONIZE_CACHE range checking Salyzyn, Mark
2 siblings, 0 replies; 28+ messages in thread
From: Yinghai Lu @ 2007-05-30 17:36 UTC (permalink / raw)
To: Salyzyn, Mark
Cc: linux-scsi, vgoyal, Andrew Morton, Eric W. Biederman,
Michal Piotrowski, Linux Kernel Mailing List
On 5/30/07, Salyzyn, Mark <mark_salyzyn@adaptec.com> wrote:
> Moves quiesce, thread and interrupt shutdown into aacraid drivers'
> .shutdown handler. This fix to the aac_shutdown handler will remove the
> superfluous reset of the adapter during a (clean) kexec.
>
> This fix may mitigate the active investigation 'kexec and aacraid
> broken' but it is unlikely to affect the root cause (issue likely
> present in both kexec and kdump). This patch reduces the chance the
> problem will occur with a kexec. The fix for root cause is currently
> expected to be the minimum value check to the aacraid.startup_timeout
> driver variable after an adapter reset within aacraid_commit_reset.patch
> submitted on 05/22/2007 and awaiting testing by Yinghai to confirm.
>
> This attached patch is against current scsi-misc-2.6
>
> ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
> handling of patch attachments.
>
> Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
>
> Sincerely -- Mark Salyzyn
>
With this patch (4) applied, even without
1. [SCSI] aacraid: superfluous adapter reset for IBM 8 series
ServeRAID controllers
2. [SCSI] aacraid: kexec fix (reset interrupt handler)
3. aacraid_commit_reset.patch
the kernel can load another kernel with or without patches 1, 2, 3.
YH
* Re: kexec and aacraid broken
2007-05-30 14:30 ` Salyzyn, Mark
2007-05-30 15:59 ` [PATCH] aacraid: fix shutdown handler to also disable interrupts Salyzyn, Mark
@ 2007-05-30 21:19 ` Yinghai Lu
1 sibling, 0 replies; 28+ messages in thread
From: Yinghai Lu @ 2007-05-30 21:19 UTC (permalink / raw)
To: Salyzyn, Mark
Cc: vgoyal, Andrew Morton, Eric W. Biederman,
Linux Kernel Mailing List, linux-scsi, Michal Piotrowski
On 5/30/07, Salyzyn, Mark <mark_salyzyn@adaptec.com> wrote:
> Vivek Goyal [mailto:vgoyal@in.ibm.com] writes:
> > So most likely if we start disabling the interrupts
> > in .shutdown routine we might skip resetting adapter
> > on every kexec without any side effects?
>
> Not that simple. The .shutdown would need to perform more resource
> cleanups of the .remove call to prevent side effects. I need to move
> some of the .remove activity into the .shutdown handler to make sure the
> adapter is quiesced.
>
> I will hold off on submitting any of these changes until they are
> evaluated and tested; I am waiting for feedback from Yinghai on the
> other mitigations that I feel are closer to the root cause.
>
1. [SCSI] aacraid: superfluous adapter reset for IBM 8 series
ServeRAID controllers
2. [SCSI] aacraid: kexec fix (reset interrupt handler)
3. aacraid_commit_reset.patch
4. [PATCH] aacraid: fix shutdown handler to also disable interrupts
A kernel with patch 4, even without 1, 2, 3,
can load another kernel with or without patches 1, 2, 3.
YH
* Re: kexec and aacraid broken
2007-05-30 11:44 ` Salyzyn, Mark
2007-05-30 13:24 ` Vivek Goyal
@ 2007-05-30 21:22 ` Yinghai Lu
2007-05-30 21:49 ` Salyzyn, Mark
1 sibling, 1 reply; 28+ messages in thread
From: Yinghai Lu @ 2007-05-30 21:22 UTC (permalink / raw)
To: Salyzyn, Mark
Cc: Andrew Morton, Vivek Goyal, Eric W. Biederman,
Linux Kernel Mailing List, linux-scsi, Michal Piotrowski
On 5/30/07, Salyzyn, Mark <mark_salyzyn@adaptec.com> wrote:
> I believe this issue is a result of the aacraid_commit_reset patch (as
> posted for scsi-misc-2.6, enclosed to permit testing) not yet propagated
> to the 2.6.22-rc3 tree.
>
> This is the adapter taking longer than 3 minutes to start after a reset.
> I seriously doubt either of these patches suggested below will have an
> effect. And if they do, they are not root cause, one reduces the chances
> that the card will be reset during initialization (thus applied would
> likely mitigate this problem), the other prevents a panic when the
> Adapter is reset (removed, would result in dogs and cats sleeping with
> each other).
>
> Please use kernel parameter aacraid.startup_timeout=540 (merely larger
> than the default 180 seconds) when spawning the kexec or see if the
> aacraid_commit_reset.patch resolves the issue to confirm my hunch.
>
aacraid_commit_reset.patch is in the mainline already.
YH
* RE: kexec and aacraid broken
2007-05-30 21:22 ` Yinghai Lu
@ 2007-05-30 21:49 ` Salyzyn, Mark
2007-05-30 22:11 ` Yinghai Lu
0 siblings, 1 reply; 28+ messages in thread
From: Salyzyn, Mark @ 2007-05-30 21:49 UTC (permalink / raw)
To: Yinghai Lu
Cc: Andrew Morton, Vivek Goyal, Eric W. Biederman,
Linux Kernel Mailing List, linux-scsi, Michal Piotrowski
Yinghai Lu [mailto:yhlu.kernel@gmail.com] writes:
> aacraid_commit_reset.patch is in the mainline already.
But aacraid_commit_reset.patch is not in 2.6.22-rc3 (against which you
reported the issue). Does the aacraid_commit_reset.patch work to resolve this
issue all by itself in the kexec'd kernel? Or alternatively did you try
aacraid.startup_timeout=540 as one of the kernel parameters passed to
the kexec'd kernel?
The '[PATCH] aacraid: fix shutdown handler to also disable interrupts'
patch (you refer to this as patch 4) should stay out of the picture because
it will hide the root cause. I believe I understand you correctly as stating
that this patch (4) resolves the problem... but I expect the problem to
remain with kdump.
Sincerely -- Mark Salyzyn
* Re: kexec and aacraid broken
2007-05-30 21:49 ` Salyzyn, Mark
@ 2007-05-30 22:11 ` Yinghai Lu
2007-05-31 12:37 ` Salyzyn, Mark
0 siblings, 1 reply; 28+ messages in thread
From: Yinghai Lu @ 2007-05-30 22:11 UTC (permalink / raw)
To: Salyzyn, Mark
Cc: Andrew Morton, Vivek Goyal, Eric W. Biederman,
Linux Kernel Mailing List, linux-scsi, Michal Piotrowski
On 5/30/07, Salyzyn, Mark <mark_salyzyn@adaptec.com> wrote:
> Yinghai Lu [mailto:yhlu.kernel@gmail.com] writes:
> > aacraid_commit_reset.patch is in the mainline already.
>
> But aacraid_commit_reset.patch is not in 2.6.22-rc3 (to which you report
> the issue). Does the aacraid_commit_reset.patch work to resolve this
> issue all by itself in the kexec'd kernel? Or alternatively did you try
> aacraid.startup_timeout=540 as one of the kernel parameters passed to
> the kexec'd kernel?
No, still get adapter kernel panic
>
> The '[PATCH] aacraid: fix shutdown handler to also disable interrupts'
> patch (you refer to this as patch 4) should be kept out of the picture
> because it will hide the root cause. I believe I understand you
> correctly that this patch (4) resolves the problem... but I expect the
> problem to remain with kdump.
Oh.
Without patch (4), the latest kernel can still kexec into 2.6.21.3.
I will try to load 2.6.22-rc1 etc.
YH
* RE: kexec and aacraid broken
2007-05-30 22:11 ` Yinghai Lu
@ 2007-05-31 12:37 ` Salyzyn, Mark
2007-05-31 19:59 ` Yinghai Lu
0 siblings, 1 reply; 28+ messages in thread
From: Salyzyn, Mark @ 2007-05-31 12:37 UTC (permalink / raw)
To: Yinghai Lu
Cc: Andrew Morton, Vivek Goyal, Eric W. Biederman,
Linux Kernel Mailing List, linux-scsi, Michal Piotrowski
> No, still get adapter kernel panic
Which adapter are you using?
Sincerely -- Mark Salyzyn
* Re: kexec and aacraid broken
2007-05-31 12:37 ` Salyzyn, Mark
@ 2007-05-31 19:59 ` Yinghai Lu
2007-05-31 20:45 ` Salyzyn, Mark
0 siblings, 1 reply; 28+ messages in thread
From: Yinghai Lu @ 2007-05-31 19:59 UTC (permalink / raw)
To: Salyzyn, Mark
Cc: Andrew Morton, Vivek Goyal, Eric W. Biederman,
Linux Kernel Mailing List, linux-scsi, Michal Piotrowski
SUN coguar with 11731
YH
On 5/31/07, Salyzyn, Mark <mark_salyzyn@adaptec.com> wrote:
> > No, still get adapter kernel panic
>
> Which adapter are you using?
>
> Sincerely -- Mark Salyzyn
>
* RE: kexec and aacraid broken
2007-05-31 19:59 ` Yinghai Lu
@ 2007-05-31 20:45 ` Salyzyn, Mark
0 siblings, 0 replies; 28+ messages in thread
From: Salyzyn, Mark @ 2007-05-31 20:45 UTC (permalink / raw)
To: Yinghai Lu
Cc: Andrew Morton, Vivek Goyal, Eric W. Biederman,
Linux Kernel Mailing List, linux-scsi, Michal Piotrowski
Ahhhh. That explains why I have had trouble duplicating this issue thus far.
This is prerelease Firmware on a yet to be released card and thus should
not get any driver workarounds if this issue can be resolved in
Firmware. If this can be duped on a released card with released
Firmware, then the story changes of course; but still does not preclude
a Firmware/Hardware/Drive Compatibility bug ;-} . Until then, please
work this issue via SUN channels so that we get all the necessary card
debug information for our teams to work this.
I will ensure Adaptec remains on top of this issue since it is
clearly a problem with the Adapter Hardware interfacing. The adapter is
not surviving an IOP_RESET and is either going into an Adapter Firmware
Kernel Panic or taking an excessively long time (more than 540 seconds
in the testing thus far) to complete its reset.
Sincerely -- Mark Salyzyn
Yinghai Lu [mailto:yhlu.kernel@gmail.com] sez:
> SUN coguar with 11731
>
> On 5/31/07, Salyzyn, Mark <mark_salyzyn@adaptec.com> wrote:
> > > No, still get adapter kernel panic
> >
> > Which adapter are you using?
* Re: [PATCH] aacraid: fix shutdown handler to also disable interrupts.
2007-05-30 15:59 ` [PATCH] aacraid: fix shutdown handler to also disable interrupts Salyzyn, Mark
2007-05-30 17:36 ` Yinghai Lu
@ 2007-06-01 11:08 ` Vivek Goyal
2007-06-01 17:07 ` Yinghai Lu
2007-06-07 17:21 ` [PATCH] aacraid: add SCSI SYNCHONIZE_CACHE range checking Salyzyn, Mark
2 siblings, 1 reply; 28+ messages in thread
From: Vivek Goyal @ 2007-06-01 11:08 UTC (permalink / raw)
To: Salyzyn, Mark
Cc: linux-scsi, Andrew Morton, Yinghai Lu, Eric W. Biederman,
Michal Piotrowski, Linux Kernel Mailing List
On Wed, May 30, 2007 at 11:59:13AM -0400, Salyzyn, Mark wrote:
> Moves quiesce, thread and interrupt shutdown into aacraid drivers'
> .shutdown handler. This fix to the aac_shutdown handler will remove the
> superfluous reset of the adapter during a (clean) kexec.
>
> This fix may mitigate the active investigation 'kexec and aacraid
> broken' but it is unlikely to affect the root cause (issue likely
> present in both kexec and kdump). This patch reduces the chance the
> problem will occur with a kexec. The fix for root cause is currently
> expected to be the minimum value check to the aacraid.startup_timeout
> driver variable after an adapter reset within aacraid_commit_reset.patch
> submitted on 05/22/2007 and awaiting testing by Yinghai to confirm.
>
> This attached patch is against current scsi-misc-2.6
>
> ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
> handling of patch attachments.
>
> Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
>
Thanks Mark. This does fix the issue of unnecessary reset of aacraid
adapter over kexec on my machine.
Thanks
Vivek
* Re: [PATCH] aacraid: fix shutdown handler to also disable interrupts.
2007-06-01 11:08 ` Vivek Goyal
@ 2007-06-01 17:07 ` Yinghai Lu
2007-06-01 17:34 ` Salyzyn, Mark
0 siblings, 1 reply; 28+ messages in thread
From: Yinghai Lu @ 2007-06-01 17:07 UTC (permalink / raw)
To: vgoyal
Cc: Salyzyn, Mark, linux-scsi, Andrew Morton, Eric W. Biederman,
Michal Piotrowski, Linux Kernel Mailing List
On 6/1/07, Vivek Goyal <vgoyal@in.ibm.com> wrote:
> On Wed, May 30, 2007 at 11:59:13AM -0400, Salyzyn, Mark wrote:
> > Moves quiesce, thread and interrupt shutdown into aacraid drivers'
> > .shutdown handler. This fix to the aac_shutdown handler will remove the
> > superfluous reset of the adapter during a (clean) kexec.
> >
> > This fix may mitigate the active investigation 'kexec and aacraid
> > broken' but it is unlikely to affect the root cause (issue likely
> > present in both kexec and kdump). This patch reduces the chance the
> > problem will occur with a kexec. The fix for root cause is currently
> > expected to be the minimum value check to the aacraid.startup_timeout
> > driver variable after an adapter reset within aacraid_commit_reset.patch
> > submitted on 05/22/2007 and awaiting testing by Yinghai to confirm.
> >
> > This attached patch is against current scsi-misc-2.6
> >
> > ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
> > handling of patch attachments.
> >
> > Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
> >
>
> Thanks Mark. This does fix the issue of unnecessary reset of aacraid
> adapter over kexec on my machine.
>
I'm a little confused about that.
This patch does a clean shutdown, so even with the tighter condition
the next start will not try to reset the adapter fw. Right, Mark?
Maybe the driver could be smart enough to find out if it needs to reset
the Adaptec fw.
YH
* RE: [PATCH] aacraid: fix shutdown handler to also disable interrupts.
2007-06-01 17:07 ` Yinghai Lu
@ 2007-06-01 17:34 ` Salyzyn, Mark
0 siblings, 0 replies; 28+ messages in thread
From: Salyzyn, Mark @ 2007-06-01 17:34 UTC (permalink / raw)
To: Yinghai Lu, vgoyal
Cc: linux-scsi, Andrew Morton, Eric W. Biederman, Michal Piotrowski,
Linux Kernel Mailing List
Yes, this patch makes sure that the Adapter is shut down correctly, and
thus when the kexec driver loads, it does not automatically reset the
adapter during initialization. This regression was a result of adding
code to the driver to detect if the adapter needed a reset as a result
of an unclean shutdown in order to deal with an issue that came up with
kdump. Kdump does not issue a clean shutdown. As you see, it was the
process of making the driver smarter to find out if it needed to reset
the adaptec fw that triggered the problem.
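For illustration only, the reset decision described above can be modeled
roughly as below. This is a simplified user-space sketch, not the driver
code: the 0x0c mask is taken from the rx.c hunk quoted earlier in this
thread, the reading that "all relevant bits set" means a clean shutdown
is an interpretation of that hunk, and the helper name and parameters
are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interrupt mask register bits the driver cares about
 * (0x0c is the value used in the rx.c hunk quoted in this thread). */
#define DRIVER_INT_MASK 0x0cu

/* Hypothetical helper: returns true when the adapter should be reset
 * during initialization. */
static bool adapter_needs_reset(uint8_t oimr, bool reset_devices)
{
	/* If any relevant interrupt was left unmasked, assume the
	 * previous kernel did not shut the adapter down cleanly. */
	bool left_unmasked = (oimr & DRIVER_INT_MASK) != DRIVER_INT_MASK;

	return left_unmasked || reset_devices;
}
```

With a clean .shutdown handler the relevant bits are left masked, so the
check falls through and no reset is attempted on a plain kexec; a kdump
skips the handler and, under this model, still triggers the reset.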
As noted before, please be advised to go through SUN channels. Upgrade
your Drive(s), SES, Motherboard and Card Firmware to the latest
versions; and make sure you are using compatible drives and drive bays
to see if this problem dealing with the superfluous reset on your
pre-release system goes away. You will be able to trigger this by trying
to perform a kdump on the system, OR by reverting this patch and running
your kexec test. The superfluous reset has yet to cause an issue with a
released card beyond noticing a superfluous Firmware reset as Vivek has
pointed out.
Sincerely -- Mark Salyzyn
From: Yinghai Lu [mailto:yhlu.kernel@gmail.com] sez:
> On 6/1/07, Vivek Goyal <vgoyal@in.ibm.com> wrote:
> > Thanks Mark. This does fix the issue of unnecessary reset of aacraid
> > adapter over kexec on my machine.
> I'm a little confused about that.
> This patch does a clean shutdown, so even with the tighter condition
> the next start will not try to reset the adapter fw. Right, Mark?
> Maybe the driver could be smart enough to find out if it needs to
> reset the Adaptec fw.
>
> YH
* [PATCH] aacraid: add SCSI SYNCHONIZE_CACHE range checking.
2007-05-30 15:59 ` [PATCH] aacraid: fix shutdown handler to also disable interrupts Salyzyn, Mark
2007-05-30 17:36 ` Yinghai Lu
2007-06-01 11:08 ` Vivek Goyal
@ 2007-06-07 17:21 ` Salyzyn, Mark
2007-06-11 20:17 ` [PATCH] aacraid: probe related code cleanup Salyzyn, Mark
2007-06-20 15:30 ` [PATCH] aacraid: add SCSI SYNCHONIZE_CACHE range checking (take 2) Salyzyn, Mark
2 siblings, 2 replies; 28+ messages in thread
From: Salyzyn, Mark @ 2007-06-07 17:21 UTC (permalink / raw)
To: linux-scsi
[-- Attachment #1: Type: text/plain, Size: 977 bytes --]
A customer running an application that issues SYNCHRONIZE_CACHE calls
directly noticed the broad stroke of the current implementation in the
aacraid driver: with multiple applications feeding I/O to the storage,
the issuing application stalled for long periods of time. The driver
now waits only for outstanding WRITE commands, rather than all commands,
and only for those that are in range of the SYNCHRONIZE_CACHE call
(which associates them more tightly with the issuing application),
before telling the Firmware to flush its dirty cache. This reduced the
stalling. The Firmware itself still flushes all the dirty cache
associated with the array, ignoring the range; it just does so in a more
timely manner.
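As a rough sketch of the range test described above (not the patch
itself), the following user-space C decodes the LBA and block count from
a 10-byte CDB and decides whether an outstanding write falls inside the
SYNCHRONIZE_CACHE range. The byte offsets and the comparison mirror the
attached patch; the function names are hypothetical, and only the
10-byte layout is shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode LBA and block count from a 10-byte CDB. WRITE_10 and
 * SYNCHRONIZE_CACHE share this layout: LBA in bytes 2..5, count in
 * bytes 7..8, both big-endian. */
static void decode_cdb10(const uint8_t *cdb, uint64_t *lba, uint32_t *count)
{
	*lba = ((uint64_t)cdb[2] << 24) | ((uint64_t)cdb[3] << 16) |
	       ((uint64_t)cdb[4] << 8) | cdb[5];
	*count = ((uint32_t)cdb[7] << 8) | cdb[8];
}

/* Returns true when the write must be waited for. A sync count of zero
 * means "from sync_lba to the end of the medium". The strict '<'
 * comparisons follow the patch, so a write that merely touches the
 * edge of the range is conservatively treated as in range. */
static bool write_in_sync_range(uint64_t sync_lba, uint32_t sync_count,
				uint64_t wr_lba, uint32_t wr_count)
{
	if (wr_lba + wr_count < sync_lba)
		return false;	/* write ends before the range starts */
	if (sync_count && sync_lba + sync_count < wr_lba)
		return false;	/* write starts after the range ends */
	return true;
}
```

Writes outside the range, and all non-WRITE commands, are skipped, which
is what keeps unrelated I/O from stalling the caller.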
This attached patch is against current scsi-misc-2.6
ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patch attachments.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Sincerely -- Mark Salyzyn
[-- Attachment #2: aacraid_synch_range.patch --]
[-- Type: application/octet-stream, Size: 2510 bytes --]
diff -ru a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c
--- a/drivers/scsi/aacraid/aachba.c 2007-06-07 12:52:44.951750334 -0400
+++ b/drivers/scsi/aacraid/aachba.c 2007-06-07 13:04:34.564189741 -0400
@@ -1587,7 +1587,7 @@
COMMAND_COMPLETE << 8 | SAM_STAT_GOOD;
else {
struct scsi_device *sdev = cmd->device;
- struct aac_dev *dev = (struct aac_dev *)sdev->host->hostdata;
+ struct aac_dev *dev = fibptr->dev;
u32 cid = sdev_id(sdev);
printk(KERN_WARNING
"synchronize_callback: synchronize failed, status = %d\n",
@@ -1618,6 +1618,9 @@
struct scsi_device *sdev = scsicmd->device;
int active = 0;
struct aac_dev *aac;
+ u64 lba = ((u64)scsicmd->cmnd[2] << 24) | (scsicmd->cmnd[3] << 16) |
+ (scsicmd->cmnd[4] << 8) | scsicmd->cmnd[5];
+ u32 count = (scsicmd->cmnd[7] << 8) | scsicmd->cmnd[8];
unsigned long flags;
/*
@@ -1626,11 +1629,54 @@
*/
spin_lock_irqsave(&sdev->list_lock, flags);
list_for_each_entry(cmd, &sdev->cmd_list, list)
- if (cmd != scsicmd && cmd->SCp.phase == AAC_OWNER_FIRMWARE) {
+ if (cmd->SCp.phase == AAC_OWNER_FIRMWARE) {
+ u64 cmnd_lba;
+ u32 cmnd_count;
+
+ if (cmd->cmnd[0] == WRITE_6) {
+ cmnd_lba = ((cmd->cmnd[1] & 0x1F) << 16) |
+ (cmd->cmnd[2] << 8) |
+ cmd->cmnd[3];
+ cmnd_count = cmd->cmnd[4];
+ if (cmnd_count == 0)
+ cmnd_count = 256;
+ } else if (cmd->cmnd[0] == WRITE_16) {
+ cmnd_lba = ((u64)cmd->cmnd[2] << 56) |
+ ((u64)cmd->cmnd[3] << 48) |
+ ((u64)cmd->cmnd[4] << 40) |
+ ((u64)cmd->cmnd[5] << 32) |
+ ((u64)cmd->cmnd[6] << 24) |
+ (cmd->cmnd[7] << 16) |
+ (cmd->cmnd[8] << 8) |
+ cmd->cmnd[9];
+ cmnd_count = (cmd->cmnd[10] << 24) |
+ (cmd->cmnd[11] << 16) |
+ (cmd->cmnd[12] << 8) |
+ cmd->cmnd[13];
+ } else if (cmd->cmnd[0] == WRITE_12) {
+ cmnd_lba = ((u64)cmd->cmnd[2] << 24) |
+ (cmd->cmnd[3] << 16) |
+ (cmd->cmnd[4] << 8) |
+ cmd->cmnd[5];
+ cmnd_count = (cmd->cmnd[6] << 24) |
+ (cmd->cmnd[7] << 16) |
+ (cmd->cmnd[8] << 8) |
+ cmd->cmnd[9];
+ } else if (cmd->cmnd[0] == WRITE_10) {
+ cmnd_lba = ((u64)cmd->cmnd[2] << 24) |
+ (cmd->cmnd[3] << 16) |
+ (cmd->cmnd[4] << 8) |
+ cmd->cmnd[5];
+ cmnd_count = (cmd->cmnd[7] << 8) |
+ cmd->cmnd[8];
+ } else
+ continue;
+ if (((cmnd_lba + cmnd_count) < lba) ||
+ (count && ((lba + count) < cmnd_lba)))
+ continue;
++active;
break;
}
-
spin_unlock_irqrestore(&sdev->list_lock, flags);
/*
* [PATCH] aacraid: probe related code cleanup
2007-06-07 17:21 ` [PATCH] aacraid: add SCSI SYNCHONIZE_CACHE range checking Salyzyn, Mark
@ 2007-06-11 20:17 ` Salyzyn, Mark
2007-06-20 15:30 ` [PATCH] aacraid: add SCSI SYNCHONIZE_CACHE range checking (take 2) Salyzyn, Mark
1 sibling, 0 replies; 28+ messages in thread
From: Salyzyn, Mark @ 2007-06-11 20:17 UTC (permalink / raw)
To: linux-scsi
[-- Attachment #1: Type: text/plain, Size: 1197 bytes --]
Sundry cleanups:
1) Use kzalloc instead of kmalloc.
2) Make sure probe worked before recalling the SCSI command to finalize
processing.
3) _aac_probe_container2 and _aac_probe_container1 return value goes
unused, change return to void.
4) Use a lower depth pointer reference to pick up the driver instance
variable.
5) Although effectively unused except to fake for scsicmd validity, set
the scsi_done in probe code to aac_probe_container_callback1 instead of
the less valid dummy reference to _aac_probe_container1.
6) SCp.phase is set in aac_valid_context, drop setting up this value in
caller when unnecessary.
7) Take the container target id at the beginning, rather than
referencing scmd_id() each time to pick it up.
There should be no side effects or functionality changes.
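To illustrate item 1 outside the kernel, here is a user-space analogy
using malloc()/memset() versus calloc(); the driver itself uses
kmalloc() and kzalloc(). The struct and function names are hypothetical
and exist only for this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-container record. */
struct container_info {
	int valid;
	unsigned long long size;
};

/* Before: allocate, then clear in a separate step. */
static struct container_info *alloc_then_clear(size_t n)
{
	struct container_info *p = malloc(n * sizeof(struct container_info));

	if (!p)
		return NULL;
	memset(p, 0, n * sizeof(struct container_info));
	return p;
}

/* After: a single call hands back memory that is already zeroed. */
static struct container_info *alloc_zeroed(size_t n)
{
	return calloc(n, sizeof(struct container_info));
}
```

Both variants return the same zero-filled array; the second simply
removes a step that can be forgotten or sized incorrectly.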
This attached patch is against current scsi-misc-2.6, scsi-rc-fixes-2.6
& scsi-pending-2.6
ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patch attachments.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
drivers/scsi/aacraid/aachba.c | 64 ++++++++++++++++++++++++++++++---------------------------------
1 file changed, 31 insertions(+), 33 deletions(-)
[-- Attachment #2: aacraid_probe_cleanup.patch --]
[-- Type: application/octet-stream, Size: 6423 bytes --]
diff -ru a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c
--- a/drivers/scsi/aacraid/aachba.c 2007-06-11 15:15:38.908828462 -0400
+++ b/drivers/scsi/aacraid/aachba.c 2007-06-11 15:51:30.352826577 -0400
@@ -312,11 +312,10 @@
if (maximum_num_containers < MAXIMUM_NUM_CONTAINERS)
maximum_num_containers = MAXIMUM_NUM_CONTAINERS;
- fsa_dev_ptr = kmalloc(sizeof(*fsa_dev_ptr) * maximum_num_containers,
+ fsa_dev_ptr = kzalloc(sizeof(*fsa_dev_ptr) * maximum_num_containers,
GFP_KERNEL);
if (!fsa_dev_ptr)
return -ENOMEM;
- memset(fsa_dev_ptr, 0, sizeof(*fsa_dev_ptr) * maximum_num_containers);
dev->fsa_dev = fsa_dev_ptr;
dev->maximum_num_containers = maximum_num_containers;
@@ -446,7 +445,7 @@
{
struct fsa_dev_info *fsa_dev_ptr = ((struct aac_dev *)(scsicmd->device->host->hostdata))->fsa_dev;
- if (fsa_dev_ptr[scmd_id(scsicmd)].valid)
+ if ((fsa_dev_ptr[scmd_id(scsicmd)].valid & 1))
return aac_scsi_cmd(scsicmd);
scsicmd->result = DID_NO_CONNECT << 16;
@@ -454,18 +453,18 @@
return 0;
}
-static int _aac_probe_container2(void * context, struct fib * fibptr)
+static void _aac_probe_container2(void * context, struct fib * fibptr)
{
struct fsa_dev_info *fsa_dev_ptr;
int (*callback)(struct scsi_cmnd *);
struct scsi_cmnd * scsicmd = (struct scsi_cmnd *)context;
- if (!aac_valid_context(scsicmd, fibptr))
- return 0;
- fsa_dev_ptr = ((struct aac_dev *)(scsicmd->device->host->hostdata))->fsa_dev;
+ if (!aac_valid_context(scsicmd, fibptr))
+ return;
scsicmd->SCp.Status = 0;
+ fsa_dev_ptr = fibptr->dev->fsa_dev;
if (fsa_dev_ptr) {
struct aac_mount * dresp = (struct aac_mount *) fib_data(fibptr);
fsa_dev_ptr += scmd_id(scsicmd);
@@ -488,10 +487,11 @@
aac_fib_free(fibptr);
callback = (int (*)(struct scsi_cmnd *))(scsicmd->SCp.ptr);
scsicmd->SCp.ptr = NULL;
- return (*callback)(scsicmd);
+ (*callback)(scsicmd);
+ return;
}
-static int _aac_probe_container1(void * context, struct fib * fibptr)
+static void _aac_probe_container1(void * context, struct fib * fibptr)
{
struct scsi_cmnd * scsicmd;
struct aac_mount * dresp;
@@ -501,13 +501,14 @@
dresp = (struct aac_mount *) fib_data(fibptr);
dresp->mnt[0].capacityhigh = 0;
if ((le32_to_cpu(dresp->status) != ST_OK) ||
- (le32_to_cpu(dresp->mnt[0].vol) != CT_NONE))
- return _aac_probe_container2(context, fibptr);
+ (le32_to_cpu(dresp->mnt[0].vol) != CT_NONE)) {
+ _aac_probe_container2(context, fibptr);
+ return;
+ }
scsicmd = (struct scsi_cmnd *) context;
- scsicmd->SCp.phase = AAC_OWNER_MIDLEVEL;
if (!aac_valid_context(scsicmd, fibptr))
- return 0;
+ return;
aac_fib_init(fibptr);
@@ -522,21 +523,18 @@
sizeof(struct aac_query_mount),
FsaNormal,
0, 1,
- (fib_callback) _aac_probe_container2,
+ _aac_probe_container2,
(void *) scsicmd);
/*
* Check that the command queued to the controller
*/
- if (status == -EINPROGRESS) {
+ if (status == -EINPROGRESS)
scsicmd->SCp.phase = AAC_OWNER_FIRMWARE;
- return 0;
- }
- if (status < 0) {
+ else if (status < 0) {
/* Inherit results from VM_NameServe, if any */
dresp->status = cpu_to_le32(ST_OK);
- return _aac_probe_container2(context, fibptr);
+ _aac_probe_container2(context, fibptr);
}
- return 0;
}
static int _aac_probe_container(struct scsi_cmnd * scsicmd, int (*callback)(struct scsi_cmnd *))
@@ -561,7 +559,7 @@
sizeof(struct aac_query_mount),
FsaNormal,
0, 1,
- (fib_callback) _aac_probe_container1,
+ _aac_probe_container1,
(void *) scsicmd);
/*
* Check that the command queued to the controller
@@ -615,7 +613,7 @@
return -ENOMEM;
}
scsicmd->list.next = NULL;
- scsicmd->scsi_done = (void (*)(struct scsi_cmnd*))_aac_probe_container1;
+ scsicmd->scsi_done = (void (*)(struct scsi_cmnd*))aac_probe_container_callback1;
scsicmd->device = scsidev;
scsidev->sdev_state = 0;
@@ -1329,7 +1327,7 @@
if (!aac_valid_context(scsicmd, fibptr))
return;
- dev = (struct aac_dev *)scsicmd->device->host->hostdata;
+ dev = fibptr->dev;
cid = scmd_id(scsicmd);
if (nblank(dprintk(x))) {
@@ -1587,7 +1585,7 @@
COMMAND_COMPLETE << 8 | SAM_STAT_GOOD;
else {
struct scsi_device *sdev = cmd->device;
- struct aac_dev *dev = (struct aac_dev *)sdev->host->hostdata;
+ struct aac_dev *dev = fibptr->dev;
u32 cid = sdev_id(sdev);
printk(KERN_WARNING
"synchronize_callback: synchronize failed, status = %d\n",
@@ -1694,7 +1692,7 @@
int aac_scsi_cmd(struct scsi_cmnd * scsicmd)
{
- u32 cid = 0;
+ u32 cid;
struct Scsi_Host *host = scsicmd->device->host;
struct aac_dev *dev = (struct aac_dev *)host->hostdata;
struct fsa_dev_info *fsa_dev_ptr = dev->fsa_dev;
@@ -1706,15 +1704,15 @@
* Test does not apply to ID 16, the pseudo id for the controller
* itself.
*/
- if (scmd_id(scsicmd) != host->this_id) {
- if ((scmd_channel(scsicmd) == CONTAINER_CHANNEL)) {
- if((scmd_id(scsicmd) >= dev->maximum_num_containers) ||
+ cid = scmd_id(scsicmd);
+ if (cid != host->this_id) {
+ if (scmd_channel(scsicmd) == CONTAINER_CHANNEL) {
+ if((cid >= dev->maximum_num_containers) ||
(scsicmd->device->lun != 0)) {
scsicmd->result = DID_NO_CONNECT << 16;
scsicmd->scsi_done(scsicmd);
return 0;
}
- cid = scmd_id(scsicmd);
/*
* If the target container doesn't exist, it may have
@@ -1777,7 +1775,7 @@
{
struct inquiry_data inq_data;
- dprintk((KERN_DEBUG "INQUIRY command, ID: %d.\n", scmd_id(scsicmd)));
+ dprintk((KERN_DEBUG "INQUIRY command, ID: %d.\n", cid));
memset(&inq_data, 0, sizeof (struct inquiry_data));
inq_data.inqd_ver = 2; /* claim compliance to SCSI-2 */
@@ -1789,7 +1787,7 @@
* Set the Vendor, Product, and Revision Level
* see: <vendor>.c i.e. aac.c
*/
- if (scmd_id(scsicmd) == host->this_id) {
+ if (cid == host->this_id) {
setinqstr(dev, (void *) (inq_data.inqd_vid), ARRAY_SIZE(container_types));
inq_data.inqd_pdt = INQD_PDT_PROC; /* Processor device */
aac_internal_transfer(scsicmd, &inq_data, 0, sizeof(inq_data));
@@ -2160,10 +2158,10 @@
if (!aac_valid_context(scsicmd, fibptr))
return;
- dev = (struct aac_dev *)scsicmd->device->host->hostdata;
-
BUG_ON(fibptr == NULL);
+ dev = fibptr->dev;
+
srbreply = (struct aac_srb_reply *) fib_data(fibptr);
scsicmd->sense_buffer[0] = '\0'; /* Initialize sense valid flag to false */
* [PATCH] aacraid: add SCSI SYNCHONIZE_CACHE range checking (take 2)
2007-06-07 17:21 ` [PATCH] aacraid: add SCSI SYNCHONIZE_CACHE range checking Salyzyn, Mark
2007-06-11 20:17 ` [PATCH] aacraid: probe related code cleanup Salyzyn, Mark
@ 2007-06-20 15:30 ` Salyzyn, Mark
2007-07-09 13:57 ` [PATCH] aacraid: add 51245, 51645 and 52245 adapters to documentation Salyzyn, Mark
1 sibling, 1 reply; 28+ messages in thread
From: Salyzyn, Mark @ 2007-06-20 15:30 UTC (permalink / raw)
To: linux-scsi
[-- Attachment #1: Type: text/plain, Size: 1852 bytes --]
There was some overlap with another patch (?); this one has not shown
up in scsi-pending-2.6. Modernized it to apply cleanly and did some
extra cleanup.
This attached patch is against current scsi-misc-2.6
ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patch attachments.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
drivers/scsi/aacraid/aachba.c | 63 ++++++++++++++++++++++++++++++++++++------
1 file changed, 55 insertions(+), 8 deletions(-)
Sincerely -- Mark Salyzyn
> -----Original Message-----
> From: linux-scsi-owner@vger.kernel.org
> [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Salyzyn, Mark
> Sent: Thursday, June 07, 2007 1:21 PM
> To: linux-scsi@vger.kernel.org
> Subject: [PATCH] aacraid: add SCSI SYNCHONIZE_CACHE range checking.
>
>
> A customer running an application that issues SYNCHRONIZE_CACHE calls
> directly noticed the broad stroke of the current implementation in the
> aacraid driver: with multiple applications feeding I/O to the storage,
> the issuing application stalled for long periods of time. The driver
> now waits only for outstanding WRITE commands, rather than all
> commands, and only for those that are in range of the
> SYNCHRONIZE_CACHE call (which associates them more tightly with the
> issuing application), before telling the Firmware to flush its dirty
> cache. This reduced the stalling. The Firmware itself still flushes
> all the dirty cache associated with the array, ignoring the range; it
> just does so in a more timely manner.
>
> This attached patch is against current scsi-misc-2.6
>
> ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
> handling of patch attachments.
>
> Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
>
> Sincerely -- Mark Salyzyn
>
[-- Attachment #2: aacraid_synch_range2.patch --]
[-- Type: application/octet-stream, Size: 3893 bytes --]
diff -ru a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c
--- a/drivers/scsi/aacraid/aachba.c 2007-06-20 11:05:47.673609233 -0400
+++ b/drivers/scsi/aacraid/aachba.c 2007-06-20 11:21:33.655053285 -0400
@@ -1595,23 +1595,23 @@
if (!aac_valid_context(cmd, fibptr))
return;
- dprintk((KERN_DEBUG "synchronize_callback[cpu %d]: t = %ld.\n",
+ dprintk((KERN_DEBUG "synchronize_callback[cpu %d]: t = %ld.\n",
smp_processor_id(), jiffies));
BUG_ON(fibptr == NULL);
synchronizereply = fib_data(fibptr);
if (le32_to_cpu(synchronizereply->status) == CT_OK)
- cmd->result = DID_OK << 16 |
+ cmd->result = DID_OK << 16 |
COMMAND_COMPLETE << 8 | SAM_STAT_GOOD;
else {
struct scsi_device *sdev = cmd->device;
struct aac_dev *dev = fibptr->dev;
u32 cid = sdev_id(sdev);
- printk(KERN_WARNING
+ printk(KERN_WARNING
"synchronize_callback: synchronize failed, status = %d\n",
le32_to_cpu(synchronizereply->status));
- cmd->result = DID_OK << 16 |
+ cmd->result = DID_OK << 16 |
COMMAND_COMPLETE << 8 | SAM_STAT_CHECK_CONDITION;
set_sense((u8 *)&dev->fsa_dev[cid].sense_data,
HARDWARE_ERROR,
@@ -1619,7 +1619,7 @@
ASENCODE_INTERNAL_TARGET_FAILURE, 0, 0,
0, 0);
memcpy(cmd->sense_buffer, &dev->fsa_dev[cid].sense_data,
- min(sizeof(dev->fsa_dev[cid].sense_data),
+ min(sizeof(dev->fsa_dev[cid].sense_data),
sizeof(cmd->sense_buffer)));
}
@@ -1637,6 +1637,9 @@
struct scsi_device *sdev = scsicmd->device;
int active = 0;
struct aac_dev *aac;
+ u64 lba = ((u64)scsicmd->cmnd[2] << 24) | (scsicmd->cmnd[3] << 16) |
+ (scsicmd->cmnd[4] << 8) | scsicmd->cmnd[5];
+ u32 count = (scsicmd->cmnd[7] << 8) | scsicmd->cmnd[8];
unsigned long flags;
/*
@@ -1645,7 +1648,51 @@
*/
spin_lock_irqsave(&sdev->list_lock, flags);
list_for_each_entry(cmd, &sdev->cmd_list, list)
- if (cmd != scsicmd && cmd->SCp.phase == AAC_OWNER_FIRMWARE) {
+ if (cmd->SCp.phase == AAC_OWNER_FIRMWARE) {
+ u64 cmnd_lba;
+ u32 cmnd_count;
+
+ if (cmd->cmnd[0] == WRITE_6) {
+ cmnd_lba = ((cmd->cmnd[1] & 0x1F) << 16) |
+ (cmd->cmnd[2] << 8) |
+ cmd->cmnd[3];
+ cmnd_count = cmd->cmnd[4];
+ if (cmnd_count == 0)
+ cmnd_count = 256;
+ } else if (cmd->cmnd[0] == WRITE_16) {
+ cmnd_lba = ((u64)cmd->cmnd[2] << 56) |
+ ((u64)cmd->cmnd[3] << 48) |
+ ((u64)cmd->cmnd[4] << 40) |
+ ((u64)cmd->cmnd[5] << 32) |
+ ((u64)cmd->cmnd[6] << 24) |
+ (cmd->cmnd[7] << 16) |
+ (cmd->cmnd[8] << 8) |
+ cmd->cmnd[9];
+ cmnd_count = (cmd->cmnd[10] << 24) |
+ (cmd->cmnd[11] << 16) |
+ (cmd->cmnd[12] << 8) |
+ cmd->cmnd[13];
+ } else if (cmd->cmnd[0] == WRITE_12) {
+ cmnd_lba = ((u64)cmd->cmnd[2] << 24) |
+ (cmd->cmnd[3] << 16) |
+ (cmd->cmnd[4] << 8) |
+ cmd->cmnd[5];
+ cmnd_count = (cmd->cmnd[6] << 24) |
+ (cmd->cmnd[7] << 16) |
+ (cmd->cmnd[8] << 8) |
+ cmd->cmnd[9];
+ } else if (cmd->cmnd[0] == WRITE_10) {
+ cmnd_lba = ((u64)cmd->cmnd[2] << 24) |
+ (cmd->cmnd[3] << 16) |
+ (cmd->cmnd[4] << 8) |
+ cmd->cmnd[5];
+ cmnd_count = (cmd->cmnd[7] << 8) |
+ cmd->cmnd[8];
+ } else
+ continue;
+ if (((cmnd_lba + cmnd_count) < lba) ||
+ (count && ((lba + count) < cmnd_lba)))
+ continue;
++active;
break;
}
@@ -1674,7 +1721,7 @@
synchronizecmd->command = cpu_to_le32(VM_ContainerConfig);
synchronizecmd->type = cpu_to_le32(CT_FLUSH_CACHE);
synchronizecmd->cid = cpu_to_le32(scmd_id(scsicmd));
- synchronizecmd->count =
+ synchronizecmd->count =
cpu_to_le32(sizeof(((struct aac_synchronize_reply *)NULL)->data));
/*
@@ -1696,7 +1743,7 @@
return 0;
}
- printk(KERN_WARNING
+ printk(KERN_WARNING
"aac_synchronize: aac_fib_send failed with status: %d.\n", status);
aac_fib_complete(cmd_fibcontext);
aac_fib_free(cmd_fibcontext);
* [PATCH] aacraid: add 51245, 51645 and 52245 adapters to documentation.
2007-06-20 15:30 ` [PATCH] aacraid: add SCSI SYNCHONIZE_CACHE range checking (take 2) Salyzyn, Mark
@ 2007-07-09 13:57 ` Salyzyn, Mark
2007-07-23 14:13 ` [PATCH] aacraid: sysfs adapter reset/status format change Salyzyn, Mark
0 siblings, 1 reply; 28+ messages in thread
From: Salyzyn, Mark @ 2007-07-09 13:57 UTC (permalink / raw)
To: linux-scsi
[-- Attachment #1: Type: text/plain, Size: 458 bytes --]
Adding Adaptec 51245 (16 port), 51645 (20 port) and 52445 (28 port)
Universal Serial RAID controllers to the aacraid documentation.
This attached patch is against current scsi-misc-2.6
ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patch attachments.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Documentation/scsi/aacraid.txt | 3 +++
1 file changed, 3 insertions(+)
Sincerely -- Mark Salyzyn
[-- Attachment #2: aacraid_voodoo244.patch --]
[-- Type: application/octet-stream, Size: 673 bytes --]
diff -ru a/Documentation/scsi/aacraid.txt b/Documentation/scsi/aacraid.txt
--- a/Documentation/scsi/aacraid.txt 2007-07-09 09:38:47.319012381 -0400
+++ b/Documentation/scsi/aacraid.txt 2007-07-09 09:47:03.383207866 -0400
@@ -50,6 +50,9 @@
9005:0285:9005:02be Adaptec 31605 (Marauder160)
9005:0285:9005:02c3 Adaptec 51205 (Voodoo120)
9005:0285:9005:02c4 Adaptec 51605 (Voodoo160)
+ 9005:0285:9005:02ce Adaptec 51245 (Voodoo124)
+ 9005:0285:9005:02cf Adaptec 51645 (Voodoo164)
+ 9005:0285:9005:02d0 Adaptec 52445 (Voodoo244)
1011:0046:9005:0364 Adaptec 5400S (Mustang)
9005:0287:9005:0800 Adaptec Themisto (Jupiter)
9005:0200:9005:0200 Adaptec Themisto (Jupiter)
* [PATCH] aacraid: sysfs adapter reset/status format change.
2007-07-09 13:57 ` [PATCH] aacraid: add 51245, 51645 and 52245 adapters to documentation Salyzyn, Mark
@ 2007-07-23 14:13 ` Salyzyn, Mark
2007-07-26 18:20 ` [PATCH 1/1] aacraid: draw line in sand, sundry cleanup and version update Salyzyn, Mark
0 siblings, 1 reply; 28+ messages in thread
From: Salyzyn, Mark @ 2007-07-23 14:13 UTC (permalink / raw)
To: linux-scsi
[-- Attachment #1: Type: text/plain, Size: 483 bytes --]
Responses from nodes within the sysfs tree need to be newline
terminated; the Adapter status value reported by the reset adapter node
is adjusted accordingly.
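A minimal user-space sketch of the formatting change, assuming a
hypothetical show_status() helper in place of the driver's sysfs show
routine; only the trailing newline in the format string reflects the
attached patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a sysfs "show" routine: formats the status
 * into buf and returns the number of characters written. */
static int show_status(char *buf, size_t size, int status)
{
	/* The trailing newline lets "cat" on the node end cleanly and
	 * lets scripts read the value as a whole line. */
	return snprintf(buf, size, "0x%x\n", (unsigned int)status);
}
```

Without the newline, a shell prompt would be printed directly after the
value when the node is read with cat.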
This attached patch is against current scsi-misc-2.6
ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patch attachments.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
drivers/scsi/aacraid/linit.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Sincerely -- Mark Salyzyn
[-- Attachment #2: aacraid_adapter_status_format_change.patch --]
[-- Type: application/octet-stream, Size: 439 bytes --]
diff -ru a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
--- a/drivers/scsi/aacraid/linit.c 2007-07-23 09:53:06.852929239 -0400
+++ b/drivers/scsi/aacraid/linit.c 2007-07-23 10:08:10.347390939 -0400
@@ -822,7 +822,7 @@
tmp = aac_adapter_check_health(dev);
if ((tmp == 0) && dev->in_reset)
tmp = -EBUSY;
- len = snprintf(buf, PAGE_SIZE, "0x%x", tmp);
+ len = snprintf(buf, PAGE_SIZE, "0x%x\n", tmp);
return len;
}
* [PATCH 1/1] aacraid: draw line in sand, sundry cleanup and version update
2007-07-23 14:13 ` [PATCH] aacraid: sysfs adapter reset/status format change Salyzyn, Mark
@ 2007-07-26 18:20 ` Salyzyn, Mark
2007-07-27 14:29 ` [PATCH 1/1] aacraid: fix Sunrise Lake reset handling Salyzyn, Mark
0 siblings, 1 reply; 28+ messages in thread
From: Salyzyn, Mark @ 2007-07-26 18:20 UTC (permalink / raw)
To: linux-scsi
[-- Attachment #1: Type: text/plain, Size: 741 bytes --]
Minor, unimportant cuttings from the floor, bundled in with a version
stamp update. The only controversial change is dropping the Alan Cox
copyright from the nark.c module, since that file contains no code
written by him.
This attached patch is against current scsi-misc-2.6
ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patch attachments.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
drivers/scsi/aacraid/aachba.c | 3 +--
drivers/scsi/aacraid/aacraid.h | 6 +++---
drivers/scsi/aacraid/linit.c | 3 +--
drivers/scsi/aacraid/nark.c | 3 +--
drivers/scsi/aacraid/rkt.c | 2 +-
5 files changed, 7 insertions(+), 10 deletions(-)
Sincerely -- Mark Salyzyn
[-- Attachment #2: aacraid_cleanup_2449.patch --]
[-- Type: application/octet-stream, Size: 3267 bytes --]
diff -ru a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c
--- a/drivers/scsi/aacraid/aachba.c 2007-07-26 13:28:32.179279220 -0400
+++ b/drivers/scsi/aacraid/aachba.c 2007-07-26 14:11:31.762916390 -0400
@@ -194,8 +194,7 @@
struct scsi_device *device;
if (unlikely(!scsicmd || !scsicmd->scsi_done )) {
- dprintk((KERN_WARNING "aac_valid_context: scsi command corrupt\n"))
-;
+ dprintk((KERN_WARNING "aac_valid_context: scsi command corrupt\n"));
aac_fib_complete(fibptr);
aac_fib_free(fibptr);
return 0;
diff -ru a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
--- a/drivers/scsi/aacraid/aacraid.h 2007-07-26 13:28:32.180279094 -0400
+++ b/drivers/scsi/aacraid/aacraid.h 2007-07-26 14:11:31.770915383 -0400
@@ -12,7 +12,7 @@
*----------------------------------------------------------------------------*/
#ifndef AAC_DRIVER_BUILD
-# define AAC_DRIVER_BUILD 2447
+# define AAC_DRIVER_BUILD 2449
# define AAC_DRIVER_BRANCH "-ms"
#endif
#define MAXIMUM_NUM_CONTAINERS 32
@@ -1807,10 +1807,10 @@
* accounting for the fact capacity could be a 64 bit value
*
*/
-static inline u32 cap_to_cyls(sector_t capacity, u32 divisor)
+static inline unsigned int cap_to_cyls(sector_t capacity, unsigned divisor)
{
sector_div(capacity, divisor);
- return (u32)capacity;
+ return capacity;
}
/* SCp.phase values */
diff -ru a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
--- a/drivers/scsi/aacraid/linit.c 2007-07-26 13:28:32.183278715 -0400
+++ b/drivers/scsi/aacraid/linit.c 2007-07-26 14:11:31.772915132 -0400
@@ -1122,9 +1122,8 @@
static void aac_shutdown(struct pci_dev *dev)
{
struct Scsi_Host *shost = pci_get_drvdata(dev);
- struct aac_dev *aac = (struct aac_dev *)shost->hostdata;
scsi_block_requests(shost);
- __aac_shutdown(aac);
+ __aac_shutdown((struct aac_dev *)shost->hostdata);
}
static void __devexit aac_remove_one(struct pci_dev *pdev)
diff -ru a/drivers/scsi/aacraid/nark.c b/drivers/scsi/aacraid/nark.c
--- a/drivers/scsi/aacraid/nark.c 2007-07-26 13:28:32.184278589 -0400
+++ b/drivers/scsi/aacraid/nark.c 2007-07-26 14:11:31.772915132 -0400
@@ -1,11 +1,10 @@
/*
* Adaptec AAC series RAID controller driver
- * (c) Copyright 2001 Red Hat Inc. <alan@redhat.com>
*
* based on the old aacraid driver that is..
* Adaptec aacraid device driver for Linux.
*
- * Copyright (c) 2000 Adaptec, Inc. (aacraid@adaptec.com)
+ * Copyright (c) 2006-2007 Adaptec, Inc. (aacraid@adaptec.com)
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
diff -ru a/drivers/scsi/aacraid/rkt.c b/drivers/scsi/aacraid/rkt.c
--- a/drivers/scsi/aacraid/rkt.c 2007-07-26 13:28:32.184278589 -0400
+++ b/drivers/scsi/aacraid/rkt.c 2007-07-26 14:11:31.780914125 -0400
@@ -5,7 +5,7 @@
* based on the old aacraid driver that is..
* Adaptec aacraid device driver for Linux.
*
- * Copyright (c) 2000 Adaptec, Inc. (aacraid@adaptec.com)
+ * Copyright (c) 2000-2007 Adaptec, Inc. (aacraid@adaptec.com)
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* [PATCH 1/1] aacraid: fix Sunrise Lake reset handling
2007-07-26 18:20 ` [PATCH 1/1] aacraid: draw line in sand, sundry cleanup and version update Salyzyn, Mark
@ 2007-07-27 14:29 ` Salyzyn, Mark
2007-08-02 19:38 ` [PATCH 1/1] aacraid: prevent panic on adapter resource failure Salyzyn, Mark
0 siblings, 1 reply; 28+ messages in thread
From: Salyzyn, Mark @ 2007-07-27 14:29 UTC (permalink / raw)
To: linux-scsi, Linux Kernel Mailing List
Cc: Yinghai Lu, Vivek Goyal, Eric W. Biederman
[-- Attachment #1: Type: text/plain, Size: 3750 bytes --]
The patch is *much* smaller than the description. I am attempting to
answer those who want to understand an issue that was reported in
May this year.
If a Sunrise Lake based card that requires an alternate reset mechanism
is set up to ignore the commanded IOP_RESET, it reports 0x00000010
(IOP_RESET ignored) instead of 0x3803000F (use alternate reset mechanism
to reset all cores). The reset platform function then switches to
IOP_RESET_ALWAYS, because its parameters indicate that we *need* to
reset the card. IOP_RESET_ALWAYS responds with the 0x3803000F return
code, but alas we treat this as an error instead of using the alternate
reset mechanism (put a 0x03 into the register at offset 0x38). The reset
fails, yet issuing IOP_RESET_ALWAYS has already put the card into a
purposeful shutdown state in preparation for the alternate hardware
reset to be applied. Yuck.
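The status handling described above can be sketched in plain C. This is
only an illustration under stated assumptions, not the driver's code:
the enum, the function names and the bit-field decoding of 0x3803000F
(offset in the top byte, value in the next byte) are hypothetical and
inferred solely from the wording "put a 0x03 into the register offset
0x38". The three status words are the ones quoted in this message and
in the attached diff.

```c
#include <stdint.h>

/* Status words quoted in the description and in the attached diff. */
#define RESET_OK            0x00000001u /* value the existing check accepts */
#define RESET_IGNORED       0x00000010u /* IOP_RESET ignored */
#define RESET_USE_ALTERNATE 0x3803000Fu /* use alternate reset mechanism */

/* Hypothetical classification, not a type from the aacraid driver. */
enum reset_action {
	RESET_DONE,            /* nothing more to do */
	RESET_RETRY_ALWAYS,    /* fall back to IOP_RESET_ALWAYS */
	RESET_WRITE_ALTERNATE, /* apply the alternate hardware reset */
	RESET_FAILED           /* genuine error */
};

/*
 * Decide what to do with the status word returned by a reset command.
 * always_issued is non-zero once IOP_RESET_ALWAYS has been tried.
 */
enum reset_action classify_reset_status(uint32_t var, int always_issued)
{
	if (var == RESET_OK)
		return RESET_DONE;
	if (var == RESET_USE_ALTERNATE)
		return RESET_WRITE_ALTERNATE; /* the case formerly treated as an error */
	if (var == RESET_IGNORED && !always_issued)
		return RESET_RETRY_ALWAYS;
	return RESET_FAILED;
}

/* Assumed decoding: top byte is the register offset (0x38). */
uint32_t alternate_reset_offset(uint32_t var)
{
	return var >> 24;
}

/* Assumed decoding: next byte is the value to write (0x03). */
uint32_t alternate_reset_value(uint32_t var)
{
	return (var >> 16) & 0xffu;
}
```

The attached patch itself only stops treating 0x3803000F as -EINVAL;
whatever the driver does afterwards lives in code not shown here.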
IOP_RESET is ignored in internal production cards, typically to ensure
that we catch all adapter lockup issues without the driver progressing
further, so this would not appear to be a field issue and thus this
patch was destined to be only in the internal Adaptec source tree.
IOP_RESET_ALWAYS is reserved for
kexec/kdump/FirmwareUpdate/AutomatedTestFrames, so we did not function
as expected in any case. Also, in the past we have had OEMs specifically
request that cards not be resettable after a BlinkLED/FirmwareAssert for
one reason or another. To head off the possibility that the Sunrise
Lake based cards would suffer a similar fate, we propose the enclosed
fix.
Yinghai Lu of SUN had a pre-production card with IOP_RESET disabled when
he reported an issue to the linux kernel list back in May regarding a
kexec problem resulting from this reset being ignored. His fix was to
update the firmware to one that did not ignore IOP_RESET. Previous
kernels did not attempt to reset the adapter, which is why it surfaced
as a regression in his hands.
The current list of aacraid based cards that use Sunrise Lake:
9005:0285:9005:02b5 Adaptec 5445
9005:0285:9005:02b6 Adaptec 5805
9005:0285:9005:02b7 Adaptec 5085
9005:0285:9005:02c3 Adaptec 51205
9005:0285:9005:02c4 Adaptec 51605
9005:0285:9005:02ce Adaptec 51245
9005:0285:9005:02cf Adaptec 51645
9005:0285:9005:02d0 Adaptec 52445
9005:0285:9005:02d1 Adaptec 5405
9005:0285:9005:02b8 ICP ICP5445SL
9005:0285:9005:02b9 ICP ICP5085SL
9005:0285:9005:02ba ICP ICP5805SL
9005:0285:9005:02c5 ICP ICP5125SL
9005:0285:9005:02c6 ICP ICP5165SL
9005:0285:108e:7aac SUN STK RAID REM
9005:0285:108e:0286 SUN STK RAID INT
9005:0285:108e:0287 SUN STK RAID EXT
9005:0285:108e:7aae SUN STK RAID EM
All of these are publicly released with IOP_RESET enabled. So there is
no immediate need for this patch.
This attached patch is against July 11 2007 scsi-misc-2.6, still applies
today.
ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patch attachments.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
/drivers/scsi/aacraid/rx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Sincerely -- Mark Salyzyn
-----Original Message-----
From: Yinghai Lu [mailto:yhlu.kernel@gmail.com]
Sent: Tuesday, May 29, 2007 10:00 PM
To: Andrew Morton; Vivek Goyal; Eric W. Biederman; AACRAID
Cc: Linux Kernel Mailing List
Subject: kexec and aacraid broken
latest tree, can not use kexec to load 2.6.22-rc3 at least.
got:
AAC0: adapter kernel panic'd fffffffd
AAC0: adapter kernel failed to start, init status=0
but can load 2.6.21.3
YH
[-- Attachment #2: aacraid_voodoo_reset.patch --]
[-- Type: application/octet-stream, Size: 507 bytes --]
diff -ru a//drivers/scsi/aacraid/rx.c b//drivers/scsi/aacraid/rx.c
--- a//drivers/scsi/aacraid/rx.c 2007-07-11 11:26:25.091066761 -0400
+++ b//drivers/scsi/aacraid/rx.c 2007-07-11 11:28:31.961859496 -0400
@@ -472,7 +472,7 @@
else {
bled = aac_adapter_sync_cmd(dev, IOP_RESET_ALWAYS,
0, 0, 0, 0, 0, 0, &var, NULL, NULL, NULL, NULL);
- if (!bled && (var != 0x00000001))
+ if (!bled && (var != 0x00000001) && (var != 0x3803000F))
bled = -EINVAL;
}
if (bled && (bled != -ETIMEDOUT))
* [PATCH 1/1] aacraid: prevent panic on adapter resource failure
2007-07-27 14:29 ` [PATCH 1/1] aacraid: fix Sunrise Lake reset handling Salyzyn, Mark
@ 2007-08-02 19:38 ` Salyzyn, Mark
2007-08-07 19:36 ` [PATCH 1/1] aacraid: default timeout for arrays too short Salyzyn, Mark
0 siblings, 1 reply; 28+ messages in thread
From: Salyzyn, Mark @ 2007-08-02 19:38 UTC (permalink / raw)
To: linux-scsi
[-- Attachment #1: Type: text/plain, Size: 652 bytes --]
If the driver fails to allocate the contiguous (DMAable) memory for
system reasons, we fail to load the instance, but then we try to free
the NULL allocation in the cleanup code and get a panic in
pci_free_consistent(). This was reported against an older kernel; I hope
it is still relevant for the latest/greatest.
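The cleanup pattern described above can be sketched in user-space C.
This is a stand-in under stated assumptions: struct comm_area and both
functions are hypothetical, and malloc()/free() merely take the place
of the kernel's DMA allocation calls. Unlike free(), which tolerates a
NULL pointer, the release routine in this report does not, so the
error path has to check that the allocation ever succeeded.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the adapter's communication area. */
struct comm_area {
	void *addr;  /* NULL when the allocation never succeeded */
	size_t size;
};

/*
 * Allocate the area. simulate_failure forces the out-of-memory path.
 * Returns 0 on success and -1 on failure, leaving addr NULL on failure.
 */
int comm_area_alloc(struct comm_area *c, size_t size, int simulate_failure)
{
	c->addr = simulate_failure ? NULL : malloc(size);
	c->size = c->addr ? size : 0;
	return c->addr ? 0 : -1;
}

/*
 * Release the area only if it was actually allocated.
 * Returns 1 if memory was released and 0 if there was nothing to free.
 */
int comm_area_release(struct comm_area *c)
{
	if (!c->addr)
		return 0; /* skip the release call entirely, as the patch does */
	free(c->addr);
	c->addr = NULL;
	c->size = 0;
	return 1;
}
```

Calling comm_area_release() after a failed comm_area_alloc() is then a
harmless no-op, which is the behaviour the patch gives the out_unmap
path.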
This attached patch is against current scsi-misc-2.6.
ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patch attachments.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
drivers/scsi/aacraid/linit.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Sincerely -- Mark Salyzyn
[-- Attachment #2: aacraid_fail_to_load_panic.patch --]
[-- Type: application/octet-stream, Size: 561 bytes --]
diff -ru a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
--- a/drivers/scsi/aacraid/linit.c 2007-08-02 15:30:35.489215671 -0400
+++ b/drivers/scsi/aacraid/linit.c 2007-08-02 15:30:41.567415315 -0400
@@ -1110,7 +1110,9 @@
__aac_shutdown(aac);
out_unmap:
aac_fib_map_free(aac);
- pci_free_consistent(aac->pdev, aac->comm_size, aac->comm_addr, aac->comm_phys);
+ if (aac->comm_addr)
+ pci_free_consistent(aac->pdev, aac->comm_size, aac->comm_addr,
+ aac->comm_phys);
kfree(aac->queues);
aac_adapter_ioremap(aac, 0);
kfree(aac->fibs);
* [PATCH 1/1] aacraid: default timeout for arrays too short
2007-08-02 19:38 ` [PATCH 1/1] aacraid: prevent panic on adapter resource failure Salyzyn, Mark
@ 2007-08-07 19:36 ` Salyzyn, Mark
2007-09-04 16:55 ` [PATCH 1/1] aacraid: Add documentation for new Adaptec, SMC and SUN cards Salyzyn, Mark
0 siblings, 1 reply; 28+ messages in thread
From: Salyzyn, Mark @ 2007-08-07 19:36 UTC (permalink / raw)
To: linux-scsi
[-- Attachment #1: Type: text/plain, Size: 1098 bytes --]
The default SCSI timeout is 30 seconds for a logical device. The aacraid
based controllers currently have a 35 second timeout for the array. We
are bumping up the default SCSI timeout for array devices, which
typically manage many physical disks, to 45 seconds to provide a small
margin that lets the controller do what it is designed for. We have
not observed any bad side effects either way, because the aacraid
timeout handler takes no significant action except to use the quiesced
state to allow all outstanding commands in the controller to complete,
providing a poor man's guarantee of delivery.
This is merely a preferential decision to reduce the number of timeout
reports in the system logs to only the more serious conditions.
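The adjustment is a simple floor on the timeout, which can be sketched
as below. The function name and its parameters are hypothetical; in the
patch the tick rate is the kernel's HZ and the value lives in
sdev->timeout. The sketch only raises a timeout that is too short and
leaves a longer, administrator-chosen value alone.

```c
/*
 * Raise a timeout, expressed in ticks, to at least min_seconds.
 * hz is the number of ticks per second (the kernel's HZ in the patch).
 * A timeout that is already long enough is returned unchanged.
 */
unsigned long raise_timeout_floor(unsigned long timeout_ticks,
				  unsigned long min_seconds,
				  unsigned long hz)
{
	unsigned long min_ticks = min_seconds * hz;

	return timeout_ticks < min_ticks ? min_ticks : timeout_ticks;
}
```

With a tick rate of 250, for instance, the 30 second default of 7500
ticks would become 11250 ticks, while a 60 second setting would stay at
15000 ticks.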
This attached patch is against current scsi-misc-2.6.
ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patch attachments.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
drivers/scsi/aacraid/linit.c | 6 ++++++
1 file changed, 6 insertions(+)
Sincerely -- Mark Salyzyn
[-- Attachment #2: aacraid_array_timeout_too_short.patch --]
[-- Type: application/octet-stream, Size: 607 bytes --]
diff -ru a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
--- a/drivers/scsi/aacraid/linit.c 2007-08-07 14:50:42.087439732 -0400
+++ b/drivers/scsi/aacraid/linit.c 2007-08-07 14:55:54.973530300 -0400
@@ -420,6 +420,12 @@
unsigned num_one = 0;
unsigned depth;
+ /*
+ * Firmware has an individual device recovery time typically
+ * of 35 seconds, give us a margin.
+ */
+ if (sdev->timeout < (45 * HZ))
+ sdev->timeout = 45 * HZ;
__shost_for_each_device(dev, host) {
if (dev->tagged_supported && (dev->type == TYPE_DISK) &&
(sdev_channel(dev) == CONTAINER_CHANNEL))
* [PATCH 1/1] aacraid: Add documentation for new Adaptec, SMC and SUN cards
2007-08-07 19:36 ` [PATCH 1/1] aacraid: default timeout for arrays too short Salyzyn, Mark
@ 2007-09-04 16:55 ` Salyzyn, Mark
0 siblings, 0 replies; 28+ messages in thread
From: Salyzyn, Mark @ 2007-09-04 16:55 UTC (permalink / raw)
To: linux-scsi
[-- Attachment #1: Type: text/plain, Size: 528 bytes --]
Add the SMC LP, SUN EM and Adaptec 5405 cards to the aacraid
documentation list of supported products. These cards are picked up by
the family match, so there are no associated code changes.
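A family match of this kind can be sketched as a table lookup with
wildcards. Everything named here is hypothetical (struct id_entry,
id_table, match_card); the table is not the driver's actual ID table,
and ANY_ID only mirrors the convention of the kernel's PCI_ANY_ID
wildcard. The ID quadruples are taken from the documentation list in
this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define ANY_ID 0xffffffffu /* wildcard, in the spirit of PCI_ANY_ID */

/* Hypothetical table entry: vendor:device:subvendor:subdevice. */
struct id_entry {
	uint32_t vendor, device, subvendor, subdevice;
	const char *name;
};

/* Specific entries first, the catch-all family entry last. */
static const struct id_entry id_table[] = {
	{ 0x9005, 0x0285, 0x9005, 0x02b5, "Adaptec 5445" },
	{ 0x9005, 0x0285, 0x108e, 0x0286, "SUN STK RAID INT" },
	{ 0x9005, 0x0285, ANY_ID, ANY_ID, "family match" },
};

/* Return the name of the first matching entry, or NULL if none match. */
const char *match_card(uint32_t vendor, uint32_t device,
		       uint32_t subvendor, uint32_t subdevice)
{
	size_t i;

	for (i = 0; i < sizeof(id_table) / sizeof(id_table[0]); i++) {
		const struct id_entry *e = &id_table[i];

		if (e->vendor != vendor || e->device != device)
			continue;
		if (e->subvendor != ANY_ID && e->subvendor != subvendor)
			continue;
		if (e->subdevice != ANY_ID && e->subdevice != subdevice)
			continue;
		return e->name;
	}
	return NULL;
}
```

In this sketch a card such as 9005:0285:9005:02d1 has no entry of its
own, yet it still resolves through the wildcard row, which is why a new
card can need a documentation update without a code change.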
This attached patch is against current scsi-misc-2.6.
ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patch attachments.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Documentation/scsi/aacraid.txt | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
Sincerely -- Mark Salyzyn
[-- Attachment #2: aacraid_SMC_SUN.patch --]
[-- Type: application/octet-stream, Size: 1607 bytes --]
diff -ru a/Documentation/scsi/aacraid.txt b/Documentation/scsi/aacraid.txt
--- a/Documentation/scsi/aacraid.txt 2007-09-04 12:40:17.761048273 -0400
+++ b/Documentation/scsi/aacraid.txt 2007-09-04 12:50:05.785727810 -0400
@@ -38,10 +38,8 @@
9005:0286:9005:02ac Adaptec 1800 (Typhoon44)
9005:0285:9005:02b5 Adaptec 5445 (Voodoo44)
9005:0285:15d9:02b5 SMC AOC-USAS-S4i
- 9005:0285:15d9:02c9 SMC AOC-USAS-S4iR
9005:0285:9005:02b6 Adaptec 5805 (Voodoo80)
9005:0285:15d9:02b6 SMC AOC-USAS-S8i
- 9005:0285:15d9:02ca SMC AOC-USAS-S8iR
9005:0285:9005:02b7 Adaptec 5085 (Voodoo08)
9005:0285:9005:02bb Adaptec 3405 (Marauder40LP)
9005:0285:9005:02bc Adaptec 3805 (Marauder80LP)
@@ -50,9 +48,14 @@
9005:0285:9005:02be Adaptec 31605 (Marauder160)
9005:0285:9005:02c3 Adaptec 51205 (Voodoo120)
9005:0285:9005:02c4 Adaptec 51605 (Voodoo160)
+ 9005:0285:15d9:02c9 SMC AOC-USAS-S4iR
+ 9005:0285:15d9:02ca SMC AOC-USAS-S8iR
9005:0285:9005:02ce Adaptec 51245 (Voodoo124)
9005:0285:9005:02cf Adaptec 51645 (Voodoo164)
9005:0285:9005:02d0 Adaptec 52445 (Voodoo244)
+ 9005:0285:9005:02d1 Adaptec 5405 (Voodoo40)
+ 9005:0285:15d9:02d2 SMC AOC-USAS-S8i-LP
+ 9005:0285:15d9:02d3 SMC AOC-USAS-S8iR-LP
1011:0046:9005:0364 Adaptec 5400S (Mustang)
9005:0287:9005:0800 Adaptec Themisto (Jupiter)
9005:0200:9005:0200 Adaptec Themisto (Jupiter)
@@ -103,6 +106,7 @@
9005:0285:108e:7aac SUN STK RAID REM (Voodoo44 Coyote)
9005:0285:108e:0286 SUN STK RAID INT (Cougar)
9005:0285:108e:0287 SUN STK RAID EXT (Prometheus)
+ 9005:0285:108e:7aae SUN STK RAID EM (Narvi)
People
-------------------------
end of thread, other threads:[~2007-09-04 17:10 UTC | newest]
Thread overview: 28+ messages
[not found] <86802c440705291859y39a4ca27uf5ddb84810f33510@mail.gmail.com>
2007-05-30 2:13 ` kexec and aacraid broken Andrew Morton
2007-05-30 11:44 ` Salyzyn, Mark
2007-05-30 13:24 ` Vivek Goyal
2007-05-30 13:57 ` Salyzyn, Mark
2007-05-30 14:17 ` Vivek Goyal
2007-05-30 14:30 ` Salyzyn, Mark
2007-05-30 15:59 ` [PATCH] aacraid: fix shutdown handler to also disable interrupts Salyzyn, Mark
2007-05-30 17:36 ` Yinghai Lu
2007-06-01 11:08 ` Vivek Goyal
2007-06-01 17:07 ` Yinghai Lu
2007-06-01 17:34 ` Salyzyn, Mark
2007-06-07 17:21 ` [PATCH] aacraid: add SCSI SYNCHONIZE_CACHE range checking Salyzyn, Mark
2007-06-11 20:17 ` [PATCH] aacraid: probe related code cleanup Salyzyn, Mark
2007-06-20 15:30 ` [PATCH] aacraid: add SCSI SYNCHONIZE_CACHE range checking (take 2) Salyzyn, Mark
2007-07-09 13:57 ` [PATCH] aacraid: add 51245, 51645 and 52245 adapters to documentation Salyzyn, Mark
2007-07-23 14:13 ` [PATCH] aacraid: sysfs adapter reset/status format change Salyzyn, Mark
2007-07-26 18:20 ` [PATCH 1/1] aacraid: draw line in sand, sundry cleanup and version update Salyzyn, Mark
2007-07-27 14:29 ` [PATCH 1/1] aacraid: fix Sunrise Lake reset handling Salyzyn, Mark
2007-08-02 19:38 ` [PATCH 1/1] aacraid: prevent panic on adapter resource failure Salyzyn, Mark
2007-08-07 19:36 ` [PATCH 1/1] aacraid: default timeout for arrays too short Salyzyn, Mark
2007-09-04 16:55 ` [PATCH 1/1] aacraid: Add documentation for new Adaptec, SMC and SUN cards Salyzyn, Mark
2007-05-30 21:19 ` kexec and aacraid broken Yinghai Lu
2007-05-30 21:22 ` Yinghai Lu
2007-05-30 21:49 ` Salyzyn, Mark
2007-05-30 22:11 ` Yinghai Lu
2007-05-31 12:37 ` Salyzyn, Mark
2007-05-31 19:59 ` Yinghai Lu
2007-05-31 20:45 ` Salyzyn, Mark