* Kernel 2.6.5 - Compaq Fibre Channel 64-bit/66Mhz HBA
@ 2005-08-11 15:46 Bolke de Bruin
2005-08-11 15:51 ` Arjan van de Ven
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Bolke de Bruin @ 2005-08-11 15:46 UTC (permalink / raw)
To: linux-kernel
Hello,
The company I work for is investigating a switch from Windows 2000 to a
Linux based setup for its databases. Because of a dependency on a third
party we need to settle on kernel 2.6.5. Although the database servers
will be moved to x86_64, we would like to keep using our raid array
(StorageWorks 4100) as it has been quite a tremendous investment for the
size of our company and it still works.
However, it is unclear whether the current controller "Compaq Fibre Channel
64-bit/66Mhz HBA" is actually supported. This controller has been marked
broken from kernel 2.6.7 onwards with a "This is too much stack" error
(drivers/scsi/cpqfcTScontrol.c). I cannot find any additional documentation.
So the basic question is: does this controller work on kernel 2.6.5?
If not, and if someone is willing to give some advice: is it possible to
replace the controller without having to move to a different raid array
setup, or is a fix pretty easy to create?
Kind regards and thanks in advance,
B. de Bruin
* Re: Kernel 2.6.5 - Compaq Fibre Channel 64-bit/66Mhz HBA
2005-08-11 15:46 Kernel 2.6.5 - Compaq Fibre Channel 64-bit/66Mhz HBA Bolke de Bruin
@ 2005-08-11 15:51 ` Arjan van de Ven
2005-08-11 16:19 ` Rolf Eike Beer
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Arjan van de Ven @ 2005-08-11 15:51 UTC (permalink / raw)
To: Bolke de Bruin; +Cc: linux-kernel
On Thu, 2005-08-11 at 17:46 +0200, Bolke de Bruin wrote:
> Hello,
>
> The company I work for is investigating a switch from Windows 2000 to a
> Linux based setup for its databases. Because of a dependancy on a third
> party we need to settle on kernel 2.6.5.
kernel.org 2.6.5 or some vendor 2.6.5? If the latter, you should ask the
vendor as well.
> So the basic question is. Does this controller work on kernel 2.6.5?
>
In kernel.org 2.6.5 the answer is no; the driver was never really adjusted
to the 2.6 kernel (heck, hardly even to the 2.4 kernel), so it isn't reliable
at all for business use (or in fact any use). Think "no error handling" and
things like that.
* Re: Kernel 2.6.5 - Compaq Fibre Channel 64-bit/66Mhz HBA
2005-08-11 15:46 Kernel 2.6.5 - Compaq Fibre Channel 64-bit/66Mhz HBA Bolke de Bruin
2005-08-11 15:51 ` Arjan van de Ven
@ 2005-08-11 16:19 ` Rolf Eike Beer
2005-08-11 16:41 ` Bolke de Bruin
2005-08-11 17:58 ` Kernel 2.6.5 - Compaq Fibre Channel 64-bit/66Mhz HBA [PATCH] Rolf Eike Beer
[not found] ` <200508160955.49133@bilbo.math.uni-mannheim.de>
3 siblings, 1 reply; 7+ messages in thread
From: Rolf Eike Beer @ 2005-08-11 16:19 UTC (permalink / raw)
To: Linux Kernel Mailing List; +Cc: Bolke de Bruin
Bolke de Bruin wrote:
>So the basic question is. Does this controller work on kernel 2.6.5?
The problem is that the default stack size was reduced to 4kB. The local
arrays allocated by the driver eat 2kB each, so a stack overflow is very
likely. Even with an 8kB stack an overflow is still not impossible. I don't
think using the version from 2.6.5 is a good idea; it's likely to crash your
machine one day.
The right solution would be fixing the driver to use kmalloc()/kfree() when it
really needs the memory. There was a patch only a few days ago that tried to
do that, but it was not really well done and would have crashed. If you are
really interested, I can write such a patch. The code of this driver sucks
universes through nanotubes, but one day someone _will_ have to start
cleaning this up.
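
The kmalloc()/kfree() approach described above can be illustrated with a small userspace sketch. This is not driver code: malloc()/free() stand in for kmalloc()/kfree(), and frame_first_word() and FRAME_BYTES are hypothetical names chosen for the example. It only shows the shape of the change: a 2 kB buffer moved off the stack, allocated when needed, with the allocation failure reported to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FRAME_BYTES 2048 /* size of one frame buffer, matching the 2 kB arrays */

/*
 * Hypothetical helper: copy a frame into a temporary buffer and return its
 * first 32-bit word through *out. The buffer lives on the heap, so the
 * function needs only a few bytes of stack instead of 2 kB.
 * Returns 0 on success, -EINVAL on bad arguments, -ENOMEM if allocation fails.
 */
int frame_first_word(const void *frame, size_t len, uint32_t *out)
{
    uint32_t *buf;

    if (!frame || !out || len < sizeof(*buf) || len > FRAME_BYTES)
        return -EINVAL;

    buf = malloc(FRAME_BYTES);  /* userspace stand-in for kmalloc() */
    if (!buf)
        return -ENOMEM;         /* the caller has to cope with this */

    memcpy(buf, frame, len);
    *out = buf[0];

    free(buf);                  /* stand-in for kfree() */
    return 0;
}
```

A caller would check the return value before touching *out. Every exit path taken after a successful allocation has to reach the free() call, which is the part that is easy to get wrong in a long function with many early returns.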
Eike
* Re: Kernel 2.6.5 - Compaq Fibre Channel 64-bit/66Mhz HBA
2005-08-11 16:19 ` Rolf Eike Beer
@ 2005-08-11 16:41 ` Bolke de Bruin
2005-08-11 16:58 ` Rolf Eike Beer
0 siblings, 1 reply; 7+ messages in thread
From: Bolke de Bruin @ 2005-08-11 16:41 UTC (permalink / raw)
To: Rolf Eike Beer; +Cc: Linux Kernel Mailing List
>arrays allocated by the driver eat 2kB each, so a stack overflow is very
>likely. Even with 8kB stack it is still not impossible. Using the version
>from 2.6.5 will not be a very good idea I think, it's likely to crash your
>machine one day.
>
>
>
:-(
>The right solution would be fixing the driver to use kmalloc()/kfree() when he
>really needs the memory. There was a patch only a few days ago that tried to
>do that, but it was not really well done and would have crashed. If you are
>really interested in I can do such a patch. The code of this driver sucks
>universes through nanotubes, but one day someone _will_ have to start
>cleaning this up.
>
>
Define: really interested
So, probably we are really interested. Though there are a couple of caveats:
- Testing can only be done to a very limited extent. We have only one raid
array available and it is in production
- Servers are not in yet, but will be in the next couple of weeks
- As Arjan noted the kernel will be "some vendor 2.6.5". More precisely
sles9 or rhel 3. This is dictated by the setup of informix 10 on those
machines, we are stuck with that unfortunately. To be really interesting
a patch should be backportable to 2.6.5 (or the equivalent rh kernel).
Furthermore:
- I am currently investigating whether other controllers can support
this raid array and are themselves supported. If so it might be a better
idea to use those
- We are willing to offer something in exchange. This ranges from 24
bottles of beer of your choice to something else. The something else
part needs to be discussed, but the beer part I can be held responsible
for :-)
Kind Regards,
B. de Bruin
* Re: Kernel 2.6.5 - Compaq Fibre Channel 64-bit/66Mhz HBA
2005-08-11 16:41 ` Bolke de Bruin
@ 2005-08-11 16:58 ` Rolf Eike Beer
0 siblings, 0 replies; 7+ messages in thread
From: Rolf Eike Beer @ 2005-08-11 16:58 UTC (permalink / raw)
To: Bolke de Bruin; +Cc: Linux Kernel Mailing List
On Thursday, 11 August 2005 18:41, you wrote:
>>arrays allocated by the driver eat 2kB each, so a stack overflow is very
>>likely. Even with 8kB stack it is still not impossible. Using the version
>>from 2.6.5 will not be a very good idea I think, it's likely to crash your
>>machine one day.
>>
>:-(
>:
>>The right solution would be fixing the driver to use kmalloc()/kfree() when
>> he really needs the memory. There was a patch only a few days ago that
>> tried to do that, but it was not really well done and would have crashed.
>> If you are really interested in I can do such a patch. The code of this
>> driver sucks universes through nanotubes, but one day someone _will_ have
>> to start cleaning this up.
>
>Define: really interested
>
>So, probably we are really interested. Though there are a couple of caveats:
>
>- Testing can be done only very limited. We have only one raid array
>available and it is in production
If whatever I do goes wrong you'll see it very fast: you won't be able to
receive any data ;)
>- Servers are not in yet, but will been in the next couple of weeks
>- As Arjan noted the kernel will be "some vendor 2.6.5". More precisely
>sles9 or rhle 3. This is dictated by the setup of informix 10 on those
>machines, we are stuck with that unfortunately. To be really interesting
>a patch should be backportable to 2.6.5 (or the equivalent rh kernel).
This should be rather simple. Just use their kernel sources, copy the files
from a newer kernel in and rebuild the module.
>- I am currently investigating if other controllers are able to support
>this raid array and are supported. If so it might be a better idea to
>use those
Yes, if you find some which have a driver that smells less it would be a good
idea to use them.
>- We are willing to offer something in exchange. This ranges from 24
>bottles of beer of your choice to something else. The something else
>part needs to be discussed, but the beer part I can be held responsible
>for :-)
*g* I'll remind you ;)
Eike
* Re: Kernel 2.6.5 - Compaq Fibre Channel 64-bit/66Mhz HBA [PATCH]
2005-08-11 15:46 Kernel 2.6.5 - Compaq Fibre Channel 64-bit/66Mhz HBA Bolke de Bruin
2005-08-11 15:51 ` Arjan van de Ven
2005-08-11 16:19 ` Rolf Eike Beer
@ 2005-08-11 17:58 ` Rolf Eike Beer
[not found] ` <200508160955.49133@bilbo.math.uni-mannheim.de>
3 siblings, 0 replies; 7+ messages in thread
From: Rolf Eike Beer @ 2005-08-11 17:58 UTC (permalink / raw)
To: Linux Kernel Mailing List; +Cc: Bolke de Bruin, Arjan van de Ven, linux-scsi
Bolke de Bruin wrote:
>So the basic question is. Does this controller work on kernel 2.6.5?
Don't think about it. This thing is a mess. I tried to remove the #errors
(which was rather simple) and replace the stack arrays with kmalloc(). Of
course someone then has to take care of the ENOMEM case. One function was no
problem at all: the huge buffer can be avoided completely. The other one is
called from an interrupt handler, and this thing tries to handle the complete
packet transfer in the interrupt. Don't use it. It will blow up.
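
One reason the second function is a problem: an allocation that may sleep, such as kmalloc() with GFP_KERNEL, is not allowed in interrupt context. A common way around this is to allocate the buffer once at setup time and reuse it for every frame. The sketch below shows that idea in plain userspace C. It is an assumption about how the fix could look, not what the driver does; struct hba_ctx, hba_init(), hba_handle_frame() and hba_destroy() are hypothetical names, and malloc()/free() stand in for the kernel allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FRAME_BYTES 2048 /* one frame buffer, same size as the stack arrays */

/* Hypothetical per-adapter state. */
struct hba_ctx {
    uint32_t *frame_buf;  /* allocated once in hba_init(), reused per frame */
    unsigned long frames; /* number of frames handled so far */
};

/* Setup path: may fail, and may take its time. */
int hba_init(struct hba_ctx *ctx)
{
    ctx->frame_buf = malloc(FRAME_BYTES); /* stand-in for kmalloc() */
    if (!ctx->frame_buf)
        return -ENOMEM;
    ctx->frames = 0;
    return 0;
}

/*
 * Hot path (what the interrupt side would call): no allocation here, so
 * there is no -ENOMEM case to handle. Returns the first word of the frame.
 */
uint32_t hba_handle_frame(struct hba_ctx *ctx, const void *frame, size_t len)
{
    assert(ctx->frame_buf != NULL);

    if (len > FRAME_BYTES)
        len = FRAME_BYTES;       /* never write past the buffer */

    ctx->frame_buf[0] = 0;       /* defined result even for very short frames */
    memcpy(ctx->frame_buf, frame, len);
    ctx->frames++;
    return ctx->frame_buf[0];
}

/* Teardown path. */
void hba_destroy(struct hba_ctx *ctx)
{
    free(ctx->frame_buf);        /* stand-in for kfree() */
    ctx->frame_buf = NULL;
}
```

The price is 2 kB held per adapter for the lifetime of the driver, and a single shared buffer only works if frames are handled strictly one at a time.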
If someone has some spare time: this interrupt handler has to be split up. Here
is a diff of what I've done so far. To apply it you will first have to apply my
two patches sent in the last few days; the subject lines are
[PATCH 2.6.13-rc5] reduce whitespace bloat in drivers/scsi/cpqfcTScontrol.c
[PATCH 2.6.13-rc5] rewrite drivers/scsi/cpqfcTScontrol.c::CpqTsGetSFQEntry
This patch also kills the cpqfcTS_reset() function, which is never referenced.
It causes a compile error by using SCSI_RESET_ERROR, which is undefined (now?).
Eike
--- a/drivers/scsi/cpqfcTScontrol.c 2005-08-11 19:04:26.000000000 +0200
+++ b/drivers/scsi/cpqfcTScontrol.c 2005-08-11 19:28:05.000000000 +0200
@@ -556,27 +556,21 @@ static int PeekIMQEntry( PTACHYON fcChip
// first, we need to find an Inbound Completion message,
// If we find it, check the incoming frame payload (1st word)
// for LILP frame
- if( (fcChip->IMQ->QEntry[CI].type & 0x1FF) == 0x104 )
- {
- TachFCHDR_GCMND* fchs;
-#error This is too much stack
- ULONG ulFibreFrame[2048/4]; // max DWORDS in incoming FC Frame
- ULONG SFQpi = fcChip->IMQ->QEntry[CI].word[0] & 0x0fffL;
-
- CpqTsGetSFQEntry( fcChip,
- SFQpi, // SFQ producer ndx
- ulFibreFrame, 0); // DON'T update chip--this is a "lookahead"
+ if( (fcChip->IMQ->QEntry[CI].type & 0x1FF) == 0x104 ) {
+ TachFCHDR_GCMND *fchs;
- fchs = (TachFCHDR_GCMND*)&ulFibreFrame;
- if( fchs->pl[0] == ELS_LILP_FRAME)
- {
- return 1; // found the LILP frame!
- }
- else
- {
- // keep looking...
- }
- }
+ /* Reference to the first chunk of this struct in QEntry
+ * buffer. We can only rely on the first 64 bytes of
+ * data because consumerIndex may have a wraparound.
+ * This is no problem, we only want to see the first
+ * double word of payload, which is within this range.
+ */
+ fchs = (TachFCHDR_GCMND*) &fcChip->SFQ->QEntry[fcChip->SFQ->consumerIndex];
+
+ if(fchs->pl[0] == ELS_LILP_FRAME) {
+ return 1;
+ }
+ }
}
break;
@@ -665,8 +659,7 @@ int CpqTsProcessIMQEntry(void *host)
ULONG x_ID;
ULONG ulBuff, dwStatus;
TachFCHDR_GCMND* fchs;
-#error This is too much stack
- ULONG ulFibreFrame[2048/4]; // max number of DWORDS in incoming Fibre Frame
+ void *ulFibreFrame = kmalloc(2048, GFP_KERNEL); /* max number of DWORDS in incoming Fibre Frame */
UCHAR ucInboundMessageType; // Inbound CM, dword 3 "type" field
ENTER("ProcessIMQEntry");
@@ -675,6 +668,9 @@ int CpqTsProcessIMQEntry(void *host)
// is a new message waiting for us?
// equal indexes means empty que
+ if (!ulFibreFrame)
+ return -ENOMEM;
+
if( fcChip->IMQ->producerIndex != fcChip->IMQ->consumerIndex )
{ // need to process message
@@ -881,7 +877,7 @@ int CpqTsProcessIMQEntry(void *host)
if( ucInboundMessageType == 1 )
{
- fchs = (TachFCHDR_GCMND*)ulFibreFrame; // cast to examine IB frame
+ fchs = ulFibreFrame; // cast to examine IB frame
// don't fill up our Q with garbage - only accept FCP-CMND
// or XRDY frames
if( (fchs->d_id & 0xFF000000) == 0x06000000 ) // CMND
@@ -1432,7 +1428,7 @@ int CpqTsProcessIMQEntry(void *host)
// to analyze data transfer (successful?), then send a response
// frame for this exchange
- ulFibreFrame[0] = x_ID; // copy for later reference
+ *((ULONG*) ulFibreFrame) = x_ID; // copy for later reference
// if this was a TWE, we have to send satus response
if( Exchanges->fcExchange[ x_ID].type == SCSI_TWE )
@@ -1500,6 +1496,7 @@ int CpqTsProcessIMQEntry(void *host)
LEAVE("ProcessIMQEntry");
+ kfree(ulFibreFrame);
return iStatus;
}
--- a/drivers/scsi/cpqfcTSstructs.h 2005-08-11 19:30:55.000000000 +0200
+++ b/drivers/scsi/cpqfcTSstructs.h 2005-08-11 19:31:31.000000000 +0200
@@ -813,7 +813,6 @@ typedef struct
void (*UnFreezeTachyon)(void*, int );
int (*InitializeTachyon)(void*, int, int );
int (*InitializeFrameManager)(void*, int );
- int (*ProcessIMQEntry)(void*);
int (*ReadWriteWWN)(void*, int ReadWrite);
int (*ReadWriteNVRAM)(void*, void*, int ReadWrite);
--- a/drivers/scsi/cpqfcTSinit.c 2005-08-11 19:29:22.000000000 +0200
+++ b/drivers/scsi/cpqfcTSinit.c 2005-08-11 19:46:07.000000000 +0200
@@ -215,7 +215,6 @@ static void Cpqfc_initHBAdata(CPQFCHBA *
cpqfcHBAdata->fcChip.DestroyTachyonQues = CpqTsDestroyTachLiteQues;
cpqfcHBAdata->fcChip.InitializeTachyon = CpqTsInitializeTachLite;
cpqfcHBAdata->fcChip.LaserControl = CpqTsLaserControl;
- cpqfcHBAdata->fcChip.ProcessIMQEntry = CpqTsProcessIMQEntry;
cpqfcHBAdata->fcChip.InitializeFrameManager = CpqTsInitializeFrameManager;
cpqfcHBAdata->fcChip.ReadWriteWWN = CpqTsReadWriteWWN;
cpqfcHBAdata->fcChip.ReadWriteNVRAM = CpqTsReadWriteNVRAM;
@@ -1693,16 +1692,6 @@ int cpqfcTS_eh_device_reset(Scsi_Cmnd *C
return retval;
}
-
-int cpqfcTS_reset(Scsi_Cmnd *Cmnd, unsigned int reset_flags)
-{
-
- ENTER("cpqfcTS_reset");
-
- LEAVE("cpqfcTS_reset");
- return SCSI_RESET_ERROR; /* Bus Reset Not supported */
-}
-
/* This function determines the bios parameters for a given
harddisk. These tend to be numbers that are made up by the
host adapter. Parameters:
@@ -1763,6 +1752,7 @@ irqreturn_t cpqfcTS_intr_handler( int ir
{
while( (++InfLoopBrk < INFINITE_IMQ_BREAK) && (MoreMessages ==1) )
{
+#error handle CpqTsProcessIMQEntry returning -ENOMEM
MoreMessages = CpqTsProcessIMQEntry( HostAdapter); // ret 0 when done
}
if( InfLoopBrk >= INFINITE_IMQ_BREAK )
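
The last hunk above leaves an #error in the loop of cpqfcTS_intr_handler() because the -ENOMEM return value is not handled yet. A minimal sketch of what such handling could look like follows, in userspace C. All names are hypothetical (drain_queue(), MAX_ITERATIONS as a stand-in for INFINITE_IMQ_BREAK, and the fake_queue test double); the return convention assumed here is 1 for "more messages", 0 for "done" and a negative errno on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_ITERATIONS 256 /* hypothetical stand-in for INFINITE_IMQ_BREAK */

/* Assumed convention: 1 = more messages, 0 = queue empty, <0 = error. */
typedef int (*process_fn)(void *ctx);

/*
 * Call process() until the queue is empty, an error occurs, or the loop
 * limit is reached. Returns 0 when drained, the negative error from
 * process(), or -EAGAIN if the limit was hit with work still pending.
 */
int drain_queue(process_fn process, void *ctx)
{
    int iterations = 0;
    int ret = 1;

    assert(process != NULL);

    while (ret == 1 && iterations < MAX_ITERATIONS) {
        ret = process(ctx);
        iterations++;
    }

    if (ret < 0)
        return ret;     /* e.g. -ENOMEM: report it instead of dropping it */
    if (ret == 1)
        return -EAGAIN; /* loop limit reached, messages still waiting */
    return 0;
}

/* Test double: a queue with a message count and an optional failure point. */
struct fake_queue {
    int pending;   /* messages left */
    int fail_when; /* fail with -ENOMEM when pending equals this; -1 = never */
};

int fake_process(void *ctx)
{
    struct fake_queue *q = ctx;

    if (q->pending == q->fail_when)
        return -ENOMEM;
    if (q->pending == 0)
        return 0;
    q->pending--;
    return q->pending > 0 ? 1 : 0;
}
```

In this form the three outcomes (drained, failed, limit reached) stay distinguishable for the caller, whereas a plain "== 1" loop condition would end the loop on an error but silently discard the reason.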
* Re: Kernel 2.6.5 - Compaq Fibre Channel 64-bit/66Mhz HBA [PATCH]
[not found] ` <4301A6BB.3020403@aub.nl>
@ 2005-08-16 8:58 ` Rolf Eike Beer
0 siblings, 0 replies; 7+ messages in thread
From: Rolf Eike Beer @ 2005-08-16 8:58 UTC (permalink / raw)
To: Linux Kernel Mailing List
Bolke de Bruin wrote:
>Rolf Eike Beer wrote:
>>If I can still help you then depends on
>>your scheduling of this job. I'll try to do some hacking at the weekends.
>
>Does this mean you will not have time until October 5th or just time in
>the weekend until that time?
For the moment my weekends are more or less "free time to hack". My thesis is
a hardware design and I don't have the tools at home, so there is little I can
do on it at the weekend. This will change in September I think, then I will
TeX all day ;)
>>Nevertheless without access to the hardware it will be a bit difficult.
>> Much more than compile-testing is not possible for me.
>
>Access to hardware can be arranged
Testing will very likely include crashing the machine more than once and doing
some nasty things with the attached storage. Someone will have to pull the
cable while the storage is being accessed, reboot the switch and so on.
This can really hurt in a production environment.
>>I'll send some more cleanups to this driver in a few minutes (I'll CC you).
>>What has to be done from my point of view is:
>>
>>-split up the interrupt handler in one small interrupt handler and one
>> tasklet that does the actual work. I have a patch that would do this but
>> I'm not familiar with this stuff. There are probably some bugs.
>>-fix the stack abuse. I already sent a patch, this has to be made a bit
>> nicer and better.
>>-change the probe/remove stuff to use Linux 2.6 API. This is optional, but
>> I think it will make this piece of dirt a bit cleaner ;)
>>-the stopping of the kernel thread for every controller is a bit messy.
>> There is a cleaner (and simpler) API in Linux 2.6 so this should also be
>> adjusted. Not really critical, but it makes a lot of sense IMHO.
>>-there is not much error checking. From my point of view there is to much
>>trust in that everything is ok which can lead to serious trouble if the
>> link sends you crap.
>
>For us everything more or less comes down to:is this feasible?
I'm sure it is.
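
The first item of the list quoted above, splitting the handler into a small interrupt part and a deferred part, can be sketched in userspace C as follows. This is only an illustration of the pattern: it is single-threaded, has no locking, and uses none of the kernel's tasklet or workqueue APIs. The names event_ring, irq_top_half(), irq_bottom_half() and sum_event() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 8 /* hypothetical queue depth; must be a power of two */

/* Queue between the two halves. Initialise head and tail to 0. */
struct event_ring {
    uint32_t slot[RING_SLOTS];
    unsigned head; /* next slot to write (top half) */
    unsigned tail; /* next slot to read (bottom half) */
};

/*
 * Top half: what the interrupt handler would do. It only records the event
 * and returns. Returns false if the ring is full and the event was dropped.
 */
bool irq_top_half(struct event_ring *r, uint32_t event)
{
    if (r->head - r->tail == RING_SLOTS)
        return false;
    r->slot[r->head % RING_SLOTS] = event;
    r->head++;
    return true;
}

/*
 * Bottom half: runs later, outside interrupt context, and does the actual
 * work for every queued event. Returns the number of events handled.
 */
size_t irq_bottom_half(struct event_ring *r,
                       void (*handle)(uint32_t event, void *arg), void *arg)
{
    size_t handled = 0;

    while (r->tail != r->head) {
        handle(r->slot[r->tail % RING_SLOTS], arg);
        r->tail++;
        handled++;
    }
    return handled;
}

/* Example work function: add each event value to a running total. */
void sum_event(uint32_t event, void *arg)
{
    *(uint64_t *)arg += event;
}
```

Because the heavy work moves to the bottom half, that part is free to sleep or allocate memory, which the interrupt part must not do. A real driver would additionally need proper synchronisation between the two halves.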
>Currently we have three options:
>
>- Keep the current windows platform and not use 'full strength' of the
>hardware (sunv20z), though controller and array will be EOL'd pretty
>shortly by HP
Before you throw it away send it to me ;) I like to have some obscure hardware
around.
>- Use a different controller (fyi HP's FCA2214) which needs a converter
>for the GBIC's (2gb --> 1gb). And will be unsupported by HP
Ugly.
>- Sponsor someone (you) to update the driver to support 2.6 / amd64 in a
>trusted fashion
*g*
>So for us it is little bit of risk management (jug :-) ) and we are
>trying to find out our best option.
I think the main stuff could be done in a week or two. My problem is that I
can't test it on my own (and it's likely to do some bad things to the data
behind it) and that I would like to have someone experienced review what I
do. I'll keep sending this stuff to linux-scsi, but so far I have had no
reaction from anyone there. I hope they will wake up when the interesting
stuff comes ;)
Eike