public inbox for linux-kernel@vger.kernel.org
* 2.4.17: Bug?
@ 2002-02-03 16:04 Alexander Sandler
  2002-02-03 16:26 ` arjan
  2002-02-04  0:24 ` Tim Pepper
  0 siblings, 2 replies; 7+ messages in thread
From: Alexander Sandler @ 2002-02-03 16:04 UTC
  To: Linux Kernel Mailing List (E-mail)

Hi all.

I found something that looks like a bug.

The configuration is the following:
Dual CPU machine with Linux RedHat 7.1 running kernel 2.4.17 
(official), connected to SAN with two FC-HBAs (QLogic 2200).

The bug appears when I start two processes, the first doing I/O 
to the first LUN through the first HBA and the second doing I/O to 
the second LUN through the second HBA. When I disconnect the first 
HBA from the SAN, the machine goes into a four-minute SCSI error 
recovery, and then the first process exits with an I/O error, as it 
should, while the second process gets stuck and never 
returns (this is the problem: it should continue doing I/O 
as if nothing had happened).
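
A minimal sketch of this kind of two-process test, not the actual test program used here. It assumes Python 3 is available on the test machine; `run_io_pair`, the `CHILD` script, and the device paths in the usage note are hypothetical names chosen for illustration. Each child writes and fsyncs blocks to one path, and the parent classifies each child as `ok`, `io_error`, or `hung` once a shared deadline passes.

```python
import subprocess
import sys
import time

# Child program (hypothetical): writes `count` 4 KiB blocks to `path`,
# forcing each one out with fsync. Exit status 0 means all I/O completed,
# 1 means the kernel returned an I/O error.
CHILD = r"""
import os, sys
path, count = sys.argv[1], int(sys.argv[2])
buf = b"\0" * 4096
try:
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    for _ in range(count):
        os.write(fd, buf)
        os.fsync(fd)
    os.close(fd)
except OSError:
    sys.exit(1)
"""


def run_io_pair(paths, count=16, timeout=60.0):
    """Start one I/O process per path and classify how each one ended."""
    procs = [
        subprocess.Popen([sys.executable, "-c", CHILD, path, str(count)])
        for path in paths
    ]
    deadline = time.monotonic() + timeout
    outcome = {}
    for path, proc in zip(paths, procs):
        remaining = max(0.0, deadline - time.monotonic())
        try:
            status = proc.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            # A process in uninterruptible sleep may ignore the kill,
            # so the parent does not wait for it again.
            proc.kill()
            outcome[path] = "hung"
        else:
            outcome[path] = "ok" if status == 0 else "io_error"
    return outcome
```

On a setup like the one described, one would pass the two LUN block devices, for example `run_io_pair(["/dev/sdb", "/dev/sdc"], count=1000000, timeout=600.0)` (the device names and numbers are assumptions), and disconnect the first HBA while it runs. The expected result is `io_error` for the first path and `ok` for the second; the behaviour reported here would show up as `hung` for the second path. The sketch overwrites the start of each target, so it should only be pointed at scratch LUNs.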

This problem appears on the SMP kernel. On a UP kernel, 
everything works fine.
I found this while working on a volume manager driver. 
This driver should be able to fail over to another HBA (if 
available) in case of an error.

I have all the required hardware and software to work on this 
problem, so I'll be glad to give a hand to whoever can 
(should?) and/or will start working on it.

Alexandr Sandler.


* Re: 2.4.17: Bug?
  2002-02-03 16:04 2.4.17: Bug? Alexander Sandler
@ 2002-02-03 16:26 ` arjan
  2002-02-04  0:24 ` Tim Pepper
  1 sibling, 0 replies; 7+ messages in thread
From: arjan @ 2002-02-03 16:26 UTC
  To: Alexander Sandler; +Cc: linux-kernel

In article <BDE817654148D51189AC00306E063AAE054619@exchange.store-age.com> you wrote:
> Hi all.
> The configuration is the following:
> Dual CPU machine with Linux RedHat 7.1 running kernel 2.4.17 
> (official), connected to SAN with two FC-HBAs (QLogic 2200).

which driver are you using for that ?


* RE: 2.4.17: Bug?
@ 2002-02-03 16:31 Alexander Sandler
  0 siblings, 0 replies; 7+ messages in thread
From: Alexander Sandler @ 2002-02-03 16:31 UTC
  To: 'arjan@fenrus.demon.nl', Alexander Sandler; +Cc: linux-kernel

It's 4.27beta.
QLogic currently has three different drivers on their web site. This one is
the oldest and the most stable. With the other two I wasn't even able to see
those LUNs.

One more thing I didn't mention. According to 'ps', the stuck process is
sleeping in __get_request_wait() from ll_rw_blk.c

Alexandr Sandler.

> -----Original Message-----
> From: arjan@fenrus.demon.nl [mailto:arjan@fenrus.demon.nl]
> Sent: Sunday, February 03, 2002 4:27 PM
> To: ASandler@store-age.com
> Cc: linux-kernel@vger.kernel.org
> Subject: Re: 2.4.17: Bug?
> 
> 
> In article <BDE817654148D51189AC00306E063AAE054619@exchange.store-age.com> you wrote:
> > Hi all.
> > The configuration is the following:
> > Dual CPU machine with Linux RedHat 7.1 running kernel 2.4.17 
> > (official), connected to SAN with two FC-HBAs (QLogic 2200).
> 
> which driver are you using for that ?
> 


* Re: 2.4.17: Bug?
  2002-02-03 16:04 2.4.17: Bug? Alexander Sandler
  2002-02-03 16:26 ` arjan
@ 2002-02-04  0:24 ` Tim Pepper
  1 sibling, 0 replies; 7+ messages in thread
From: Tim Pepper @ 2002-02-04  0:24 UTC
  To: Alexander Sandler; +Cc: Linux Kernel Mailing List (E-mail)

Sounds like you're using the qlogic 4.27beta or 4.36beta from the qlogic
website.  The 4.46.12beta has a shorter timeout.  In any of them you can
control this.  Look at qla2x00.h.

t.

-- 
*********************************************************
*  tpepper@vato dot org             * Venimus, Vidimus, *
*  http://www.vato.org/~tpepper     * Dolavimus         *
*********************************************************


* Re: 2.4.17: Bug?
       [not found] <BDE817654148D51189AC00306E063AAE054620@exchange.store-age.com>
@ 2002-02-04 18:45 ` Tim Pepper
  2002-02-04 20:43   ` Arjan van de Ven
  0 siblings, 1 reply; 7+ messages in thread
From: Tim Pepper @ 2002-02-04 18:45 UTC
  To: Alexander Sandler, arjan; +Cc: Linux Kernel Mailing List (E-mail)

On Mon 04 Feb at 11:12:27 +0200 ASandler@store-age.com done said:
> No no no no.
> 
> This is a bug. For me, it took two hours to get released
> from that. There is no such thing two hours timeout.
> And who said this is only two hours? I spoke with Arjan van de Ven
> and he told me that it may take for up to 14 hours.
> 
> Anyway, Arjan told me that he fixed this bug in the version that
> will be out with 2.4.18.

We're talking about different things.  Looking at the original post, I see
I missed that the concern was the hung process, not the "long" error retry.

Anybody have a link to what Arjan fixed?  I used to have occasional hangs like
this but they seemed to have gone away with qlogic's latest (4.46.12beta)
driver.

t.

-- 
*********************************************************
*  tpepper@vato dot org             * Venimus, Vidimus, *
*  http://www.vato.org/~tpepper     * Dolavimus         *
*********************************************************


* Re: 2.4.17: Bug?
  2002-02-04 18:45 ` Tim Pepper
@ 2002-02-04 20:43   ` Arjan van de Ven
  0 siblings, 0 replies; 7+ messages in thread
From: Arjan van de Ven @ 2002-02-04 20:43 UTC
  To: Tim Pepper; +Cc: Alexander Sandler, Linux Kernel Mailing List (E-mail)

On Mon, Feb 04, 2002 at 10:45:25AM -0800, Tim Pepper wrote:
> On Mon 04 Feb at 11:12:27 +0200 ASandler@store-age.com done said:
> > No no no no.
> > 
> > This is a bug. For me, it took two hours to get released
> > from that. There is no such thing two hours timeout.
> > And who said this is only two hours? I spoke with Arjan van de Ven
> > and he told me that it may take for up to 14 hours.
> > 
> > Anyway, Arjan told me that he fixed this bug in the version that
> > will be out with 2.4.18.

Misunderstanding; I did not say (or intend to say) that it will go into
2.4.18; it's not good enough yet.



* RE: 2.4.17: Bug?
@ 2002-02-05 19:02 Alexander Sandler
  0 siblings, 0 replies; 7+ messages in thread
From: Alexander Sandler @ 2002-02-05 19:02 UTC
  To: 'Arjan van de Ven', Tim Pepper
  Cc: Alexander Sandler, Linux Kernel Mailing List (E-mail)

Sorry about this. I thought it was good enough.

Anyway, Arjan, do you have any suggestions for me? With the device detection
problems that QLogic's drivers have (those from their web site), it appears
that there is no solution for this problem right now. Am I correct?

> Misunderstanding; I did not say (or intend to say) that it 
> will go into 2.4.18; it's not good enough yet.

Sasha.

