Re: qla2xxx BUG: workqueue leaked lock or atomic

public inbox for linux-ext4@vger.kernel.org

* Re: qla2xxx BUG: workqueue leaked lock or atomic
       [not found]             ` <20070307170955.GA4252@skl-net.de>
@ 2007-03-07 19:45               ` Andrew Morton
  2007-03-07 20:05                 ` Mingming Cao
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2007-03-07 19:45 UTC
  To: Andre Noll
  Cc: Andrew Vasquez, linux-kernel, linux-scsi, James Bottomley,
	Jens Axboe, Alasdair G Kergon, Adrian Bunk,
	linux-ext4@vger.kernel.org

On Wed, 7 Mar 2007 18:09:55 +0100 Andre Noll <maan@systemlinux.org> wrote:

> On 20:39, Andrew Morton wrote:
> > On Wed, 28 Feb 2007 16:37:22 +0100 Andre Noll <maan@systemlinux.org> wrote:
> > 
> > > On 16:18, Andre Noll wrote:
> > > 
> > > > With 2.6.21-rc2 I am unable to reproduce this BUG message. However,
> > > > writing to both raid systems at the same time via lvm still locks up
> > > > the system within minutes.
> > > 
> > > Screenshot of the resulting kernel panic:
> > > 
> > > 	http://systemlinux.org/~maan/shots/kernel-panic-21-rc2-huangho2.png
> > > 
> > 
> > It died in CFQ.  Please try a different IO scheduler.  Use something
> > like
> > 
> > 	echo deadline > /sys/block/sda/queue/scheduler
> > 
> > This could still be the old qla2xxx bug, or it could be a new qla2xxx bug,
> > or it could be a block bug, or it could be an LVM bug.
> 
> OK. I'm running with deadline right now. But I guess this kernel
> panic was caused by an LVM bug because lockdep reported problems with
> LVM. Nobody responded to my bug report on the LVM mailing list (see
> http://www.redhat.com/archives/linux-lvm/2007-February/msg00102.html).
> 
> Non-working snapshots and no help from the mailing list convinced me
> to ditch the lvm setup [1] in favour of linear software raid. This
> means I can't do lvm-related tests any more.

Sigh.

> BTW: Are ext3 filesystem sizes greater than 8T now officially
> supported?

I think so, but I don't know how much 16TB testing developers and
distros are doing - perhaps the linux-ext4 denizens can tell us?


* Re: qla2xxx BUG: workqueue leaked lock or atomic
  2007-03-07 19:45               ` qla2xxx BUG: workqueue leaked lock or atomic Andrew Morton
@ 2007-03-07 20:05                 ` Mingming Cao
  2007-03-09  9:36                   ` Andre Noll
  2007-03-12 15:22                   ` Valerie Clement
  0 siblings, 2 replies; 6+ messages in thread
From: Mingming Cao @ 2007-03-07 20:05 UTC
  To: Andrew Morton
  Cc: Andre Noll, Andrew Vasquez, linux-kernel, linux-scsi,
	James Bottomley, Jens Axboe, Alasdair G Kergon, Adrian Bunk,
	linux-ext4@vger.kernel.org

On Wed, 2007-03-07 at 11:45 -0800, Andrew Morton wrote:
> On Wed, 7 Mar 2007 18:09:55 +0100 Andre Noll <maan@systemlinux.org> wrote:
> 
> > On 20:39, Andrew Morton wrote:
> > > On Wed, 28 Feb 2007 16:37:22 +0100 Andre Noll <maan@systemlinux.org> wrote:
> > > 
> > > > On 16:18, Andre Noll wrote:
> > > > 
> > > > > With 2.6.21-rc2 I am unable to reproduce this BUG message. However,
> > > > > writing to both raid systems at the same time via lvm still locks up
> > > > > the system within minutes.
> > > > 
> > > > Screenshot of the resulting kernel panic:
> > > > 
> > > > 	http://systemlinux.org/~maan/shots/kernel-panic-21-rc2-huangho2.png
> > > > 
> > > 
> > > It died in CFQ.  Please try a different IO scheduler.  Use something
> > > like
> > > 
> > > 	echo deadline > /sys/block/sda/queue/scheduler
> > > 
> > > This could still be the old qla2xxx bug, or it could be a new qla2xxx bug,
> > > or it could be a block bug, or it could be an LVM bug.
> > 
> > OK. I'm running with deadline right now. But I guess this kernel
> > panic was caused by an LVM bug because lockdep reported problems with
> > LVM. Nobody responded to my bug report on the LVM mailing list (see
> > http://www.redhat.com/archives/linux-lvm/2007-February/msg00102.html).
> > 
> > Non-working snapshots and no help from the mailing list convinced me
> > to ditch the lvm setup [1] in favour of linear software raid. This
> > means I can't do lvm-related tests any more.
> 
> Sigh.
> 
> > BTW: Are ext3 filesystem sizes greater than 8T now officially
> > supported?
> 
> I think so, but I don't know how much 16TB testing developers and
> distros are doing - perhaps the linux-ext4 denizens can tell us?
> -

IBM has done some testing (dbench, fsstress, fsx, tiobench, iozone etc)
on 10TB ext3, I think RedHat and BULL have done similar tests on >8TB
ext3 too.

Mingming


* Re: qla2xxx BUG: workqueue leaked lock or atomic
  2007-03-07 20:05                 ` Mingming Cao
@ 2007-03-09  9:36                   ` Andre Noll
  2007-03-12 15:22                   ` Valerie Clement
  1 sibling, 0 replies; 6+ messages in thread
From: Andre Noll @ 2007-03-09  9:36 UTC
  To: Mingming Cao
  Cc: Andrew Morton, Andrew Vasquez, linux-kernel, linux-scsi,
	James Bottomley, Jens Axboe, Alasdair G Kergon, Adrian Bunk,
	linux-ext4@vger.kernel.org

On 12:05, Mingming Cao wrote:
> > > BTW: Are ext3 filesystem sizes greater than 8T now officially
> > > supported?
> > 
> > I think so, but I don't know how much 16TB testing developers and
> > distros are doing - perhaps the linux-ext4 denizens can tell us?
> > -
> 
> IBM has done some testing (dbench, fsstress, fsx, tiobench, iozone etc)
> on 10TB ext3, I think RedHat and BULL have done similar tests on >8TB
> ext3 too.

Thanks. I'm asking because some days ago I tried to create a 10T ext3
filesystem on a linear software raid over two hardware raids, and it
failed horribly. mke2fs from e2fsprogs-1.39 refused to create such a
large filesystem but did it with -F, and I could mount it afterwards.
But writing data immediately produced zillions of errors and only
power-cycling the box helped.

We're now using a 7.9T filesystem on the same hardware. That seems
to work fine on 2.6.21-rc2, so I think this is an ext3 problem. I
cannot completely rule out other reasons though as the underlying
qla2xxx driver also had some problems on earlier kernels.

We'd much rather have a 10T filesystem if possible. So if you have
time to look into the issue I would be willing to recreate the 10T
filesystem and send details.

Regards
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe


* Re: qla2xxx BUG: workqueue leaked lock or atomic
  2007-03-07 20:05                 ` Mingming Cao
  2007-03-09  9:36                   ` Andre Noll
@ 2007-03-12 15:22                   ` Valerie Clement
  2007-03-13  7:01                     ` Andreas Dilger
  1 sibling, 1 reply; 6+ messages in thread
From: Valerie Clement @ 2007-03-12 15:22 UTC
  To: cmm; +Cc: Andrew Morton, Andre Noll, Theodore Tso,
	linux-ext4@vger.kernel.org

Mingming Cao wrote:
> On Wed, 2007-03-07 at 11:45 -0800, Andrew Morton wrote:
>> On Wed, 7 Mar 2007 18:09:55 +0100 Andre Noll <maan@systemlinux.org> wrote:
>>
>>> On 20:39, Andrew Morton wrote:
>>>> On Wed, 28 Feb 2007 16:37:22 +0100 Andre Noll <maan@systemlinux.org> wrote:
>>>>
>>> BTW: Are ext3 filesystem sizes greater than 8T now officially
>>> supported?
>> I think so, but I don't know how much 16TB testing developers and
>> distros are doing - perhaps the linux-ext4 denizens can tell us?
>> -
> 
> IBM has done some testing (dbench, fsstress, fsx, tiobench, iozone etc)
> on 10TB ext3, I think RedHat and BULL have done similar tests on >8TB
> ext3 too.
> 
> Mingming

Is there not a problem of backward-compatibility with old kernels?
Don't we need to handle a new INCOMPAT flag in e2fsprogs and kernel
before allowing ext3 filesystems greater than 8T?

    Valérie


* Re: qla2xxx BUG: workqueue leaked lock or atomic
  2007-03-12 15:22                   ` Valerie Clement
@ 2007-03-13  7:01                     ` Andreas Dilger
  2007-03-13  8:23                       ` Valerie Clement
  0 siblings, 1 reply; 6+ messages in thread
From: Andreas Dilger @ 2007-03-13  7:01 UTC
  To: Valerie Clement
  Cc: cmm, Andrew Morton, Andre Noll, Theodore Tso,
	linux-ext4@vger.kernel.org

On Mar 12, 2007  16:22 +0100, Valerie Clement wrote:
> Mingming Cao wrote:
> >IBM has done some testing (dbench, fsstress, fsx, tiobench, iozone etc)
> >on 10TB ext3, I think RedHat and BULL have done similar tests on >8TB
> >ext3 too.
> 
> Is there not a problem of backward-compatibility with old kernels?
> Don't we need to handle a new INCOMPAT flag in e2fsprogs and kernel
> before allowing ext3 filesystems greater than 8T?

No, it really depends on the kernel.  Some bugs in the handling of
signed 32-bit ints caused problems with > 8TB, so it isn't really
recommended to use > 8TB unless you know this is fixed in your kernel
(and in any older kernel you might have to downgrade to).
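
To make the signed 32-bit limit concrete, here is a minimal sketch of
the arithmetic.  It assumes 4 KiB blocks and binary units (neither is
stated in this thread), and the helper names are made up for
illustration; this is not code from the kernel or e2fsprogs.

```python
# Sketch only: shows why a signed 32-bit block number breaks past 8 TiB.
# BLOCK_SIZE is an assumption (4 KiB); the thread does not state it.
BLOCK_SIZE = 4096
TIB = 1 << 40


def as_signed_32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def last_block(fs_bytes, block_size=BLOCK_SIZE):
    """Number of the last block in a filesystem of fs_bytes bytes."""
    return fs_bytes // block_size - 1


def survives_signed_32(fs_bytes, block_size=BLOCK_SIZE):
    """True if the last block number stays non-negative as a signed int."""
    return as_signed_32(last_block(fs_bytes, block_size)) >= 0


if __name__ == "__main__":
    for tib in (8, 10, 16):
        n = last_block(tib * TIB)
        print(f"{tib:>2} TiB: last block {n}, as signed 32-bit {as_signed_32(n)}")
```

Under these assumptions 8 TiB is exactly 2^31 blocks, so every block
number still fits in a signed 32-bit int, while at 10 TiB the highest
block numbers turn negative.  Treated as unsigned, the same 32 bits
reach 16 TiB.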

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


* Re: qla2xxx BUG: workqueue leaked lock or atomic
  2007-03-13  7:01                     ` Andreas Dilger
@ 2007-03-13  8:23                       ` Valerie Clement
  0 siblings, 0 replies; 6+ messages in thread
From: Valerie Clement @ 2007-03-13  8:23 UTC
  To: Andreas Dilger; +Cc: Andre Noll, Theodore Tso, linux-ext4@vger.kernel.org

Andreas Dilger wrote:
> On Mar 12, 2007  16:22 +0100, Valerie Clement wrote:
>> Mingming Cao wrote:
>>> IBM has done some testing (dbench, fsstress, fsx, tiobench, iozone etc)
>>> on 10TB ext3, I think RedHat and BULL have done similar tests on >8TB
>>> ext3 too.
>> Is there not a problem of backward-compatibility with old kernels?
>> Don't we need to handle a new INCOMPAT flag in e2fsprogs and kernel
>> before allowing ext3 filesystems greater than 8T?
> 
> No, it really depends on the kernel.  Some bugs in the handling of
> signed 32-bit ints caused problems with > 8TB, so it isn't really
> recommended to use > 8TB unless you know this is fixed in your kernel
> (and in any older kernel you might have to downgrade to).
> 

OK. Thanks.
As Andre mentioned, it seems that the "-F" option to mkfs is
necessary to create an ext3 FS > 8T.
(I see the same behavior, but I haven't applied the latest patches
to my current version of e2fsprogs, so I can't check whether that has
changed since.)
Is that the right way to do it?

     Valérie


Thread overview: 6+ messages
     [not found] <20070226133153.GC4095@skl-net.de>
     [not found] ` <20070226182617.GC9968@andrew-vasquezs-computer.local>
     [not found]   ` <20070227101100.GA22572@skl-net.de>
     [not found]     ` <20070227185134.GJ20397@andrew-vasquezs-computer.local>
     [not found]       ` <20070228151829.GI22572@skl-net.de>
     [not found]         ` <20070228153722.GJ22572@skl-net.de>
     [not found]           ` <20070306203952.471218df.akpm@linux-foundation.org>
     [not found]             ` <20070307170955.GA4252@skl-net.de>
2007-03-07 19:45               ` qla2xxx BUG: workqueue leaked lock or atomic Andrew Morton
2007-03-07 20:05                 ` Mingming Cao
2007-03-09  9:36                   ` Andre Noll
2007-03-12 15:22                   ` Valerie Clement
2007-03-13  7:01                     ` Andreas Dilger
2007-03-13  8:23                       ` Valerie Clement