ext4: (2.6.34-rc4): This should not happen!! Data will be lost

linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
@ 2010-04-16 12:35 Andre Noll
  2010-04-16 14:29 ` tytso
  2010-04-16 15:02 ` Eric Sandeen
  0 siblings, 2 replies; 24+ messages in thread
From: Andre Noll @ 2010-04-16 12:35 UTC (permalink / raw)
  To: linux-ext4; +Cc: linux-driver, Thomas Helle

[-- Attachment #1: Type: text/plain, Size: 1854 bytes --]

Hi

I'm seeing the following errors when writing to an ext4 file system:

	qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 fa6a73 2002.
	end_request: I/O error, dev sda, sector 7812771880
	Aborting journal on device sda-8.
	EXT4-fs (sda): delayed block allocation failed for inode 752 at logical offset 1982464 with max blocks 17588 with error -30

	This should not happen!!  Data will be lost
	EXT4-fs error (device sda) in ext4_da_writepages: Journal has aborted
	EXT4-fs error (device sda): ext4_journal_start_sb: Detected aborted journal
	EXT4-fs (sda): Remounting filesystem read-only
	EXT4-fs (sda): delayed block allocation failed for inode 756 at logical offset 177656 with max blocks 1205 with error -30
	EXT4-fs error (device sda) in ext4_da_write_end: IO failure

	This should not happen!!  Data will be lost
	EXT4-fs (sda): ext4_da_writepages: jbd2_start: 1205 pages, ino 756; err -30

	EXT4-fs (sda): ext4_da_writepages: jbd2_start: 17588 pages, ino 752; err -30

	EXT4-fs (sda): delayed block allocation failed for inode 755 at logical offset 669696 with max blocks 3268 with error -30

	This should not happen!!  Data will be lost
	EXT4-fs (sda): ext4_da_writepages: jbd2_start: 3268 pages, ino 755; err -30

This happened while stress-testing the machine [1] with

	stress -d 5 --hdd-bytes 10G --hdd-noclean

This command is supposed to fill the disk and then exit with "no space
left". Instead, the above errors occured when the fs was 100% full.
There's nothing in the logs of the Raid system.

Is this a known issue? Who's to blame, qla2xxx or ext4?

Thanks
Andre

[1] Vanilla 2.6.34-rc4, a 7.3T ext4 fs on a FC Infortrend hardware
raid, connected to a qlogic hba. Let me know if you need more info.
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-16 12:35 ext4: (2.6.34-rc4): This should not happen!! Data will be lost Andre Noll
@ 2010-04-16 14:29 ` tytso
  2010-04-16 15:02 ` Eric Sandeen
  1 sibling, 0 replies; 24+ messages in thread
From: tytso @ 2010-04-16 14:29 UTC (permalink / raw)
  To: Andre Noll; +Cc: linux-ext4, linux-driver, Thomas Helle

On Fri, Apr 16, 2010 at 02:35:26PM +0200, Andre Noll wrote:
> Hi
> 
> I'm seeing the following errors when writing to an ext4 file system:
> 
> 	qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 fa6a73 2002.
> 	end_request: I/O error, dev sda, sector 7812771880

These look like hardware errors; it may not be the qla2xxx driver's
fault, but actually the fault of the hard disk....

       	   	    	      	     	  - Ted

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-16 12:35 ext4: (2.6.34-rc4): This should not happen!! Data will be lost Andre Noll
  2010-04-16 14:29 ` tytso
@ 2010-04-16 15:02 ` Eric Sandeen
  2010-04-16 15:30   ` Andre Noll
  1 sibling, 1 reply; 24+ messages in thread
From: Eric Sandeen @ 2010-04-16 15:02 UTC (permalink / raw)
  To: Andre Noll; +Cc: linux-ext4, linux-driver, Thomas Helle

On 04/16/2010 07:35 AM, Andre Noll wrote:
> Hi
> 
> I'm seeing the following errors when writing to an ext4 file system:
> 
> 	qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 fa6a73 2002.
> 	end_request: I/O error, dev sda, sector 7812771880

Sounds like a hardware failure

> 	Aborting journal on device sda-8.
> 	EXT4-fs (sda): delayed block allocation failed for inode 752 at logical offset 1982464 with max blocks 17588 with error -30
> 
> 	This should not happen!!  Data will be lost

ext4 is just the messenger here, I think.

-Eric

> 	EXT4-fs error (device sda) in ext4_da_writepages: Journal has aborted
> 	EXT4-fs error (device sda): ext4_journal_start_sb: Detected aborted journal
> 	EXT4-fs (sda): Remounting filesystem read-only
> 	EXT4-fs (sda): delayed block allocation failed for inode 756 at logical offset 177656 with max blocks 1205 with error -30
> 	EXT4-fs error (device sda) in ext4_da_write_end: IO failure
> 
> 	This should not happen!!  Data will be lost
> 	EXT4-fs (sda): ext4_da_writepages: jbd2_start: 1205 pages, ino 756; err -30
> 
> 	EXT4-fs (sda): ext4_da_writepages: jbd2_start: 17588 pages, ino 752; err -30
> 
> 	EXT4-fs (sda): delayed block allocation failed for inode 755 at logical offset 669696 with max blocks 3268 with error -30
> 
> 	This should not happen!!  Data will be lost
> 	EXT4-fs (sda): ext4_da_writepages: jbd2_start: 3268 pages, ino 755; err -30
> 
> This happened while stress-testing the machine [1] with
> 
> 	stress -d 5 --hdd-bytes 10G --hdd-noclean
> 
> This command is supposed to fill the disk and then exit with "no space
> left". Instead, the above errors occured when the fs was 100% full.
> There's nothing in the logs of the Raid system.
> 
> Is this a known issue? Who's to blame, qla2xxx or ext4?
> 
> Thanks
> Andre
> 
> [1] Vanilla 2.6.34-rc4, a 7.3T ext4 fs on a FC Infortrend hardware
> raid, connected to a qlogic hba. Let me know if you need more info.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-16 15:02 ` Eric Sandeen
@ 2010-04-16 15:30   ` Andre Noll
  2010-04-16 15:43     ` Eric Sandeen
  0 siblings, 1 reply; 24+ messages in thread
From: Andre Noll @ 2010-04-16 15:30 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-ext4, linux-driver, Thomas Helle

[-- Attachment #1: Type: text/plain, Size: 864 bytes --]

On 10:02, Eric Sandeen wrote:

> > I'm seeing the following errors when writing to an ext4 file system:
> > 
> > 	qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 fa6a73 2002.
> > 	end_request: I/O error, dev sda, sector 7812771880
> 
> Sounds like a hardware failure

Yes, or a problem with the qla2xxx driver. However, it's a bit
surprising that the hardware survived the stress test for more than
one day and went south just at the moment the fs was completely filled.

My first guess was that the problem might be caused by a reported
device size which is slightly larger than the actual size. But then

	ddrescue -i 7440Gi /dev/zero /dev/sda

would have resulted in a similar error. This command completed without
errors though.

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-16 15:30   ` Andre Noll
@ 2010-04-16 15:43     ` Eric Sandeen
  2010-04-16 15:52       ` Andrew Vasquez
  2010-04-16 15:57       ` Andre Noll
  0 siblings, 2 replies; 24+ messages in thread
From: Eric Sandeen @ 2010-04-16 15:43 UTC (permalink / raw)
  To: Andre Noll; +Cc: linux-ext4, linux-driver, Thomas Helle

On 04/16/2010 10:30 AM, Andre Noll wrote:
> On 10:02, Eric Sandeen wrote:
> 
>>> I'm seeing the following errors when writing to an ext4 file system:
>>>
>>> 	qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 fa6a73 2002.
>>> 	end_request: I/O error, dev sda, sector 7812771880
>>
>> Sounds like a hardware failure
> 
> Yes, or a problem with the qla2xxx driver. However, it's a bit
> surprising that the hardware survived the stress test for more than
> one day and went south just at the moment the fs was completely filled.

Hm, that's true.

Well, the error was on sector 7812771880 of the device ...

can you reliably write directly to that sector w/o the filesystem in
between?

-Eric

> My first guess was that the problem might be caused by a reported
> device size which is slightly larger than the actual size. But then
> 
> 	ddrescue -i 7440Gi /dev/zero /dev/sda
> 
> would have resulted in a similar error. This command completed without
> errors though.
> 
> Thanks
> Andre


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-16 15:43     ` Eric Sandeen
@ 2010-04-16 15:52       ` Andrew Vasquez
  2010-04-16 16:08         ` Andre Noll
  2010-04-16 15:57       ` Andre Noll
  1 sibling, 1 reply; 24+ messages in thread
From: Andrew Vasquez @ 2010-04-16 15:52 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Andre Noll, linux-ext4@vger.kernel.org, Linux Driver,
	Thomas Helle

On Fri, 16 Apr 2010, Eric Sandeen wrote:

> On 04/16/2010 10:30 AM, Andre Noll wrote:
> > On 10:02, Eric Sandeen wrote:
> > 
> >>> I'm seeing the following errors when writing to an ext4 file system:
> >>>
> >>> 	qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 fa6a73 2002.
> >>> 	end_request: I/O error, dev sda, sector 7812771880
> >>
> >> Sounds like a hardware failure
> > 
> > Yes, or a problem with the qla2xxx driver. However, it's a bit
> > surprising that the hardware survived the stress test for more than
> > one day and went south just at the moment the fs was completely filled.
> 
> Hm, that's true.
> 
> Well, the error was on sector 7812771880 of the device ...
> 
> can you reliably write directly to that sector w/o the filesystem in
> between?

The driver here is simply reporting that the a command's timeout value
had been exceeded and the mid-layer began error-recovery with a request
to 'abort' the command.  The abort-request completed successfully:

	qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 fa6a73 2002.

I can't explain why the storage did not complete the request in the
allotted time.

-- av

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-16 15:52       ` Andrew Vasquez
@ 2010-04-16 16:08         ` Andre Noll
  2010-04-16 16:36           ` Andrew Vasquez
  0 siblings, 1 reply; 24+ messages in thread
From: Andre Noll @ 2010-04-16 16:08 UTC (permalink / raw)
  To: Andrew Vasquez
  Cc: Eric Sandeen, linux-ext4@vger.kernel.org, Linux Driver,
	Thomas Helle

[-- Attachment #1: Type: text/plain, Size: 848 bytes --]

On 08:52, Andrew Vasquez wrote:
> The driver here is simply reporting that the a command's timeout value
> had been exceeded and the mid-layer began error-recovery with a request
> to 'abort' the command.  The abort-request completed successfully:
> 
> 	qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 fa6a73 2002.
> 
> I can't explain why the storage did not complete the request in the
> allotted time.

Ah, that's valuable information, thanks. The underlying Infortrend
Raid System is rather old but worked without any problems for several
years. We recently replaced its 400G disks by new 2T WD disks. Maybe
the new disks have longer response times, could that be the reason? And
is there a way to increase the timeout value?

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-16 16:08         ` Andre Noll
@ 2010-04-16 16:36           ` Andrew Vasquez
  2010-04-16 17:07             ` Andre Noll
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Vasquez @ 2010-04-16 16:36 UTC (permalink / raw)
  To: Andre Noll
  Cc: Eric Sandeen, linux-ext4@vger.kernel.org, Linux Driver,
	Thomas Helle

On Fri, 16 Apr 2010, Andre Noll wrote:

> On 08:52, Andrew Vasquez wrote:
> > The driver here is simply reporting that the a command's timeout value
> > had been exceeded and the mid-layer began error-recovery with a request
> > to 'abort' the command.  The abort-request completed successfully:
> > 
> > 	qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 fa6a73 2002.
> > 
> > I can't explain why the storage did not complete the request in the
> > allotted time.
> 
> Ah, that's valuable information, thanks. The underlying Infortrend
> Raid System is rather old but worked without any problems for several
> years. We recently replaced its 400G disks by new 2T WD disks. Maybe
> the new disks have longer response times, could that be the reason? And
> is there a way to increase the timeout value?

To update the default timeout value (30 seconds) for commands
submitted to /dev/sdn to 60 seconds:

        $ echo 60 > /sys/block/sdn/device/timeout

-- av

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-16 16:36           ` Andrew Vasquez
@ 2010-04-16 17:07             ` Andre Noll
  2010-04-17 16:55               ` Bernd Schubert
  0 siblings, 1 reply; 24+ messages in thread
From: Andre Noll @ 2010-04-16 17:07 UTC (permalink / raw)
  To: Andrew Vasquez
  Cc: Eric Sandeen, linux-ext4@vger.kernel.org, Linux Driver,
	Thomas Helle

[-- Attachment #1: Type: text/plain, Size: 947 bytes --]

On 09:36, Andrew Vasquez wrote:
> > > 	qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 fa6a73 2002.
> > > 
> > > I can't explain why the storage did not complete the request in the
> > > allotted time.
> > 
> > Ah, that's valuable information, thanks. The underlying Infortrend
> > Raid System is rather old but worked without any problems for several
> > years. We recently replaced its 400G disks by new 2T WD disks. Maybe
> > the new disks have longer response times, could that be the reason? And
> > is there a way to increase the timeout value?
> 
> To update the default timeout value (30 seconds) for commands
> submitted to /dev/sdn to 60 seconds:
> 
>         $ echo 60 > /sys/block/sdn/device/timeout

I will re-run the stress test with a 60 seconds timeout value and follow
up if this did not help.

Thanks again
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-16 17:07             ` Andre Noll
@ 2010-04-17 16:55               ` Bernd Schubert
  2010-04-17 18:19                 ` Andre Noll
  2010-04-17 20:45                 ` Eric Sandeen
  0 siblings, 2 replies; 24+ messages in thread
From: Bernd Schubert @ 2010-04-17 16:55 UTC (permalink / raw)
  To: Andre Noll
  Cc: Andrew Vasquez, Eric Sandeen, linux-ext4@vger.kernel.org,
	Linux Driver, Thomas Helle

On Friday 16 April 2010, Andre Noll wrote:
> On 09:36, Andrew Vasquez wrote:
> > > > 	qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 fa6a73
> > > > 2002.
> > > >
> > > > I can't explain why the storage did not complete the request in the
> > > > allotted time.
> > >
> > > Ah, that's valuable information, thanks. The underlying Infortrend
> > > Raid System is rather old but worked without any problems for several
> > > years. We recently replaced its 400G disks by new 2T WD disks. Maybe
> > > the new disks have longer response times, could that be the reason? And
> > > is there a way to increase the timeout value?
> >
> > To update the default timeout value (30 seconds) for commands
> > submitted to /dev/sdn to 60 seconds:
> >
> >         $ echo 60 > /sys/block/sdn/device/timeout
> 
> I will re-run the stress test with a 60 seconds timeout value and follow
> up if this did not help.

That will not help if the command is "SYNCHRONIZE_CACHE", as that ignores 
device settings, but uses scsi default timeout (30s), which is far too small 
for SATA based raid units. Scsi maintainers ignored that and a couple of other 
patches I wrote to improve error handling with Infortrend units. Will send the 
patches again soon.
Also, if the abort command succeeds, it the command should be re-queued and 
should not result in an error. I think my patches also would increase 
verbosity to point out what exactly happened (possibly a wrong return code in 
the qla2xxx driver, although that should activate the next step in error 
handling, I need to find some to go through the code...).

Altogether filesystem unrelated. The filesystem just might be the reason for a 
synchronize-cache, e.g. barriers, etc.

Greetings from Tübingen,
Bernd 
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-17 16:55               ` Bernd Schubert
@ 2010-04-17 18:19                 ` Andre Noll
  2010-04-17 18:43                   ` Bernd Schubert
  2010-04-17 20:45                 ` Eric Sandeen
  1 sibling, 1 reply; 24+ messages in thread
From: Andre Noll @ 2010-04-17 18:19 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: Andrew Vasquez, Eric Sandeen, linux-ext4@vger.kernel.org,
	Linux Driver, Thomas Helle

[-- Attachment #1: Type: text/plain, Size: 949 bytes --]

On 18:55, Bernd Schubert wrote:
> > > To update the default timeout value (30 seconds) for commands
> > > submitted to /dev/sdn to 60 seconds:
> > >
> > >         $ echo 60 > /sys/block/sdn/device/timeout
> > 
> > I will re-run the stress test with a 60 seconds timeout value and follow
> > up if this did not help.
> 
> That will not help if the command is "SYNCHRONIZE_CACHE", as that ignores 
> device settings, but uses scsi default timeout (30s), which is far too small 
> for SATA based raid units. Scsi maintainers ignored that and a couple of other 
> patches I wrote to improve error handling with Infortrend units. Will send the 
> patches again soon.

Please CC me when you do so. The machine I am having trouble with is
only our fallback server. I can use it freely for testing and am willing
to give your patches a try.

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-17 18:19                 ` Andre Noll
@ 2010-04-17 18:43                   ` Bernd Schubert
  0 siblings, 0 replies; 24+ messages in thread
From: Bernd Schubert @ 2010-04-17 18:43 UTC (permalink / raw)
  To: Andre Noll
  Cc: Andrew Vasquez, Eric Sandeen, linux-ext4@vger.kernel.org,
	Linux Driver, Thomas Helle

On Saturday 17 April 2010, Andre Noll wrote:
> On 18:55, Bernd Schubert wrote:
> > > > To update the default timeout value (30 seconds) for commands
> > > > submitted to /dev/sdn to 60 seconds:
> > > >
> > > >         $ echo 60 > /sys/block/sdn/device/timeout
> > >
> > > I will re-run the stress test with a 60 seconds timeout value and
> > > follow up if this did not help.
> >
> > That will not help if the command is "SYNCHRONIZE_CACHE", as that ignores
> > device settings, but uses scsi default timeout (30s), which is far too
> > small for SATA based raid units. Scsi maintainers ignored that and a
> > couple of other patches I wrote to improve error handling with Infortrend
> > units. Will send the patches again soon.
> 
> Please CC me when you do so. The machine I am having trouble with is
> only our fallback server. I can use it freely for testing and am willing
> to give your patches a try.

There is actually not much to test, as the patches had been the only solution 
to stabilize a large Lustre environment with dozens of Infortrend Raids. I 
spent months to debug Infortrend Raids, scsi stack and the LSI MPT fusion 
driver. Nowadays some patches are also used for DDN customers. 
I'm just always out of time to forward port it to more recent linux-git and to 
resend.


Cheers,
Bernd


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-17 16:55               ` Bernd Schubert
  2010-04-17 18:19                 ` Andre Noll
@ 2010-04-17 20:45                 ` Eric Sandeen
  2010-04-17 22:38                   ` Andre Noll
  1 sibling, 1 reply; 24+ messages in thread
From: Eric Sandeen @ 2010-04-17 20:45 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: Andre Noll, Andrew Vasquez, linux-ext4@vger.kernel.org,
	Linux Driver, Thomas Helle

Bernd Schubert wrote:
...

> Altogether filesystem unrelated. The filesystem just might be the reason for a 
> synchronize-cache, e.g. barriers, etc.

Interesting that it only happens at enospc time though - the fs would
be sending barriers in general usage.

Although, as ext4 fills, it does do more syncing/flushing to try to
eke out that last bit of space.... hmm....

Andre, just for fun, you might try mounting -o nodelalloc, fill it again,
and see if you see the same behavior.

If not, you might do it again w/ defaults, and capture blktrace data, which
I think will trace barrier requests as well.

(or, maybe also try testing with -o nobarrier)

I still don't think it's likely a filesystem problem but maybe you can
pinpoint the fs behavior that triggers it.

-Eric

> Greetings from Tübingen,
> Bernd 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-17 20:45                 ` Eric Sandeen
@ 2010-04-17 22:38                   ` Andre Noll
  2010-04-20 15:37                     ` Andre Noll
  0 siblings, 1 reply; 24+ messages in thread
From: Andre Noll @ 2010-04-17 22:38 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Bernd Schubert, Andrew Vasquez, linux-ext4@vger.kernel.org,
	Linux Driver, Thomas Helle

[-- Attachment #1: Type: text/plain, Size: 1211 bytes --]

On 15:45, Eric Sandeen wrote:
> Interesting that it only happens at enospc time though - the fs would
> be sending barriers in general usage.
> 
> Although, as ext4 fills, it does do more syncing/flushing to try to
> eke out that last bit of space.... hmm....
> 
> Andre, just for fun, you might try mounting -o nodelalloc, fill it again,
> and see if you see the same behavior.

Will do. This will take some time though as I destroyed the fs by
writing directly to the device, so I had to recreate it and start from
scratch. I'm currently filling the new fs using a 60s device timeout as
suggested by Andrew.

> If not, you might do it again w/ defaults, and capture blktrace data, which
> I think will trace barrier requests as well.
> 
> (or, maybe also try testing with -o nobarrier)
> 
> I still don't think it's likely a filesystem problem but maybe you can
> pinpoint the fs behavior that triggers it.

I'll try to reproduce the problem using different timeout values and the
ext4 options you suggest. If I can find a reliable reproducer, I'll run
blktrace and post the results.

Regards
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-17 22:38                   ` Andre Noll
@ 2010-04-20 15:37                     ` Andre Noll
  2010-04-20 16:51                       ` Eric Sandeen
  2010-04-21  8:57                       ` Dmitry Monakhov
  0 siblings, 2 replies; 24+ messages in thread
From: Andre Noll @ 2010-04-20 15:37 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Bernd Schubert, Andrew Vasquez, linux-ext4@vger.kernel.org,
	Linux Driver, Thomas Helle

[-- Attachment #1: Type: text/plain, Size: 3493 bytes --]

On 00:38, Andre Noll wrote:
> > I still don't think it's likely a filesystem problem but maybe you can
> > pinpoint the fs behavior that triggers it.
> 
> I'll try to reproduce the problem using different timeout values and the
> ext4 options you suggest. If I can find a reliable reproducer, I'll run
> blktrace and post the results.

Here are some results. Prior to running the tests I wrote a bunch of
10G files and then filled the fs completely with 2T files containing
zeros.  Each of the tests below consisted of three runs of

	- remove 5 of the above 10G files to make 50G space available
	- run stress -d 5 --hdd-bytes 10G --hdd-noclean until it dies
	- run fsck if any fs errors occured

Summary: Increasing the device timeout to 60s _or_ disabling barriers
makes the problem go away. Deactivating delayed allocation makes the
problem worse.

- device timeout 60s, default ext4 parameters
	No problems at all, all three runs OK

- device timeout 30s, default ext4 parameters
	1. OK
	2. dmesg:
		qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 2ea270b 2002.
		end_request: I/O error, dev sda, sector 7812889640
		Aborting journal on device sda-8.
		EXT4-fs error (device sda): ext4_journal_start_sb: Detected aborted journal
		EXT4-fs (sda): Remounting filesystem read-only

	fsck:
			Inode 287, i_blocks is 4294918568, should be 416.  Fix? yes
			Inode 288, i_size is 2198897426432, should be 2199023251456.  Fix? yes
			Inode 288, i_blocks is 4294721960, should be 416.  Fix? yes

	3.
		qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 2ece6a8 2002.
		qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 2ece6dc 2002.
		end_request: I/O error, dev sda, sector 7812690136
		Aborting journal on device sda-8.
		EXT4-fs error (device sda) in ext4_free_blocks: Journal has aborted
		EXT4-fs error (device sda) in ext4_ext_remove_space: Journal has aborted
		EXT4-fs error (device sda) in ext4_reserve_inode_write: Journal has aborted
		EXT4-fs error (device sda) in ext4_ext_truncate: Journal has aborted
		EXT4-fs error (device sda) in ext4_reserve_inode_write: Journal has aborted
		EXT4-fs error (device sda) in ext4_orphan_del: Journal has aborted
		EXT4-fs error (device sda) in ext4_reserve_inode_write: Journal has aborted
		EXT4-fs error (device sda) in ext4_delete_inode: Journal has aborted
		EXT4-fs error (device sda): ext4_journal_start_sb: Detected aborted journal
		EXT4-fs (sda): Remounting filesystem read-only

		fsck:
			e2fsck 1.41.10 (10-Feb-2009)
			/dev/sda: recovering journal
			Clearing orphaned inode 179 (uid=0, gid=0, mode=0100600, size=0)
			/dev/sda: clean, 301/244158464 files, 1935004235/1953247232 blocks


- device timeout 30s, nodelalloc

	This seems to trigger the problem more reliably:

	1.
		dmesg: same qla, ext4 errors as above
		fsck: orphaned inodes as above
	2. and 3.
		errors already while removing files:
			rm: cannot remove `stress.98q1gG': Read-only file system
		dmesg: same qla/ext4 errors, but also
			JBD2: Detected IO errors while flushing file data on sda-8
		fsck: clean

- device timeout 30s, nobarrier
	No problem at all, all three runs OK.

Eric, are you still interested in seeing the blktrace output? Suppose,
I should use a 30s timeout, nodealloc and barriers=1 as this triggers
the problem within minutes.


Regards
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-20 15:37                     ` Andre Noll
@ 2010-04-20 16:51                       ` Eric Sandeen
  2010-04-20 17:26                         ` Bernd Schubert
  2010-04-20 17:46                         ` Andre Noll
  2010-04-21  8:57                       ` Dmitry Monakhov
  1 sibling, 2 replies; 24+ messages in thread
From: Eric Sandeen @ 2010-04-20 16:51 UTC (permalink / raw)
  To: Andre Noll
  Cc: Bernd Schubert, Andrew Vasquez, linux-ext4@vger.kernel.org,
	Linux Driver, Thomas Helle

On 04/20/2010 10:37 AM, Andre Noll wrote:
...

> - device timeout 30s, nobarrier
> 	No problem at all, all three runs OK.
> 
> Eric, are you still interested in seeing the blktrace output? Suppose,
> I should use a 30s timeout, nodealloc and barriers=1 as this triggers
> the problem within minutes.

Hm, so something about barriers being issued is causing timeout
problems on the device...?

since nodelalloc made it -worse- I suppose that rules out the problems
with fs flushing near ensopc, that should only be happening with delalloc.

To be honest I'm not sure what I would do with the blktrace data now,
but perhaps someone might be interested to see it; I'd at least gather it.

-Eric

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-20 16:51                       ` Eric Sandeen
@ 2010-04-20 17:26                         ` Bernd Schubert
  2010-04-20 18:35                           ` Andre Noll
  2010-04-20 17:46                         ` Andre Noll
  1 sibling, 1 reply; 24+ messages in thread
From: Bernd Schubert @ 2010-04-20 17:26 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Andre Noll, Andrew Vasquez, linux-ext4@vger.kernel.org,
	Linux Driver, Thomas Helle

On Tuesday 20 April 2010, Eric Sandeen wrote:
> On 04/20/2010 10:37 AM, Andre Noll wrote:
> ...
> 
> > - device timeout 30s, nobarrier
> > 	No problem at all, all three runs OK.
> >
> > Eric, are you still interested in seeing the blktrace output? Suppose,
> > I should use a 30s timeout, nodealloc and barriers=1 as this triggers
> > the problem within minutes.
> 
> Hm, so something about barriers being issued is causing timeout
> problems on the device...?

I think interesting at this point would be the exact model of the Infortrend 
device. There are some completely broken models (IMHO), which have two 
controllers for redundancy. Now with enabled write-back cache, it can happen 
that those units run into some kind of firmware bug. It then takes about 2h to 
flush 2GB of write-back cache. The telnet interface will show the status of 
the cache. More recent IFT dual-controller units do not suffer from this bug 
anymore, but as Andre said, they are using an old unit...

Thanks,
Bernd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-20 17:26                         ` Bernd Schubert
@ 2010-04-20 18:35                           ` Andre Noll
  2010-04-20 20:09                             ` Bernd Schubert
  0 siblings, 1 reply; 24+ messages in thread
From: Andre Noll @ 2010-04-20 18:35 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: Eric Sandeen, Andrew Vasquez, linux-ext4@vger.kernel.org,
	Linux Driver, Thomas Helle

[-- Attachment #1: Type: text/plain, Size: 1458 bytes --]

On 19:26, Bernd Schubert wrote:
> On Tuesday 20 April 2010, Eric Sandeen wrote:
> I think interesting at this point would be the exact model of the Infortrend 
> device.

Here's the system information as reported by the telnet interface:

	CPU Type             PPC750FX        
	Total Cache Size     2048MB DDR(ECC) 
	Firmware Version     3.42I.03        
	Bootrecord Version   1.23A           
	FW Upgradability     Rev. C          
	Serial Number        6912121         
	Battery Backup Unit  Present         
	Base Board Rev. ID   0               
	Base Board ID        81              
	ID of NVRAM Defaults A16F-G2221 V6.10
	Controller Position  Slot A          

> There are some completely broken models (IMHO), which have two 
> controllers for redundancy.

This is a 4 year old system (which does not support Raid6). It has only
a single controller though.

> Now with enabled write-back cache, it can happen that those units run
> into some kind of firmware bug. It then takes about 2h to flush 2GB of
> write-back cache.  The telnet interface will show the status of the
> cache.

Hey, I saw this once on a different (newer) infortrend system. However,
it might still be hapening on this system as well and cause the timeout
problems.

Guess I'll have to check if there's a more recent firmware for this
system..

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-20 18:35                           ` Andre Noll
@ 2010-04-20 20:09                             ` Bernd Schubert
  0 siblings, 0 replies; 24+ messages in thread
From: Bernd Schubert @ 2010-04-20 20:09 UTC (permalink / raw)
  To: Andre Noll
  Cc: Eric Sandeen, Andrew Vasquez, linux-ext4@vger.kernel.org,
	Linux Driver, Thomas Helle

On Tuesday 20 April 2010, Andre Noll wrote:
> On 19:26, Bernd Schubert wrote:
> > On Tuesday 20 April 2010, Eric Sandeen wrote:
> > I think interesting at this point would be the exact model of the
> > Infortrend device.
> 
> Here's the system information as reported by the telnet interface:
> 
> 	CPU Type             PPC750FX
> 	Total Cache Size     2048MB DDR(ECC)
> 	Firmware Version     3.42I.03
> 	Bootrecord Version   1.23A
> 	FW Upgradability     Rev. C
> 	Serial Number        6912121
> 	Battery Backup Unit  Present
> 	Base Board Rev. ID   0
> 	Base Board ID        81
> 	ID of NVRAM Defaults A16F-G2221 V6.10
> 	Controller Position  Slot A
> 
> > There are some completely broken models (IMHO), which have two
> > controllers for redundancy.
> 
> This is a 4 year old system (which does not support Raid6). It has only
> a single controller though.

I don't have any experience with that model.

> 
> > Now with enabled write-back cache, it can happen that those units run
> > into some kind of firmware bug. It then takes about 2h to flush 2GB of
> > write-back cache.  The telnet interface will show the status of the
> > cache.
> 
> Hey, I saw this once on a different (newer) infortrend system. However,
> it might still be hapening on this system as well and cause the timeout
> problems.

I think the dual-controller models that work fine have have SAS-interlink. 
Infortrend never confirmed the issue, but I guess it is related to cache-
coherency between both controllers. 
There are also other cache related firmware bugs, when it fails to flush the 
cache at all. Scsi commands then time out, it enters recovery, properly 
responds to scsi commands, resumes normal operation and fails those commands 
again. Even with software raid out of several of those hardware raids, this 
fail-recover-fail loop prevents suitable operation. Also part of my scsi 
patches to limit number of recoveries within a time limit. This issue should 
be fixed with recent firmware version, though. But depending on your model, 
those fixed version might not be available.

> 
> Guess I'll have to check if there's a more recent firmware for this
> system..

At least worth a try.

Cheers,
Bernd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-20 16:51                       ` Eric Sandeen
  2010-04-20 17:26                         ` Bernd Schubert
@ 2010-04-20 17:46                         ` Andre Noll
  2010-04-22  8:21                           ` Andre Noll
  1 sibling, 1 reply; 24+ messages in thread
From: Andre Noll @ 2010-04-20 17:46 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Bernd Schubert, Andrew Vasquez, linux-ext4@vger.kernel.org,
	Linux Driver, Thomas Helle

[-- Attachment #1: Type: text/plain, Size: 775 bytes --]

On 11:51, Eric Sandeen wrote:
> On 04/20/2010 10:37 AM, Andre Noll wrote:
> ...
> 
> > - device timeout 30s, nobarrier
> > 	No problem at all, all three runs OK.
> > 
> > Eric, are you still interested in seeing the blktrace output? Suppose,
> > I should use a 30s timeout, nodealloc and barriers=1 as this triggers
> > the problem within minutes.
> 
> Hm, so something about barriers being issued is causing timeout
> problems on the device...?

Yes, that's what it looks like.

> To be honest I'm not sure what I would do with the blktrace data now,
> but perhaps someone might be interested to see it; I'd at least gather it.

OK, will send it tomorrow.

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-20 17:46                         ` Andre Noll
@ 2010-04-22  8:21                           ` Andre Noll
  0 siblings, 0 replies; 24+ messages in thread
From: Andre Noll @ 2010-04-22  8:21 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Bernd Schubert, Andrew Vasquez, linux-ext4@vger.kernel.org,
	Linux Driver, Thomas Helle

[-- Attachment #1: Type: text/plain, Size: 2830 bytes --]

On 19:46, Andre Noll wrote:
> > To be honest I'm not sure what I would do with the blktrace data now,
> > but perhaps someone might be interested to see it; I'd at least gather it.
> 
> OK, will send it tomorrow.

Here we go.

mount | grep sda:

        /dev/sda on /mnt type ext4 (ro,relatime,barrier=1,data=ordered)

command:

        ls stress.* | head -n 5 | xargs rm && stress -d 5 --hdd-noclean --hdd-bytes 10G

dmesg:

        EXT4-fs (sda): mounted filesystem with ordered data mode
        qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 2f82d1 2002.
        qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 2f82d3 2002.
        end_request: I/O error, dev sda, sector 7812689752
        Aborting journal on device sda-8.
        qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 2f82d8 2002.
        EXT4-fs (sda): delayed block allocation failed for inode 4198406 at logical offset 0 with max blocks 32768 with error -30

        This should not happen!!  Data will be lost
        EXT4-fs error (device sda) in ext4_da_writepages: Journal has aborted
        qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 2f82da 2002.
        qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 2f82db 2002.
        EXT4-fs (sda): delayed block allocation failed for inode 4198435 at logical offset 0 with max blocks 32768 with error -30

        This should not happen!!  Data will be lost
        EXT4-fs error (device sda): ext4_journal_start_sb: Detected aborted journal
        EXT4-fs (sda): Remounting filesystem read-only
        EXT4-fs (sda): ext4_da_writepages: jbd2_start: 32768 pages, ino 4198406; err -30

        EXT4-fs (sda): ext4_da_writepages: jbd2_start: 32768 pages, ino 4198435; err -30

        qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 2f82df 2002.
        EXT4-fs (sda): delayed block allocation failed for inode 4198433 at logical offset 0 with max blocks 32768 with error -30

        This should not happen!!  Data will be lost
        EXT4-fs (sda): delayed block allocation failed for inode 4198437 at logical offset 0 with max blocks 32768 with error -30

        This should not happen!!  Data will be lost
        EXT4-fs (sda): ext4_da_writepages: jbd2_start: 32768 pages, ino 4198433; err -30

        EXT4-fs (sda): ext4_da_writepages: jbd2_start: 32768 pages, ino 4198437; err -30

        qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 2f82e2 2002.
        ...

The output of "btrace /dev/sda" during the above command is 1.7M
large. It's at

        http://systemlinux.org/~maan/btrace.sda

I'll run further tests with barrier=0 and 60s timeout.

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-20 15:37                     ` Andre Noll
  2010-04-20 16:51                       ` Eric Sandeen
@ 2010-04-21  8:57                       ` Dmitry Monakhov
  2010-04-21 13:47                         ` Andre Noll
  1 sibling, 1 reply; 24+ messages in thread
From: Dmitry Monakhov @ 2010-04-21  8:57 UTC (permalink / raw)
  To: Andre Noll
  Cc: Eric Sandeen, Bernd Schubert, Andrew Vasquez,
	linux-ext4@vger.kernel.org, Linux Driver, Thomas Helle

Andre Noll <maan@systemlinux.org> writes:

> On 00:38, Andre Noll wrote:
>> > I still don't think it's likely a filesystem problem but maybe you can
>> > pinpoint the fs behavior that triggers it.
>> 
>> I'll try to reproduce the problem using different timeout values and the
>> ext4 options you suggest. If I can find a reliable reproducer, I'll run
>> blktrace and post the results.
>
> Here are some results. Prior to running the tests I wrote a bunch of
> 10G files and then filled the fs completely with 2T files containing
> zeros.  Each of the tests below consisted of three runs of
>
> 	- remove 5 of the above 10G files to make 50G space available
> 	- run stress -d 5 --hdd-bytes 10G --hdd-noclean until it dies
what 'stress' process do?  was it posted already?
> 	- run fsck if any fs errors occured
>
> Summary: Increasing the device timeout to 60s _or_ disabling barriers
> makes the problem go away. Deactivating delayed allocation makes the
> problem worse.
2Gb cache is really huge.
barriers=0 , result in less disk wcache activity, but more real IO
And nodelaloc result in more real IO due, so imho this is looks like
device issue.
about nodelalloc: It is unlikely to see "This should not happen!! 
Data will be lost" because this message appear from writepage
so may happens only when you rewrite an existing file(below i_size).
BTW, you already noted that you have performed some stress on the device
without filesystem. What was they doing?
IMHO you will able to reproduce the issue without fs by performing
random writes/reads to the device from several tasks in parallel.
>
> - device timeout 60s, default ext4 parameters
> 	No problems at all, all three runs OK
>
> - device timeout 30s, default ext4 parameters
> 	1. OK
> 	2. dmesg:
> 		qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 2ea270b 2002.
> 		end_request: I/O error, dev sda, sector 7812889640
> 		Aborting journal on device sda-8.
> 		EXT4-fs error (device sda): ext4_journal_start_sb: Detected aborted journal
> 		EXT4-fs (sda): Remounting filesystem read-only
>
> 	fsck:
> 			Inode 287, i_blocks is 4294918568, should be 416.  Fix? yes
> 			Inode 288, i_size is 2198897426432, should be 2199023251456.  Fix? yes
> 			Inode 288, i_blocks is 4294721960, should be 416.  Fix? yes
>
> 	3.
> 		qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 2ece6a8 2002.
> 		qla2xxx 0000:06:09.0: scsi(0:0:0): Abort command issued -- 1 2ece6dc 2002.
> 		end_request: I/O error, dev sda, sector 7812690136
> 		Aborting journal on device sda-8.
> 		EXT4-fs error (device sda) in ext4_free_blocks: Journal has aborted
> 		EXT4-fs error (device sda) in ext4_ext_remove_space: Journal has aborted
> 		EXT4-fs error (device sda) in ext4_reserve_inode_write: Journal has aborted
> 		EXT4-fs error (device sda) in ext4_ext_truncate: Journal has aborted
> 		EXT4-fs error (device sda) in ext4_reserve_inode_write: Journal has aborted
> 		EXT4-fs error (device sda) in ext4_orphan_del: Journal has aborted
> 		EXT4-fs error (device sda) in ext4_reserve_inode_write: Journal has aborted
> 		EXT4-fs error (device sda) in ext4_delete_inode: Journal has aborted
> 		EXT4-fs error (device sda): ext4_journal_start_sb: Detected aborted journal
> 		EXT4-fs (sda): Remounting filesystem read-only
>
> 		fsck:
> 			e2fsck 1.41.10 (10-Feb-2009)
> 			/dev/sda: recovering journal
> 			Clearing orphaned inode 179 (uid=0, gid=0, mode=0100600, size=0)
> 			/dev/sda: clean, 301/244158464 files, 1935004235/1953247232 blocks
>
>
> - device timeout 30s, nodelalloc
>
> 	This seems to trigger the problem more reliably:
>
> 	1.
> 		dmesg: same qla, ext4 errors as above
> 		fsck: orphaned inodes as above
> 	2. and 3.
> 		errors already while removing files:
> 			rm: cannot remove `stress.98q1gG': Read-only file system
> 		dmesg: same qla/ext4 errors, but also
> 			JBD2: Detected IO errors while flushing file data on sda-8
> 		fsck: clean
>
> - device timeout 30s, nobarrier
> 	No problem at all, all three runs OK.
>
> Eric, are you still interested in seeing the blktrace output? Suppose,
> I should use a 30s timeout, nodealloc and barriers=1 as this triggers
> the problem within minutes.
>
>
> Regards
> Andre

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-21  8:57                       ` Dmitry Monakhov
@ 2010-04-21 13:47                         ` Andre Noll
  0 siblings, 0 replies; 24+ messages in thread
From: Andre Noll @ 2010-04-21 13:47 UTC (permalink / raw)
  To: Dmitry Monakhov
  Cc: Eric Sandeen, Bernd Schubert, Andrew Vasquez,
	linux-ext4@vger.kernel.org, Linux Driver, Thomas Helle

[-- Attachment #1: Type: text/plain, Size: 2209 bytes --]

On 12:57, Dmitry Monakhov wrote:
> > 	- run stress -d 5 --hdd-bytes 10G --hdd-noclean until it dies
> what 'stress' process do?  was it posted already?

stress is a simple, yet useful program which imposes certain types of
stress on a machine. With the above command line options, it simply
writes 5 files in parallel, each 10G large, in an endless loop until
the file system is full (or becomes read-only due to errors). It
helped me more than once to identify hardware or software problems,
_before_ the machine went into production use.

> > Summary: Increasing the device timeout to 60s _or_ disabling barriers
> > makes the problem go away. Deactivating delayed allocation makes the
> > problem worse.
> 2Gb cache is really huge.

Really? This is a four year old el-cheapo hardware raid system with
16 SATA slots. You can easily spend twice the money and get much more
cache memory then.

> barriers=0 , result in less disk wcache activity, but more real IO
> And nodelaloc result in more real IO due, so imho this is looks like
> device issue.

Yes, I think we all agree that the problem is not ext4-related but
is most likely an issue with the infortrend hardware. However, ext4
seems to be very good at triggering that particular problem.

> about nodelalloc: It is unlikely to see "This should not happen!! 
> Data will be lost" because this message appear from writepage
> so may happens only when you rewrite an existing file(below i_size).

Nope, this definitely occured while stress was writing new files and
the file system was nearly full.

> BTW, you already noted that you have performed some stress on the device
> without filesystem. What was they doing?

I only ran ddrescue /dev/sda /dev/null once to make sure everything
is readable. This completed with no problems, so I created an ext4
file system and used the above stress command which resulted in write
errors. I then used ddrescue again to rewrite the sector on which
the error occured. This also succeeded which indicates a transient
problem, i.e. no problem with the particular sector.

Regards
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: ext4: (2.6.34-rc4): This should not happen!!  Data will be lost
  2010-04-16 15:43     ` Eric Sandeen
  2010-04-16 15:52       ` Andrew Vasquez
@ 2010-04-16 15:57       ` Andre Noll
  1 sibling, 0 replies; 24+ messages in thread
From: Andre Noll @ 2010-04-16 15:57 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-ext4, linux-driver, Thomas Helle

[-- Attachment #1: Type: text/plain, Size: 317 bytes --]

On 10:43, Eric Sandeen wrote:

> Well, the error was on sector 7812771880 of the device ...
> 
> can you reliably write directly to that sector w/o the filesystem in
> between?

Yes, that does not result in any errors.

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2010-04-22  8:21 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-16 12:35 ext4: (2.6.34-rc4): This should not happen!! Data will be lost Andre Noll
2010-04-16 14:29 ` tytso
2010-04-16 15:02 ` Eric Sandeen
2010-04-16 15:30   ` Andre Noll
2010-04-16 15:43     ` Eric Sandeen
2010-04-16 15:52       ` Andrew Vasquez
2010-04-16 16:08         ` Andre Noll
2010-04-16 16:36           ` Andrew Vasquez
2010-04-16 17:07             ` Andre Noll
2010-04-17 16:55               ` Bernd Schubert
2010-04-17 18:19                 ` Andre Noll
2010-04-17 18:43                   ` Bernd Schubert
2010-04-17 20:45                 ` Eric Sandeen
2010-04-17 22:38                   ` Andre Noll
2010-04-20 15:37                     ` Andre Noll
2010-04-20 16:51                       ` Eric Sandeen
2010-04-20 17:26                         ` Bernd Schubert
2010-04-20 18:35                           ` Andre Noll
2010-04-20 20:09                             ` Bernd Schubert
2010-04-20 17:46                         ` Andre Noll
2010-04-22  8:21                           ` Andre Noll
2010-04-21  8:57                       ` Dmitry Monakhov
2010-04-21 13:47                         ` Andre Noll
2010-04-16 15:57       ` Andre Noll

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).