public inbox for linux-kernel@vger.kernel.org
From: Gene Heskett <gene.heskett@gmail.com>
To: "Lars Täuber" <taeuber@bbaw.de>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [PROBLEM] reproduceable storage errors on high IO load
Date: Mon, 6 Jun 2011 05:59:02 -0400
Message-ID: <201106060559.03131.gene.heskett@gmail.com>
In-Reply-To: <20110606095127.21d23a70.taeuber@bbaw.de>

On Monday, June 06, 2011, Lars Täuber wrote:
>Hello!
>
>This is a message originally sent to linux-scsi.
>I got no reply, so I think that was the wrong ML.
>Please tell me if I should send more specific information about
>something. I have been struggling with this problem since January. It
>prevents me from running a backup server in production.
>
>Thank you.
>Lars
>
>
>
>Hi there,
>
>I have a problem with a SW-RAID6. It is reproducible even after replacing
>all of the hardware. I started with Suse 11.2. The problem occurred
>while writing a lot of data to the array (high I/O load). This is hopefully
>the right ML for my problem. Otherwise please excuse me and point me to
>the right ML.
>
>
>Then I changed the PSU. Still errors on high load.
>Then I replaced the SATA controller (Sil 3114, driver: sata_sil) with one
>using a different chipset (driver: sata_mv). Still errors on high load.
>Then I changed the disk enclosure and all cables. Still errors.
>Then I replaced the mainboard (Tyan Opteron) with one from Supermicro
>(H8SCM-F) with a 6-core Opteron. Still errors.
>Then I switched to Ubuntu 10.04 -> 10.10. Still errors.
>Then I tried different I/O schedulers (noop, anticipatory, cfq, deadline).
>Still errors.
>Then I tried the kernel options noapic + acpi=off, without luck.
>Then I replaced the SATA controller with an Areca SAS controller (driver:
>mvsas). Still errors.
>Then I tried some different HDDs (orig: Western Digital WDC WD2002FYPS +
>WDC WD2003FYYS; new: Seagate ST3320620NS). Still errors.
>Then I tried some different kernel versions from Ubuntu, without luck:
>2.6.32-22-server
>2.6.35-25-server
>
>Then I tried self compiled kernels without luck:
>2.6.35.13
>2.6.38.6
>2.6.39: same problem occurs but later
>
>The current configuration:
>- tested only 64-bit kernels
>- Supermicro H8SCM-F (AMD SR5650+SP5100) with 6-core opteron
>- Areca (non-raid) ARC-1300ix-16 sas controller
>- SW-RAID6 over 8 Western Digital HDDs (some WDC WD2002FYPS + some WDC
>  WD2003FYYS)
>- redundant PSU
>
>How to reproduce my problem:
>mdadm -C /dev/md3 -l6 -n8 /dev/sd[c-h] missing missing
>(the two missing hdds prevent this raid from initial sync)
>
>Everything is fine up to this point.
>Now produce high I/O load:
>mke2fs -j /dev/md3
>
>The detailed history (search for Lars to get my posts):
>https://bugs.launchpad.net/ubuntu/+bug/550559
>
>The error messages changed a bit across the kernel versions.
>The nearly complete dmesg output:
>https://launchpadlibrarian.net/72325163/20110524.dmesg.out
>
>Am I doing something wrong? Could someone help me debug this?
>Thanks
>Lars

Looking at your dmesg, I get the impression you have a bunch of disks that
are in need of a firmware update.  Unfortunately, the dmesg snippet does not
include the drive discovery and identification data.
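
If the early boot messages have already scrolled out of the dmesg buffer,
the same identification data can usually be read back from sysfs.  A minimal
sketch, assuming the disks show up as sd* and that the SCSI layer exposes the
usual vendor/model/rev attributes under /sys/block/<disk>/device (the helper
names read_attr and list_drives are mine, not from any tool; for SATA disks
the "rev" string is typically a shortened firmware revision):

```python
from pathlib import Path


def read_attr(path):
    """Return the stripped text of a sysfs attribute, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def list_drives(sys_block="/sys/block"):
    """Collect vendor, model and revision strings for each sd* disk.

    sys_block is a parameter only so the function can be pointed at a
    test directory; on a real system the default should be what you want.
    """
    drives = []
    for dev in sorted(Path(sys_block).glob("sd*")):
        info = {"name": dev.name}
        for attr in ("vendor", "model", "rev"):
            info[attr] = read_attr(dev / "device" / attr)
        drives.append(info)
    return drives


if __name__ == "__main__":
    for d in list_drives():
        print(f"{d['name']}: vendor={d['vendor']} "
              f"model={d['model']} rev={d['rev']}")
```

Posting that output along with the dmesg would show which firmware each
drive is running.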

However, I would back that data up to another medium before updating the
firmware, as a Seagate firmware update scrambled the blkids and partition
names on one of my two 1 TB drives.  Neither drive throws errors now, but the
read/write speeds of the 2nd, identical drive are about 1/3 those of the first.
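
A rough way to compare sequential read rates between two drives, as a sketch
only: the function name read_throughput and its defaults are my own, the
device paths in the usage note are placeholders, and the numbers will be
inflated by the page cache unless the data read has not been touched
recently.  Reading a raw disk device needs root.

```python
import time


def read_throughput(path, total_bytes=256 * 1024 * 1024,
                    block_size=1024 * 1024):
    """Read up to total_bytes sequentially from path.

    Returns (bytes_read, megabytes_per_second).  Stops early at end of file.
    """
    done = 0
    start = time.monotonic()
    # buffering=0 avoids Python-level buffering; the kernel page cache
    # is still in play.
    with open(path, "rb", buffering=0) as f:
        while done < total_bytes:
            chunk = f.read(min(block_size, total_bytes - done))
            if not chunk:
                break
            done += len(chunk)
    elapsed = time.monotonic() - start
    rate = done / elapsed / 1e6 if elapsed > 0 else float("inf")
    return done, rate
```

Calling read_throughput("/dev/sdX") for each drive in turn (substitute the
real device names) and comparing the second value gives a quick sanity check
before and after a firmware update.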

Firmware updates come as a bootable CD .iso, and you can
download the CD image from the maker's site.

Cheers, gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Eisenhower!!  Your mimeograph machine upsets my stomach!!
