From: Thomas Fjellstrom <thomas@fjellstrom.ca>
To: "jack_wang" <jack_wang@usish.com>
Cc: "David Milburn" <dmilburn@redhat.com>,
"Andre Tomt" <andre@tomt.net>,
"Linux Kernel List" <linux-kernel@vger.kernel.org>,
"linux-scsi" <linux-scsi@vger.kernel.org>
Subject: Re: mvsas errors in 2.6.36
Date: Sat, 4 Dec 2010 08:44:47 -0700
Message-ID: <201012040844.47337.thomas@fjellstrom.ca>
In-Reply-To: <201012040554.31111.thomas@fjellstrom.ca>

On December 4, 2010, Thomas Fjellstrom wrote:
> On December 4, 2010, jack_wang wrote:
> >
> > Here is what I get with that returning 0 rather than -1 as you requested:
> > [19107.040031] sas: command 0xffff88011c77f9c0, task 0xffff88022ae51600, timed out: BLK_EH_NOT_HANDLED
> > [19107.040062] sas: Enter sas_scsi_recover_host
> > [19107.040072] sas: trying to find task 0xffff88022ae51600
> > [19107.040079] sas: sas_scsi_find_task: aborting task 0xffff88022ae51600
> > [19107.040089] drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=ffff880224040000 task=ffff88022ae51600 slot=ffff880224066680 slot_idx=x4
> > [19107.040101] sas: sas_scsi_find_task: task 0xffff88022ae51600 is aborted
> > [19107.040107] sas: sas_eh_handle_sas_errors: task 0xffff88022ae51600 is aborted
> > [19107.040113] sas: sas_ata_task_done: SAS error 8d
> > [19107.040124] ata21: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> > [19107.040860] ata21: status=0x01 { Error }
> > [19107.040866] ata21: error=0x04 { DriveStatusError }
> > [19107.040886] sas: --- Exit sas_scsi_recover_host
> > [19318.000085] sas: command 0xffff8801250291c0, task 0xffff88018a8e5b80, timed out: BLK_EH_NOT_HANDLED
> > [19318.000125] sas: Enter sas_scsi_recover_host
> > [19318.000135] sas: trying to find task 0xffff88018a8e5b80
> > [19318.000141] sas: sas_scsi_find_task: aborting task 0xffff88018a8e5b80
> > [19318.000152] drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=ffff880224040000 task=ffff88018a8e5b80 slot=ffff8802240666d8 slot_idx=x5
> > [19318.000163] sas: sas_scsi_find_task: task 0xffff88018a8e5b80 is aborted
> > [19318.000169] sas: sas_eh_handle_sas_errors: task 0xffff88018a8e5b80 is aborted
> > [19318.000175] sas: sas_ata_task_done: SAS error 8d
> > [19318.000185] ata24: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> > [19318.000896] ata24: status=0x01 { Error }
> > [19318.000902] ata24: error=0x04 { DriveStatusError }
> > [19318.000922] sas: --- Exit sas_scsi_recover_host
> >
> >
> >
> > [Jack] Are all the drives discovered? Commands are still timing out; maybe the disks need more time to respond, or something
> > is wrong with the driver. I'm not sure.
>
> All drives come up. That last set of logs is something that happens once
> or twice an hour while running. I just rebooted again to see what
> difference the change makes with a fresh startup. So far it seems that
> the controller is running properly in SATA II/3Gbps mode after the reboot.
>
> Just to contrast what the kernel reports in the two scenarios:
> rmmod+modprobe:
> sas: DOING DISCOVERY on port 0, pid:7283
> drivers/scsi/mvsas/mv_sas.c 1388:found dev[0:5] is gone.
> sas: sas_ata_phy_reset: Found ATA device.
> ata15.00: ATA-8: ST31000528AS, CC34, max UDMA/133
> ata15.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
> ata15.00: qc timeout (cmd 0xef)
> [snip mvsas reset]
> sas: sas_ata_phy_reset: Found ATA device.
> sas: sas_to_ata_err: Saw error 2. What to do?
> sas: sas_ata_task_done: SAS error 2
> ata15.00: failed to IDENTIFY (I/O error, err_mask=0x100)
> sas: STUB sas_ata_scr_read
> ata15: limiting SATA link speed to 1.5 Gbps
> ata15.00: limiting speed to UDMA/133:PIO3
>
> fresh boot:
> sas: DOING DISCOVERY on port 0, pid:312
> drivers/scsi/mvsas/mv_sas.c 1388:found dev[0:5] is gone.
> sas: sas_ata_phy_reset: Found ATA device.
> ata9.00: ATA-8: ST31000528AS, CC34, max UDMA/133
> ata9.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
> ata9.00: configured for UDMA/133
>
> This seems to happen on all ports. As does my original issue, though it
> (the original issue) doesn't happen to all ports at the same time, rather
> events seem to randomly happen, to one or more ports at random times.
>
> As you can see, the drives are 1TB Seagate SATA II drives. They are set up
> in an md-raid 5 array. Luckily these events don't bubble any errors up
> the stack and cause a rebuild.

Even after the reboot it still happens, though with that change it /seems/
as if the pause is gone. I can't be sure yet.

[ 6080.020026] sas: command 0xffff880172dfbe80, task 0xffff8800379cbb40, timed out: BLK_EH_NOT_HANDLED
[ 6080.020053] sas: Enter sas_scsi_recover_host
[ 6080.020062] sas: trying to find task 0xffff8800379cbb40
[ 6080.020069] sas: sas_scsi_find_task: aborting task 0xffff8800379cbb40
[ 6080.020079] drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=ffff880222a00000 task=ffff8800379cbb40 slot=ffff880222a26680 slot_idx=x4
[ 6080.020090] sas: sas_scsi_find_task: task 0xffff8800379cbb40 is aborted
[ 6080.020096] sas: sas_eh_handle_sas_errors: task 0xffff8800379cbb40 is aborted
[ 6080.020102] sas: sas_ata_task_done: SAS error 8d
[ 6080.020113] ata9: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
[ 6080.020931] ata9: status=0x01 { Error }
[ 6080.020937] ata9: error=0x04 { DriveStatusError }
[ 6080.021008] sas: --- Exit sas_scsi_recover_host
Hopefully we can figure out what's causing these errors.
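
One way to check whether the timeouts really hit ports at random would be to
tally them from the kernel log. The following is only a minimal sketch,
assuming the log lines look like the ones quoted in this mail. The function
name tally_timeouts, the sample text, and both regular expressions are
illustrative and not part of any kernel tool.

```python
import re
from collections import Counter

# libsas timeout line, e.g.
# "[19107.040031] sas: command 0x..., task 0x..., timed out: BLK_EH_NOT_HANDLED"
TIMEOUT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] sas: command 0x[0-9a-f]+, task 0x[0-9a-f]+, timed out"
)
# libata line that names the port, e.g. "[19107.040860] ata21: status=0x01 { Error }"
PORT_RE = re.compile(r"\[\s*\d+\.\d+\] (ata\d+): status=0x[0-9a-f]+")


def tally_timeouts(lines):
    """Return (timeout timestamps, Counter of affected ports).

    Each timeout is attributed to the first "ataN: status=" line that
    follows it. This is an assumption based on the ordering seen in the
    logs in this thread, not something the kernel guarantees.
    """
    timestamps = []
    ports = Counter()
    pending = False  # True once a timeout is seen and no port matched yet
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            timestamps.append(float(match.group(1)))
            pending = True
            continue
        match = PORT_RE.search(line)
        if match and pending:
            ports[match.group(1)] += 1
            pending = False
    return timestamps, ports


# Sample taken from the log excerpts quoted earlier in this mail.
SAMPLE = """\
[19107.040031] sas: command 0xffff88011c77f9c0, task 0xffff88022ae51600, timed out: BLK_EH_NOT_HANDLED
[19107.040860] ata21: status=0x01 { Error }
[19318.000085] sas: command 0xffff8801250291c0, task 0xffff88018a8e5b80, timed out: BLK_EH_NOT_HANDLED
[19318.000896] ata24: status=0x01 { Error }
"""

stamps, ports = tally_timeouts(SAMPLE.splitlines())
print("timeouts:", len(stamps))
for port, count in sorted(ports.items()):
    print(port, count)
# Seconds between consecutive timeouts, to see how often they occur.
print("gaps:", [round(b - a, 1) for a, b in zip(stamps, stamps[1:])])
```

Feeding it the full dmesg output instead of SAMPLE would show whether any one
port is over-represented and how far apart the events actually are.
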
--
Thomas Fjellstrom
thomas@fjellstrom.ca