From: Tejun Heo <htejun@gmail.com>
To: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
David Greaves <david@dgreaves.com>, David Chinner <dgc@sgi.com>,
xfs@oss.sgi.com,
"'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>,
linux-pm <linux-pm@lists.osdl.org>, Neil Brown <neilb@suse.de>,
Jeff Garzik <jgarzik@pobox.com>
Subject: Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
Date: Thu, 14 Jun 2007 23:21:03 +0900
Message-ID: <46714ECF.8080203@gmail.com>
In-Reply-To: <200706140115.58733.rjw@sisk.pl>
Hello,
Rafael J. Wysocki wrote:
> On Thursday, 14 June 2007 00:12, Linus Torvalds wrote:
>> On Wed, 13 Jun 2007, David Greaves wrote:
>>>> I'm not seeing anything really obvious. The traces would probably look
>>>> better if you enabled CONFIG_FRAME_POINTER, though. That should cut down on
>>>> some of the noise and make the traces a bit more readable.
>>> I can do that...
>> Thanks. That makes a big difference to the readability of the traces.
>>
>> That said, I'm so used to reading even the messy ones that this didn't
>> actually tell me anything new (it made it clear that the SCSI error
>> handler noise was just noise), but for people who aren't quite as used to
>> seeing crap backtraces, your new trace might hopefully put them on the
>> right track.
>>
>> I threw out the parts that didn't look all that relevant, and left the
>> ata_aux/md0_raid5/hibernate traces here for others to look at without all
>> the other noise. Those _seem_ to be the primary suspects in this saga.
>
> Hmm, it looks like both hibernate and ata_aux are waiting for the same
> completion. I wonder who's supposed to complete it.
They're waiting for the commands they issued to complete. ata_aux is
trying to revalidate the SCSI device after libata EH finished waking up
the port, and hibernate is trying to resume the SCSI disk device.
ata_aux is issuing either TEST UNIT READY or START STOP; hibernate is
issuing START STOP.
This can be caused by one of the following:
1. The SCSI EH thread (ATA EH runs off it) for the SCSI device hasn't
finished yet. All commands are deferred while EH is in progress.
2. The request_queue is stuck: somebody forgot to kick the queue at
some point.
3. A command is stuck somewhere in SCSI/ATA land.
#1 doesn't seem to be the case, as all scsi_eh threads seem idle. I'm
looking at the code but can't find anything that could cause #2 or #3.
Also, these code paths are exercised really frequently.
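To make the difference between #1 and #2 concrete, here is a toy model
in Python. It is only a sketch and not kernel code: ToyQueue, its
fields, and the lose_kick switch are all invented for illustration and
do not correspond to real block-layer or SCSI symbols. It shows how a
command deferred during EH can stay stuck even after EH has gone idle,
if nobody kicks the queue afterwards.

```python
from collections import deque


class ToyQueue:
    """Toy stand-in for a request queue feeding one device (hypothetical)."""

    def __init__(self):
        self.pending = deque()   # commands waiting to be dispatched
        self.completed = []      # commands whose waiters have been woken
        self.eh_running = False  # case 1: error handling in progress
        self.lose_kick = False   # case 2: simulate a forgotten queue kick

    def submit(self, cmd):
        self.pending.append(cmd)
        self.run()

    def run(self):
        """Dispatch pending commands unless EH is holding them back."""
        if self.eh_running:
            return  # everything is deferred while EH is in progress
        while self.pending:
            self.completed.append(self.pending.popleft())

    def eh_finish(self):
        self.eh_running = False
        if not self.lose_kick:
            self.run()  # the kick that restarts deferred commands


q = ToyQueue()
q.eh_running = True
q.submit("START STOP")   # deferred: EH is still running
q.lose_kick = True
q.eh_finish()            # EH is idle now, but nobody kicked the queue
stuck = list(q.pending)  # a waiter on this command would sleep forever
q.run()                  # a later kick lets the command through
```

In this model an idle EH thread rules out #1 but says nothing about #2:
the command sits in the pending list until something runs the queue
again.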
I'm also trying to reproduce the problem here with xfs over a RAID-6
array, but haven't been successful yet.
David, do you store the hibernation image on the RAID-6 array? Can you
post the captured kernel log when it locks up?
--
tejun