From: David Chinner <dgc@sgi.com>
To: David Greaves <david@dgreaves.com>
Cc: David Chinner <dgc@sgi.com>, Tejun Heo <htejun@gmail.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
"Rafael J. Wysocki" <rjw@sisk.pl>,
xfs@oss.sgi.com,
"'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>,
linux-pm <linux-pm@lists.osdl.org>, Neil Brown <neilb@suse.de>
Subject: Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
Date: Fri, 8 Jun 2007 08:28:13 +1000
Message-ID: <20070607222813.GG85884050@sgi.com>
In-Reply-To: <46680F5E.6070806@dgreaves.com>
On Thu, Jun 07, 2007 at 02:59:58PM +0100, David Greaves wrote:
> David Chinner wrote:
> >On Thu, Jun 07, 2007 at 11:30:05AM +0100, David Greaves wrote:
> >>Tejun Heo wrote:
> >>>Hello,
> >>>
> >>>David Greaves wrote:
> >>>>Just to be clear. This problem is where my system won't resume after s2d
> >>>>unless I umount my xfs over raid6 filesystem.
> >>>This is really weird. I don't see how xfs mount can affect this at all.
> >>Indeed.
> >>It does :)
> >
> >Ok, so lets determine if it really is XFS.
> Seems like a good next step...
>
> >Does the lockup happen with a
> >different filesystem on the md device? Or if you can't test that, does
> >any other XFS filesystem you have show the same problem?
> It's a rather full 1.2Tb raid6 array - can't reformat it - sorry :)
I suspected as much :/
> I only noticed the problem when I umounted the fs during tests to prevent
> corruption - and it worked. I'm doing a sync each time it hibernates (see
> below) and a couple of paranoia xfs_repairs haven't shown any problems.
sync just guarantees that metadata changes are logged and data is
on disk - it doesn't stop the filesystem from doing anything after
the sync...
> I do have another xfs filesystem on /dev/hdb2 (mentioned when I noticed the
> md/XFS correlation). It doesn't seem to have/cause any problems.
Ok, so it's not an obvious XFS problem...
> >If it is xfs that is causing the problem, what happens if you
> >remount read-only instead of unmounting before shutting down?
> Yes, I'm happy to try these tests.
> nb, the hibernate script is:
> ethtool -s eth0 wol g
> sync
> echo platform > /sys/power/disk
> echo disk > /sys/power/state
>
> So there has always been a sync before any hibernate.
>
>
> cu:~# mount -oremount,ro /huge
.....
> [this works and resumes]
Ok.
> cu:~# mount -oremount,rw /huge
> cu:~# /usr/net/bin/hibernate
> [this works and resumes too !]
Interesting. That means something in the generic remount code
is affecting this.
> cu:~# touch /huge/tst
> cu:~# /usr/net/bin/hibernate
> [but this doesn't even hibernate]
Ok, so a clean inode is sufficient to prevent hibernate from working.
So, what's different between a sync and a remount?
do_remount_sb() does:
	shrink_dcache_sb(sb);
	fsync_super(sb);
of which a sync does neither. sync does what fsync_super() does in a
different sort of way, but does not call sync_blockdev() on each
block device. It looks like those are the two main differences between
sync and remount - remount trims the dentry cache and syncs the blockdev,
sync doesn't.
> > What about freezing the filesystem?
> cu:~# xfs_freeze -f /huge
> cu:~# /usr/net/bin/hibernate
> [but this doesn't even hibernate - same as the 'touch']
I suspect that the frozen filesystem might cause other problems
in the hibernate process. However, while a freeze calls sync_blockdev()
it does not trim the dentry cache.....
So, rather than a remount before hibernate, let's see if we can
remove the dentries some other way, to determine whether removing excess
dentries/inodes from the caches makes a difference. Can you do:
# touch /huge/foo
# sync
# echo 1 > /proc/sys/vm/drop_caches
# hibernate
# touch /huge/bar
# sync
# echo 2 > /proc/sys/vm/drop_caches
# hibernate
# touch /huge/baz
# sync
# echo 3 > /proc/sys/vm/drop_caches
# hibernate
And see if any of those survive the suspend/resume?
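For reference, the value written to drop_caches selects what is dropped:
1 frees the page cache, 2 frees dentries and inodes, 3 frees both. The
three runs above could be scripted roughly as follows. This is only a
sketch: the DRY_RUN, MNT and HIBERNATE variables and the helper names are
illustrative assumptions, not anything from the thread, and by default it
only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the three-step drop_caches experiment.
# Assumptions (hypothetical, for illustration only): MNT is the XFS mount
# point, HIBERNATE is the local hibernate script, and DRY_RUN=1 (the
# default) prints each command instead of executing it.
# Set DRY_RUN=0 and run as root to perform the real test.

MNT=${MNT:-/huge}
HIBERNATE=${HIBERNATE:-/usr/net/bin/hibernate}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: drop_caches mode (1 = page cache, 2 = dentries and inodes, 3 = both)
# $2: name of the file to create on the filesystem under test
drop_caches_test() {
    mode=$1
    name=$2
    run touch "$MNT/$name"
    run sync
    run sh -c "echo $mode > /proc/sys/vm/drop_caches"
    run "$HIBERNATE"
}

drop_caches_test 1 foo
drop_caches_test 2 bar
drop_caches_test 3 baz
```

With DRY_RUN=0 each hibernate call only returns after a successful
resume, so the step at which the script stops making progress is the
interesting result.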
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group