From: Martin Papik <mp6058@gmail.com>
To: Stefan Ring <stefanrin@gmail.com>
Cc: Linux fs XFS <xfs@oss.sgi.com>
Subject: Re: XFS filesystem claims to be mounted after a disconnect
Date: Tue, 03 Jun 2014 13:48:31 +0300
Message-ID: <538DA7FF.4080002@gmail.com>
In-Reply-To: <CAAxjCEzz5n85zAH5HuUQkfxKvzZt5_+cPCj3uzZR7U69H+2tDw@mail.gmail.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
On 06/03/2014 12:55 PM, Stefan Ring wrote:
> From skimming this thread, it seems that there is some hardware
> issue at work here, but nonetheless, I had a very similar situation
> a while ago that was rather puzzling to me at the time, having to
> do with mount namespaces:
> http://oss.sgi.com/pipermail/xfs/2012-August/020910.html
>
Hardware issue or not, IMHO XFS has a problem here. Specifically, I
have not yet seen any other filesystem refuse fsck on a USB disk that
was disconnected and then reconnected. After all, the reconnected
device is a new device. But with XFS the new device (which gets a
different name from the previous one, e.g. sdb instead of sda) can
neither be checked (xfs_repair) nor mounted.
All right, here's a bit of an experiment. I have a hard drive I use
for testing with several small partitions with several filesystems.
After automounting I see this:
$ cat /proc/mounts | grep media/T
/dev/sdf101 /media/T2 ext2 rw,nosuid,nodev,relatime,errors=continue,user_xattr,acl 0 0
/dev/sdf102 /media/T4 btrfs rw,nosuid,nodev,relatime,nospace_cache 0 0
/dev/sdf104 /media/T5 ext4 rw,nosuid,nodev,relatime,data=ordered 0 0
/dev/sdf103 /media/T4_ ext3 rw,nosuid,nodev,relatime,errors=continue,user_xattr,acl,barrier=1,data=ordered 0 0
/dev/sdf100 /media/TEST xfs rw,nosuid,nodev,relatime,attr2,inode64,noquota 0 0
I open one file on ext4 and one on xfs in hexedit, and I see this:
$ lsof | grep TEST
hexedit 24010 martin 3u REG 259,2 4198400 131 /media/TEST/TEST...FILE
hexedit 24011 martin 3u REG 259,6 4198400 12 /media/T5/TEST...FILE
After yanking the USB cable I see this:
$ cat /proc/mounts | grep media/T
--- no output ---
$ lsof | grep TEST
hexedit 24010 martin 3u unknown /TEST...FILE (stat: Input/output error)
hexedit 24011 martin 3u REG 259,6 4198400 12 /TEST...FILE
After reconnecting the device, ext4 mounts but xfs does not.
dmesg contains this (among other [unrelated] things):
[3095915.107117] sd 60:0:0:0: [sdf] 976773167 512-byte logical blocks: (500 GB/465 GiB)
[3095915.108343] sd 60:0:0:0: [sdf] Write Protect is off
[3095915.108360] sd 60:0:0:0: [sdf] Mode Sense: 1c 00 00 00
[3095915.110633] sd 60:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[3095915.207622] sdf: sdf69 sdf100 sdf101 sdf102 sdf103 sdf104 sdf105
[3095915.210148] sd 60:0:0:0: [sdf] Attached SCSI disk
[3095917.969887] XFS (sdf100): Mounting Filesystem
[3095918.209464] XFS (sdf100): Starting recovery (logdev: internal)
[3095918.260450] XFS (sdf100): Ending recovery (logdev: internal)
[3096069.218797] XFS (sdf100): metadata I/O error: block 0xa02007 ("xlog_iodone") error 19 numblks 64
[3096069.218808] XFS (sdf100): xfs_do_force_shutdown(0x2) called from line 1115 of file /build/buildd/linux-lts-raring-3.8.0/fs/xfs/xfs_log.c. Return address = 0xffffffffa07f4fd1
[3096069.218830] XFS (sdf100): Log I/O Error Detected. Shutting down filesystem
[3096069.218833] XFS (sdf100): Please umount the filesystem and rectify the problem(s)
[3096099.254131] XFS (sdf100): xfs_log_force: error 5 returned.
[3096129.289338] XFS (sdf100): xfs_log_force: error 5 returned.
[3096159.324525] XFS (sdf100): xfs_log_force: error 5 returned.
[3096185.296795] sd 61:0:0:0: [sdg] 976773167 512-byte logical blocks: (500 GB/465 GiB)
[3096185.297431] sd 61:0:0:0: [sdg] Write Protect is off
[3096185.297447] sd 61:0:0:0: [sdg] Mode Sense: 1c 00 00 00
[3096185.298022] sd 61:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[3096185.392940] sdg: sdg69 sdg100 sdg101 sdg102 sdg103 sdg104 sdg105
[3096185.395247] sd 61:0:0:0: [sdg] Attached SCSI disk
[3096189.359859] XFS (sdf100): xfs_log_force: error 5 returned.
[3096219.395200] XFS (sdf100): xfs_log_force: error 5 returned.
[3096249.430490] XFS (sdf100): xfs_log_force: error 5 returned.
[3096279.465765] XFS (sdf100): xfs_log_force: error 5 returned.
[3096309.501089] XFS (sdf100): xfs_log_force: error 5 returned.
[3096339.536371] XFS (sdf100): xfs_log_force: error 5 returned.
[3096369.571713] XFS (sdf100): xfs_log_force: error 5 returned.
[3096399.607003] XFS (sdf100): xfs_log_force: error 5 returned.
[3096429.642332] XFS (sdf100): xfs_log_force: error 5 returned.
[3096459.677730] XFS (sdf100): xfs_log_force: error 5 returned.
[3096489.712934] XFS (sdf100): xfs_log_force: error 5 returned.
[3096519.748242] XFS (sdf100): xfs_log_force: error 5 returned.
[3096549.783642] XFS (sdf100): xfs_log_force: error 5 returned.
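For reference, the numeric codes in the log above ("error 19", "error 5")
are kernel errno values. A minimal Python sketch, assuming standard Linux
errno numbering, decodes them; on Linux 19 corresponds to ENODEV and 5 to
EIO, the latter matching the "Input/output error" that lsof reports. The
helper name describe_errno is hypothetical.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a positive errno value."""
    # errno.errorcode maps numbers to names such as "EIO"; unknown
    # numbers fall back to a placeholder instead of raising KeyError.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# The two values seen in the dmesg excerpt (assumption: Linux numbering).
for code in (19, 5):
    name, message = describe_errno(code)
    print(f"error {code}: {name} ({message})")
```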
sdf100 (the old device) and sdg100 (the reconnected device) are
different devices, but xfs_repair still refuses to touch the new one:
# xfs_repair /dev/sdg100
xfs_repair: /dev/sdg100 contains a mounted filesystem
fatal error -- couldn't initialize XFS library
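The mismatch between this message and the mount table can be checked
directly. The sketch below is only an illustration and is not how
xfs_repair decides that a filesystem is mounted; the helper names
parse_mounts and is_listed are hypothetical. It parses /proc/mounts-style
text and reports whether a given device is listed as a mount source.

```python
def parse_mounts(text):
    """Parse /proc/mounts-style text into (device, mountpoint, fstype) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        entries.append((fields[0], fields[1], fields[2]))
    return entries


def is_listed(text, device):
    """Return True if `device` appears as a mount source in the given text."""
    return any(dev == device for dev, _mnt, _fs in parse_mounts(text))
```

Called with the contents of /proc/mounts and "/dev/sdg100", this would be
expected to return False in the situation described here, since the grep
on /proc/mounts printed nothing after the disconnect and xfs did not mount
after the reconnect, even though xfs_repair reports a mounted filesystem.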
Also, please note carefully the difference between the lsof output
for the stale file descriptors on xfs and on ext4. ext4 reports
everything as before, except for the mount path. The xfs entry changes:
the device ID is missing, and the file type changes from REG to unknown.
So, AFAIK and IMHO, this is an issue with XFS. The impact can be the
inability to recover from a device disconnect, since so far I don't
see a good way to figure out which processes are holding the FS.
And besides, having to kill processes in order to mount a filesystem
(xfs) is not a happy state of affairs.
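One possible heuristic, based only on the lsof output shown earlier in
this message: the descriptor on the shut-down xfs filesystem is the one
lsof annotates with "(stat: Input/output error)". The Python sketch below
assumes each lsof entry is on a single line with COMMAND and PID as the
first two whitespace-separated columns; the function name
pids_with_stat_errors is hypothetical, and this is not a general answer
to finding every holder of a dead filesystem.

```python
def pids_with_stat_errors(lsof_text):
    """Return sorted PIDs whose lsof entry carries a '(stat: ...)' error note."""
    pids = set()
    for line in lsof_text.splitlines():
        if "(stat:" not in line:
            continue  # entry was stat()ed successfully, not a candidate
        fields = line.split()
        # Assumption: column 2 is the PID (command names without spaces).
        if len(fields) >= 2 and fields[1].isdigit():
            pids.add(int(fields[1]))
    return sorted(pids)
```

Applied to the post-disconnect lsof output above, this would pick out PID
24010, the hexedit process holding the xfs file.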
Oh yes, there is a hardware issue somewhere, but that is only the
trigger of the XFS behavior, not the cause. The experiment in this
email was done without my USB hub going nuts; I merely did a good
old-fashioned cable yank. And yes, it's not an everyday occurrence, but
a stable and reliable FS should deal with it. At least I think so, don't
you? Sadly I can't help with the coding. I am not familiar with the
code base, and I got a bit lost trying to follow the path of ustat and
/proc/mounts; it has been ages since I touched the kernel sources. But
I can provide information about what happened. :-) I hope it helps us
all have a good FS.
Martin
PS
# xfs_repair /dev/sdg100
xfs_repair: /dev/sdg100 contains a mounted filesystem
fatal error -- couldn't initialize XFS library
# kill 24010
# xfs_repair /dev/sdg100
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
iQIcBAEBCgAGBQJTjaf4AAoJELsEaSRwbVYrJfsP/3z/WI5+dkk2XduRayB2FdOo
S97IMjGHSEbNDNEAKvTsahYwZENE5TizuhyOrvQORl+fsMaedIdn2QYVS6fGAnJR
llhNMQezUKOfwBZtpf3S3FmvFZCoN+q3BTfl2qkmY29c0aivLyxyTCsGlDprHY2Q
pxv3QzsXRtM1FYk6+FFtc9XQYCiLU3KOAq4I7GoGcAMjFRpH8xpuogI2fQQQkFo8
NGxZBmtTq3xbOd/7237tug44Z98iM/uz+tT2xE5g3iJSqcEhaMTJbAkv9d6uBY8G
xLb+yT5M2O6Z6xuZowk3ySFtO+Ia5Row3BhQrpuySdkRNueiJf9KTLMleMNxVqj8
DcNL2hFS6Fyog6g0wVfoUM3txm5wx80w15K2zN2cPnOsdDO11QKUbV9ktFjQ7f++
CLcmxGHtuq7SFM0bMgbcxvA5B9Gs/9tlzXDiN/jag3ixMZYTmOC15ayJevAM3Nru
xN/lPBMiFO+Rr89yZz303M+hRRRD4pQL1VxcyPjs0f6l0tWqb2Xx0wpFBjantUyF
EzIUwgekwMktzLefhTgXumDH/aE9xlY2au+sJtL255uX1XBq4qE4sxrGv73+L9Ti
M+tToCi7sQPoMwzCqJqHHbYWwaisgbq9AFymy2FUFUSqiiV21NMdIZeu7zcDEzuj
pG51qhnHCz5O48cPBpZx
=ecc3
-----END PGP SIGNATURE-----