From: Chris Dunlop <chris@onthe.net.au>
To: Brian Foster <bfoster@redhat.com>
Cc: xfs@oss.sgi.com
Subject: Re: Disk error, then endless loop
Date: Wed, 18 Nov 2015 09:16:09 +1100
Message-ID: <20151117221609.GA3563@onthe.net.au>
In-Reply-To: <20151117203455.GB43800@bfoster.bfoster>

On Tue, Nov 17, 2015 at 03:34:55PM -0500, Brian Foster wrote:
> On Tue, Nov 17, 2015 at 03:21:31PM -0500, Brian Foster wrote:
>> On Wed, Nov 18, 2015 at 06:35:34AM +1100, Chris Dunlop wrote:
>>> On Tue, Nov 17, 2015 at 12:37:24PM -0500, Brian Foster wrote:
>>>> If the device has already dropped and reconnected as a new dev node,
>>>> it's probably harmless at this point to just try to forcibly shut down
>>>> the fs on the old one. Could you try the following?
>>>>
>>>> xfs_io -x -c shutdown <mnt>
>>>
>>> # xfs_io -x -c shutdown /var/lib/ceph/osd/ceph-18
>>> foreign file active, shutdown command is for XFS filesystems only
>>>
>>> # grep ceph-18 /etc/mtab
>>> <<< crickets >>>
>>>
>>> I don't know when the fs disappeared from mtab, it could have been when I
>>> first did the umount I guess, I didn't think to check at the time. But the
>>> umount is still there:
>>>
>>> # date; ps -opid,lstart,time,stat,wchan='WCHAN-xxxxxxxxxxxxxxxxxx',cmd -C umount
>>> Wed Nov 18 06:23:21 AEDT 2015
>>> PID STARTED TIME STAT WCHAN-xxxxxxxxxxxxxxxxxx CMD
>>> 23946 Tue Nov 17 17:30:41 2015 00:00:00 D+ xfs_ail_push_all_sync umount /var/lib/ceph/osd/ceph-18
>>
>> Ah, so it's already been removed from the namespace. Apparently it's
>> stuck at some point after the mount is made inaccessible and before it
>> actually finishes with I/O. I'm not sure we have any other option other
>> than a reset at this point, unfortunately. :/

Yes, I thought this would likely be the case.

> One last thought... it occurred to me that scsi devs have a delete
> option under the /sysfs fs. Does the old/stale device still exist under
> /sys/block/<dev>? If so, perhaps an 'echo 1 >
> /sys/block/<dev>/device/delete' would move things along..?

Unfortunately, no, it's not there.

> Note that I have no idea what effect that will have beyond removing the
> device node (so if it is still accessible now, it probably won't be
> after that command). I just tried it while doing I/O to a test device
> and it looked like it caused an fs shutdown, so it could be worth a try
> as a last resort before a system restart.
>
> Brian
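
For reference, the sysfs check suggested above can be sketched roughly as follows. The helper name and its optional sysfs-root argument are hypothetical (the second argument exists only so the function can be tried against a scratch directory), and sdX stands in for the stale device name:

```shell
# Hypothetical helper: print the path of the SCSI "delete" attribute for a
# block device if it still exists under sysfs, otherwise return non-zero.
# $1 = device name (e.g. sdX), $2 = sysfs root (assumed default: /sys)
stale_dev_delete_path() {
    dev="$1"
    sysroot="${2:-/sys}"
    path="$sysroot/block/$dev/device/delete"
    if [ -e "$path" ]; then
        printf '%s\n' "$path"
    else
        # No such entry: the old device is already gone from sysfs.
        return 1
    fi
}
```

As root, something like `p=$(stale_dev_delete_path sdX) && echo 1 > "$p"` would then request the delete only when the entry is present. As noted in the quoted text, the effect beyond removing the device node is not known, so this is a last resort before a restart.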

Thanks again,
Chris
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs