From: Zdenek Kabelac <zkabelac@redhat.com>
To: dm-devel@redhat.com, Adam Nielsen <a.nielsen@shikadi.net>
Subject: Re: How do you force-close a dm device after a disk failure?
Date: Wed, 16 Sep 2015 15:03:15 +0200
Message-ID: <55F96893.2010201@redhat.com>
In-Reply-To: <20150916223512.40687a03@korath.teln.shikadi.net>

On 16.9.2015 at 14:35, Adam Nielsen wrote:
>>> It always seems to freeze at DM_DEV_SUSPEND. This ioctl never
>>> seems to return.
>>
>> As with any other kernel frozen task - try to capture kernel stack
>> trace. If you properly configured sysrq trigger - easiest is to use:
>>
>> 'echo t >/proc/sysrq-trigger'
>>
>> (Just make sure you have large enough kernel log buffer so lines are
>> not lost) Attach compressed trace - this should likely reveal where
>> it blocks. (I'll try to reproduce myself)
>
> Thanks for the advice. I'm getting a warning that the buffer is
> overflowing. Is there anything in particular you need? Here is
> something that seems relevant:
>
> dmsetup D ffff880394467b98 0 24732 24717 0x00000000
> ffff880394467b98 ffff88040d7a1e90 ffff88027b738a30 ffff88040ba67458
> ffff880394468000 ffff8801eaa7b8dc ffff88027b738a30 00000000ffffffff
> ffff8801eaa7b8e0 ffff880394467bb8 ffffffff81588247 ffff8801eaa7b8d8
> Call Trace:
> [<ffffffff81588247>] schedule+0x37/0x90
> [<ffffffff81588615>] schedule_preempt_disabled+0x15/0x20
> [<ffffffff81589b55>] __mutex_lock_slowpath+0xd5/0x150
> [<ffffffff81589beb>] mutex_lock+0x1b/0x30
> [<ffffffffa0857968>] dm_suspend+0x38/0xf0 [dm_mod]
> [<ffffffffa085d030>] ? table_load+0x370/0x370 [dm_mod]
> [<ffffffffa085d1c0>] dev_suspend+0x190/0x260 [dm_mod]
> [<ffffffffa085d030>] ? table_load+0x370/0x370 [dm_mod]
> [<ffffffffa085da72>] ctl_ioctl+0x232/0x520 [dm_mod]
> [<ffffffffa085dd73>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
> [<ffffffff811f4606>] do_vfs_ioctl+0x2c6/0x4d0
> [<ffffffff811f4891>] SyS_ioctl+0x81/0xa0
> [<ffffffff8158beae>] system_call_fastpath+0x12/0x71
>
> Assuming 24732 is the PID, that's the "dmsetup suspend --noflush
> --nolockfs" one. There are heaps like the one above (from all my
> attempts) with only one like the following, from an unknown command
> line:
>
> dmsetup D ffff88012e2d7a88 0 28744 23911 0x00000004
> ffff88012e2d7a88 ffff88040d74f010 ffff88040398e5e0 ffff88012e2d7b38
> ffff88012e2d8000 ffff8800d9df5080 ffff8800d9df5068 ffffffff00000000
> fffffffe00000001 ffff88012e2d7aa8 ffffffff81588247 0000000000000002
> Call Trace:
> [<ffffffff81588247>] schedule+0x37/0x90
> [<ffffffff8158a885>] rwsem_down_write_failed+0x165/0x370
> [<ffffffff810b2ad6>] ? enqueue_entity+0x266/0xd60
> [<ffffffff812d7aa3>] call_rwsem_down_write_failed+0x13/0x20
> [<ffffffff8158a0d4>] ? down_write+0x24/0x40
> [<ffffffff811e3aee>] grab_super+0x2e/0xb0
> [<ffffffff811e4a20>] get_active_super+0x70/0x90
> [<ffffffff8121ab9d>] freeze_bdev+0x6d/0x100
> [<ffffffffa0854f3b>] __dm_suspend+0xeb/0x230 [dm_mod]
> [<ffffffffa08579fa>] dm_suspend+0xca/0xf0 [dm_mod]
> [<ffffffffa085d1db>] dev_suspend+0x1ab/0x260 [dm_mod]
> [<ffffffffa085d030>] ? table_load+0x370/0x370 [dm_mod]
> [<ffffffffa085da72>] ctl_ioctl+0x232/0x520 [dm_mod]
> [<ffffffffa085dd73>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
> [<ffffffff811f4606>] do_vfs_ioctl+0x2c6/0x4d0
> [<ffffffff811f4891>] SyS_ioctl+0x81/0xa0
> [<ffffffff8158beae>] system_call_fastpath+0x12/0x71

Was this the only dmsetup in your listing (i.e. did you reproduce the
case again)?

I mean, the situation you already reported was hopeless and needed a
reboot: if a flushing suspend is holding some mutexes, no other suspend
call can fix it. You usually get just one chance to fix it the right
way; if you go the wrong way, a reboot is unavoidable.
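
To answer the "only dmsetup in the listing" question without reading the
whole dump by eye, something like the following rough sketch could list
every blocked dmsetup task and the first non-scheduler frame it sleeps in.
It is not part of dmsetup or lvm2; the names blocked_tasks and wait_point
are made up for illustration, and the header layout (comm, state, sp, free,
pid, ppid, flags) is assumed from the traces quoted above, so it may need
adjusting for other kernel versions.

```python
import re

# Task header as printed by the kernel in this thread (layout is an assumption):
#   comm  state  sp  free  pid  ppid  flags
TASK_RE = re.compile(
    r"^(?P<comm>\S.*?)\s+(?P<state>[A-Z])\s+[0-9a-f]{16}\s+\d+\s+"
    r"(?P<pid>\d+)\s+(?P<ppid>\d+)\s+0x[0-9a-f]+$"
)
# Stack frame such as "[<ffffffffa0857968>] dm_suspend+0x38/0xf0 [dm_mod]".
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<guess>\?\s+)?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)
# Optional mail quote markers and dmesg timestamps in front of each line.
PREFIX_RE = re.compile(r"^(?:>\s?)*(?:\[\s*\d+\.\d+\]\s*)?")


def blocked_tasks(log_text, comm="dmsetup"):
    """Return [{'pid': int, 'frames': [symbol, ...]}] for D-state tasks named comm."""
    tasks, current = [], None
    for raw in log_text.splitlines():
        line = PREFIX_RE.sub("", raw, count=1).strip()
        header = TASK_RE.match(line)
        if header:
            # A new task header ends the previous task's trace.
            wanted = header.group("state") == "D" and header.group("comm") == comm
            current = {"pid": int(header.group("pid")), "frames": []} if wanted else None
            if current:
                tasks.append(current)
            continue
        frame = FRAME_RE.search(line)
        # Frames marked "?" are unreliable stack leftovers, so skip them.
        if current is not None and frame and not frame.group("guess"):
            current["frames"].append(frame.group("sym"))
    return tasks


def wait_point(frames):
    """First frame past the scheduler entry, i.e. where the task is sleeping."""
    for sym in frames:
        if not sym.startswith("schedule"):
            return sym
    return None


# Abbreviated from the two traces quoted above.
SAMPLE = """\
dmsetup D ffff880394467b98 0 24732 24717 0x00000000
Call Trace:
[<ffffffff81588247>] schedule+0x37/0x90
[<ffffffff81588615>] schedule_preempt_disabled+0x15/0x20
[<ffffffff81589b55>] __mutex_lock_slowpath+0xd5/0x150
[<ffffffff81589beb>] mutex_lock+0x1b/0x30
[<ffffffffa0857968>] dm_suspend+0x38/0xf0 [dm_mod]
[<ffffffffa085d030>] ? table_load+0x370/0x370 [dm_mod]
dmsetup D ffff88012e2d7a88 0 28744 23911 0x00000004
Call Trace:
[<ffffffff81588247>] schedule+0x37/0x90
[<ffffffff8158a885>] rwsem_down_write_failed+0x165/0x370
[<ffffffff811e3aee>] grab_super+0x2e/0xb0
[<ffffffff8121ab9d>] freeze_bdev+0x6d/0x100
"""

for task in blocked_tasks(SAMPLE):
    # Print the pid, where it sleeps, and whether it went through freeze_bdev.
    print(task["pid"], wait_point(task["frames"]), "freeze_bdev" in task["frames"])
```

Feeding it the full saved log instead of SAMPLE (for example the text of a
file you captured the sysrq output into) would show at a glance how many
dmsetup tasks are stuck behind the mutex and which one got further.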
Zdenek