From: Zdenek Kabelac <zkabelac@redhat.com>
To: dm-devel@redhat.com, Adam Nielsen <a.nielsen@shikadi.net>
Subject: Re: How do you force-close a dm device after a disk failure?
Date: Wed, 16 Sep 2015 15:03:15 +0200
Message-ID: <55F96893.2010201@redhat.com>
In-Reply-To: <20150916223512.40687a03@korath.teln.shikadi.net>
On 16.9.2015 at 14:35, Adam Nielsen wrote:
>>> It always seems to freeze at DM_DEV_SUSPEND. This ioctl never
>>> seems to return.
>>
>> As with any other kernel frozen task - try to capture kernel stack
>> trace. If you properly configured sysrq trigger - easiest is to use:
>>
>> 'echo t >/proc/sysrq-trigger'
>>
>> (Just make sure you have large enough kernel log buffer so lines are
>> not lost) Attach compressed trace - this should likely reveal where
>> it blocks. (I'll try to reproduce myself)
>
> Thanks for the advice. I'm getting a warning that the buffer is
> overflowing. Is there anything in particular you need? Here is
> something that seems relevant:
>
> dmsetup D ffff880394467b98 0 24732 24717 0x00000000
> ffff880394467b98 ffff88040d7a1e90 ffff88027b738a30 ffff88040ba67458
> ffff880394468000 ffff8801eaa7b8dc ffff88027b738a30 00000000ffffffff
> ffff8801eaa7b8e0 ffff880394467bb8 ffffffff81588247 ffff8801eaa7b8d8
> Call Trace:
> [<ffffffff81588247>] schedule+0x37/0x90
> [<ffffffff81588615>] schedule_preempt_disabled+0x15/0x20
> [<ffffffff81589b55>] __mutex_lock_slowpath+0xd5/0x150
> [<ffffffff81589beb>] mutex_lock+0x1b/0x30
> [<ffffffffa0857968>] dm_suspend+0x38/0xf0 [dm_mod]
> [<ffffffffa085d030>] ? table_load+0x370/0x370 [dm_mod]
> [<ffffffffa085d1c0>] dev_suspend+0x190/0x260 [dm_mod]
> [<ffffffffa085d030>] ? table_load+0x370/0x370 [dm_mod]
> [<ffffffffa085da72>] ctl_ioctl+0x232/0x520 [dm_mod]
> [<ffffffffa085dd73>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
> [<ffffffff811f4606>] do_vfs_ioctl+0x2c6/0x4d0
> [<ffffffff811f4891>] SyS_ioctl+0x81/0xa0
> [<ffffffff8158beae>] system_call_fastpath+0x12/0x71
>
> Assuming 24732 is the PID, that's the "dmsetup suspend --noflush
> --nolockfs" one. There are heaps like the one above (from all my
> attempts) with only one like the following, from an unknown command
> line:
>
> dmsetup D ffff88012e2d7a88 0 28744 23911 0x00000004
> ffff88012e2d7a88 ffff88040d74f010 ffff88040398e5e0 ffff88012e2d7b38
> ffff88012e2d8000 ffff8800d9df5080 ffff8800d9df5068 ffffffff00000000
> fffffffe00000001 ffff88012e2d7aa8 ffffffff81588247 0000000000000002
> Call Trace:
> [<ffffffff81588247>] schedule+0x37/0x90
> [<ffffffff8158a885>] rwsem_down_write_failed+0x165/0x370
> [<ffffffff810b2ad6>] ? enqueue_entity+0x266/0xd60
> [<ffffffff812d7aa3>] call_rwsem_down_write_failed+0x13/0x20
> [<ffffffff8158a0d4>] ? down_write+0x24/0x40
> [<ffffffff811e3aee>] grab_super+0x2e/0xb0
> [<ffffffff811e4a20>] get_active_super+0x70/0x90
> [<ffffffff8121ab9d>] freeze_bdev+0x6d/0x100
> [<ffffffffa0854f3b>] __dm_suspend+0xeb/0x230 [dm_mod]
> [<ffffffffa08579fa>] dm_suspend+0xca/0xf0 [dm_mod]
> [<ffffffffa085d1db>] dev_suspend+0x1ab/0x260 [dm_mod]
> [<ffffffffa085d030>] ? table_load+0x370/0x370 [dm_mod]
> [<ffffffffa085da72>] ctl_ioctl+0x232/0x520 [dm_mod]
> [<ffffffffa085dd73>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
> [<ffffffff811f4606>] do_vfs_ioctl+0x2c6/0x4d0
> [<ffffffff811f4891>] SyS_ioctl+0x81/0xa0
> [<ffffffff8158beae>] system_call_fastpath+0x12/0x71
Was this the ONLY dmsetup in your listing (i.e. did you reproduce the case again)?
I ask because the situation you reported earlier was already hopeless and needed
a reboot: if a flushing suspend is holding some mutexes, no other suspend call
can fix it. You usually have just one chance to fix it the right way;
if you go the wrong way, a reboot is unavoidable.
Zdenek
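When the log buffer overflows, it can help to reduce the 'echo t' dump to just the blocked dmsetup tasks and see which one is actually inside the suspend path. Below is a minimal sketch of that filtering, not part of the original exchange. The names blocked_tasks and classify are hypothetical, the regular expressions assume the task-header and frame layout shown in the traces quoted above, and the classification rests on the assumption that a task already in freeze_bdev got past the mutex_lock call in dm_suspend, so it is presumably the one the others are waiting on.

```python
import re

# Task header as printed by 'echo t > /proc/sysrq-trigger', e.g.
#   dmsetup D ffff880394467b98 0 24732 24717 0x00000000
TASK_RE = re.compile(
    r"^(?P<comm>\S+)\s+(?P<state>[A-Z])\s+[0-9a-f]{16}\s+\d+\s+"
    r"(?P<pid>\d+)\s+\d+\s+0x[0-9a-f]+$"
)
# Call-trace frame, e.g.
#   [<ffffffffa0857968>] dm_suspend+0x38/0xf0 [dm_mod]
# A leading '?' marks a frame the unwinder was not sure about.
FRAME_RE = re.compile(r"^\[<[0-9a-f]+>\]\s+(?P<guess>\?\s+)?(?P<sym>[\w.]+)\+0x")
# Optional dmesg timestamp such as "[ 1234.567890] ".
STAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def blocked_tasks(log_text, comm="dmsetup"):
    """Return [(pid, [symbol, ...])] for tasks named `comm` in state D."""
    tasks = []
    frames = None  # frame list of the task currently being collected
    for raw in log_text.splitlines():
        # Drop email quoting ("> ") and any dmesg timestamp.
        line = STAMP_RE.sub("", raw.lstrip("> ").strip())
        head = TASK_RE.match(line)
        if head:
            frames = None
            if head["comm"] == comm and head["state"] == "D":
                frames = []
                tasks.append((int(head["pid"]), frames))
            continue
        frame = FRAME_RE.match(line)
        # Keep only reliable frames; skip the '?' guesses.
        if frame and frames is not None and not frame["guess"]:
            frames.append(frame["sym"])
    return tasks


def classify(frames):
    """Rough guess at where a blocked suspend call is stuck (assumption-based)."""
    if "freeze_bdev" in frames:
        return "inside suspend, stuck freezing the filesystem"
    if "dm_suspend" in frames and "mutex_lock" in frames:
        return "waiting for the mutex taken in dm_suspend"
    return "other"
```

For example, with the output of dmesg saved to a file, `for pid, frames in blocked_tasks(open("trace.txt").read()): print(pid, classify(frames))` prints one line per blocked dmsetup. For the two traces quoted above, 24732 would be reported as waiting for the mutex and 28744 as stuck in the filesystem freeze.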