From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail.kernel.org ([198.145.29.99]:56722 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1726674AbeKLIKO (ORCPT ); Mon, 12 Nov 2018 03:10:14 -0500
Date: Sun, 11 Nov 2018 17:20:15 -0500
From: Sasha Levin
To: Corey Wright
Cc: stable@vger.kernel.org
Subject: Re: [PATCH 3.18 0/1] dm: remove duplicate dm_get_live_table() in __dm_destroy()
Message-ID: <20181111222015.GC2642@sasha-vm>
References: <20181111020654.b09fd2621617745e39ba694c@pobox.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Disposition: inline
In-Reply-To: <20181111020654.b09fd2621617745e39ba694c@pobox.com>
Sender: stable-owner@vger.kernel.org
List-ID:

On Sun, Nov 11, 2018 at 02:06:54AM -0600, Corey Wright wrote:
>The recently released stable version 3.18.125 introduced a deadlock
>because dm_get_live_table() is called twice within __dm_destroy().
>
>The backported commit e1db66a5 "dm: fix AB-BA deadlock in
>__dm_destroy()" doesn't *move* the dm_get_live_table() call from
>before the mutex_lock(), as the original commit 2a708cff does, but
>instead *adds* a new dm_get_live_table() call after the mutex_lock().
>The two dm_get_live_table() calls result in a deadlock:
>
>[ 311.291323] INFO: task cryptsetup:209 blocked for more than 120 seconds.
>[ 311.420925] Not tainted 3.18.125+1-amd64 #1
>[ 311.559858] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>[ 311.651116] cryptsetup D 0000000000000000 0 209 203 0x00000000
>[ 311.732304] ffff88007abfd5f0 0000000000000082 0000000000000001 ffff88007a470390
>[ 311.873420] 00000000000136c0 ffff88007a78bfd8 00000000000136c0 ffff88007abfd5f0
>[ 311.934275] 0000000000000001 ffff88007a78bc70 7fffffffffffffff ffff88007a78bc68
>[ 311.940115] Call Trace:
>[ 311.949956] [] ? dev_suspend+0x260/0x260 [dm_mod]
>[ 312.179891] [] ? schedule_timeout+0x24a/0x2d0
>[ 312.375447] [] ? __wake_up+0x34/0x50
>[ 312.377825] [] ? srcu_readers_seq_idx.isra.8+0x54/0x70
>[ 312.557921] [] ? wait_for_completion+0xb0/0x120
>[ 312.561314] [] ? wake_up_state+0x20/0x20
>[ 312.664457] [] ? __synchronize_srcu+0xd8/0x120
>[ 312.768794] [] ? call_srcu+0x70/0x70
>[ 312.790337] [] ? __dm_destroy+0x107/0x2e0 [dm_mod]
>[ 312.909878] [] ? dev_suspend+0x260/0x260 [dm_mod]
>[ 312.978804] [] ? dev_remove+0xde/0x120 [dm_mod]
>[ 313.082322] [] ? ctl_ioctl+0x203/0x4c0 [dm_mod]
>[ 313.175957] [] ? dm_ctl_ioctl+0x13/0x20 [dm_mod]
>[ 313.301981] [] ? do_vfs_ioctl+0x2d0/0x4a0
>[ 313.384648] [] ? task_work_run+0xbc/0xf0
>[ 313.489669] [] ? SyS_ioctl+0x81/0xa0
>[ 313.510846] [] ? system_call_fastpath+0x16/0x1b
>
>Removing the original dm_get_live_table() call from before the
>mutex_lock() prevents the deadlock.
>
>Thanks for maintaining 3.18!
>
>PS Greg, Was this a subtle attempt to get someone to speak up and say
> "I am using this!" as you requested in the 3.18.125 release
> announcement? ;)

Hm, interesting. It looks like git did the wrong thing here, sorry for
that :(

--
Thanks,
Sasha