From: Nikolay Borisov <n.borisov@siteground.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: Tejun Heo <tj@kernel.org>, Nikolay Borisov <kernel@kyup.com>,
"Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>,
SiteGround Operations <operations@siteground.com>,
Alasdair Kergon <agk@redhat.com>,
device-mapper development <dm-devel@redhat.com>
Subject: Re: corruption causing crash in __queue_work
Date: Sat, 12 Dec 2015 13:49:50 +0200
Message-ID: <566C09DE.9020802@siteground.com>
In-Reply-To: <20151211191400.GA24229@redhat.com>
On 12/11/2015 09:14 PM, Mike Snitzer wrote:
> On Fri, Dec 11 2015 at 1:00pm -0500,
> Nikolay Borisov <n.borisov@siteground.com> wrote:
>
>> On Fri, Dec 11, 2015 at 7:08 PM, Tejun Heo <tj@kernel.org> wrote:
>>>
>>> Hmmm... No idea why it didn't show up in the debug log but the only
>>> way a workqueue could be in the above state is either it got
>>> explicitly destroyed or somehow pwq refcnting is messed up, in both
>>> cases it should have shown up in the log.
>>>
>>> cc'ing dm people. Is there any chance dm-thinp could be using
>>> workqueue after destroying it?
>
> Not that I'm aware of. But never say never?
>
> Plus I'd think we'd see other dm-thinp specific use-after-free issues
> aside from the thin-pool's workqueue.
>
>> In __pool_destroy in dm-thin.c I don't see a call to
>> cancel_delayed_work before destroying the workqueue. Is it possible
>> that this is the cause?
>
> Cannot see how, __pool_destroy()'s destroy_workqueue() would spew a
> bunch of WARN_ONs (and the wq wouldn't be destroyed) if the workqueue
> had outstanding work.
>
> __pool_destroy() is called once the thin-pool's ref count drops to 0
> (see __pool_dec which is called when the thin-pool is removed --
> e.g. with 'dmsetup remove'). This code is only reachable when nothing
> else is using the thin-pool.
>
> And the thin-pool is only able to be removed if all thin devices that
> depend on it have first been removed. And each individual thin device
> waits for all outstanding IO before they can be removed.
Ok, I had a closer look at the code now, and it does seem that when
the pool is suspended, its postsuspend callback cancels the delayed
work and flushes the workqueue. But given that I see those failures on
at least 2-3 servers per day, I doubt it is a hardware/machine-specific
issue. Furthermore, the fact that it is always a dm-thin queue that's
being referenced points in the direction of dm-thin, even though the
code looks solid in that regard.
Regards,
Nikolay