From mboxrd@z Thu Jan 1 00:00:00 1970 From: Heinz Mauelshagen Subject: Re: dm-raid: unsynced raid snapshot creation/deletion causes panic (work queue teardown race) Date: Tue, 12 May 2015 12:38:10 +0200 Message-ID: <5551D812.1050106@redhat.com> References: <1430310789-564-1-git-send-email-heinzm@redhat.com> <20150511203003.GB3804@redhat.com> Reply-To: device-mapper development Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20150511203003.GB3804@redhat.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com To: Mike Snitzer Cc: dm-devel@redhat.com List-Id: dm-devel.ids On 05/11/2015 10:30 PM, Mike Snitzer wrote: > On Wed, Apr 29 2015 at 8:33am -0400, > heinzm@redhat.com wrote: > >> From: Heinz Mauelshagen >> >> This patch avoids oopses caused by a callback racing >> with the destrution of a mapping. >> >> >> Signed-off-by: Heinz Mauelshagen >> Tested-by: Heinz Mauelshagen >> >> >> --- >> drivers/md/dm-raid.c | 18 ++++++++++++++++++ >> 1 file changed, 18 insertions(+) >> >> diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c >> index 88e4c7f..fc4bc83 100644 >> --- a/drivers/md/dm-raid.c >> +++ b/drivers/md/dm-raid.c >> @@ -1614,6 +1614,17 @@ static void raid_presuspend(struct dm_target *ti) >> { >> struct raid_set *rs = ti->private; >> >> + /* >> + * Address a teardown race when calling >> + * raid_(pre|post)suspend followed by raid_dtr: >> + * >> + * MD's call chain md_stop_writes()->md_reap_sync_thread() >> + * causes work to be queued on the md_misc_wq queue >> + * not flushing it, hence the callback can occur after >> + * a potential destruction of the raid set causing an oops. >> + */ >> + rs->md.event_work.func = NULL; >> + >> md_stop_writes(&rs->md); >> } >> >> @@ -1684,6 +1695,13 @@ static void raid_resume(struct dm_target *ti) >> { >> struct raid_set *rs = ti->private; >> >> + /* >> + * See "Address a teardown race" in raid_presuspend() >> + * >> + * Reenable the worker function. >> + */ >> + rs->md.event_work.func = do_table_event; >> + >> set_bit(MD_CHANGE_DEVS, &rs->md.flags); >> if (!rs->bitmap_loaded) { >> bitmap_load(&rs->md); > The subject of this patch is strange (snapshot?) Mike, it was hit it with a test where a snapshot LV was created on a raid LV.. > > Anyway, what flushes the md_misc_wq before you set it to NULL? The work queue is empty when I set it to NULL, because recovery is still ongoing. The OOPs is being triggered because md_reap_sync_thread() being called from __md_stop_writes queues work again in case the worker function is not NULL, hence causing it to be called when the raid_dtr() has teared down the dm mapping already. > Shouldn't that happen as part of presuspend? > > Also, is it really safe to set this to NULL out from underneath MD? Rethinking based on your remark there may still be a race, because setting the worker function to NULL in raid_preresume() may occur after an md worker has queued event work for dm-raid already (e.g. in the cours of md error processng) but md_misc_wq has not yet been run. > Should a helper get exposed from MD with some extra locking? A lightweight solution could be an MD recovery flag (MD_RECOVERY_DONT_CALLBACK ?) set in raid_presuspend() preventing md_reap_sync_thread() from queueing work? I.e.: if (mddev->event_work.func && !test_bit(MD_RECOVERY_DONT_CALLBACK, &mddev->recovery)) queue_work(md_misc_wq, &mddev->event_work); Neil, what do you think about this idea? Heinz > > Mike