From: Lars Ellenberg <lars.ellenberg@linbit.com>
To: dm-devel@redhat.com
Subject: Re: I/O block when removing thin device on the same pool
Date: Fri, 22 Jan 2016 14:38:28 +0100
Message-ID: <20160122133828.GN26774@soda.linbit>
In-Reply-To: <20160121194405.GA22766@redhat.com>

On Thu, Jan 21, 2016 at 02:44:06PM -0500, Mike Snitzer wrote:
> > > On 20.1.2016 at 11:05, Dennis Yang wrote:
> > >>
> > >> Hi,
> > >>
> > >> I have noticed that I/O requests to one thin device are blocked
> > >> while another thin device is being deleted. The root cause is
> > >> that deleting a thin device eventually calls dm_btree_del(),
> > >> which is a slow function and can block. This means the deletion
> > >> process has to hold the pool lock for a very long time
> > >> while this function deletes the whole data mapping subtree.
> > >> Since I/O to the devices on the same pool needs to hold the same pool
> > >> lock to lookup/insert/delete data mappings, all I/O is blocked
> > >> until the delete process finishes.
> > >>
> > >> For now, I have to discard all the mappings of a thin device before
> > >> deleting it to prevent I/O from being blocked. Since these discard
> > >> requests not only take a long time to finish but also hurt the pool's
> > >> I/O throughput, I am still looking for a better solution to this
> > >> issue.
> > >>
> > >> I think the main problem is still the big pool lock in dm-thin, which
> > >> hurts both scalability and performance. I am wondering if there
> > >> is any plan to improve this, or any better fix for the I/O blocking
> > >> problem.
>
> Just so I'm aware: which kernel are you using?
>
> dm_pool_delete_thin_device() takes pmd->root_lock so yes it is very
> coarse-grained; especially when you consider concurrent IO to another
> thin device from the same pool will call interfaces, like
> dm_thin_find_block(), which also take the same pmd->root_lock.

We have seen lvremove of thin snapshots sometimes take minutes,
even ~20 minutes before.
So that means blocking IO to other devices in that pool
(e.g. the typically currently in-use "origin") for minutes.

That was, iirc, with a ~10 TB origin, mostly allocated,
tens of "rotating" snapshots, 64k chunk size,
and a considerable random write change rate on the origin.
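
For a sense of scale, here is a back-of-the-envelope count of the
mappings behind those numbers. It assumes "10 TB" means 10 TiB, "64k"
means 64 KiB chunks, and a fully allocated device; none of that is
stated precisely above, and chunk_count() is just a helper made up for
this note. It is an upper bound on entries per device, not a
measurement, and it does not model sharing between snapshots.

```python
# Rough count of chunk mappings for one fully allocated thin device.
# Assumptions (not from the report): binary units, 100% allocation.
TIB = 2**40
KIB = 2**10


def chunk_count(device_bytes, chunk_bytes):
    """Number of chunk-sized mappings needed to cover the device."""
    return device_bytes // chunk_bytes


mappings = chunk_count(10 * TIB, 64 * KIB)
print(mappings)  # 167772160, i.e. roughly 168 million entries
```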

I'd like to propose a different approach for lvremove of thin devices
(using "made up terms" instead of the correct device mapper vocabulary,
because I'm lazy):

on lvremove of a thin device, take all the locks you need,
even if that implies blocking IO to other devices,
BUT
then don't do all the "delete" right there while holding those
locks, but convert the device into an "i-am-currently-removing-myself"
target, and release all the locks. That should be fast (enough).

Then this "i-am-currently-removing-myself" target would have its .open()
return some error, so it cannot even be opened anymore (or something
with similar effect), and it would start some kernel thread that does
the actual "wipe" and "unref/unmap" from the tree and all that stuff
"in the background", using much finer granular temporary locking for
each processed region.

If that then takes 20 minutes, someone may still care, but at least it
does not block IO to the other active devices in the pool.
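
To make the idea above concrete, here is a toy user-space sketch in
Python. It is not dm-thin code: the Pool class, its dicts, and the
batch parameter are all invented for illustration, a plain
threading.Lock stands in for the pool lock, and a dict stands in for
the mapping tree. It only shows the shape of the proposal: detach
quickly under the lock, refuse further access, then wipe in small
batches and drop the lock between batches.

```python
import threading


class Pool:
    """Toy stand-in for a thin pool: one lock guards all mappings."""

    def __init__(self):
        self.lock = threading.Lock()
        self.devices = {}   # dev_id -> {virtual_block: data_block}
        self.removing = {}  # dev_id -> mappings still awaiting the wipe

    def lookup(self, dev_id, vblock):
        """I/O path: needs the pool lock, fails for removed devices."""
        with self.lock:
            if dev_id not in self.devices:
                raise LookupError(f"device {dev_id!r} is gone or being removed")
            return self.devices[dev_id].get(vblock)

    def remove_device(self, dev_id, batch=1024):
        """Fast part: detach under the lock, then wipe in the background."""
        with self.lock:
            self.removing[dev_id] = self.devices.pop(dev_id)
        worker = threading.Thread(target=self._wipe, args=(dev_id, batch))
        worker.start()
        return worker

    def _wipe(self, dev_id, batch):
        """Slow part: drop at most `batch` mappings per lock hold."""
        while True:
            with self.lock:
                mappings = self.removing[dev_id]
                for _ in range(min(batch, len(mappings))):
                    mappings.popitem()
                if not mappings:
                    del self.removing[dev_id]
                    return
            # Lock is released here, so other I/O can get in between batches.


pool = Pool()
pool.devices["origin"] = {0: 100, 1: 101}
pool.devices["snap"] = {i: i for i in range(5000)}

worker = pool.remove_device("snap", batch=100)
origin_block = pool.lookup("origin", 1)  # not stuck behind the whole wipe
worker.join()
```

The batch size is the knob here: smaller batches mean shorter lock
holds and a longer overall removal, which is the trade-off described
above.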

Or is something like this already going on?

	Lars Ellenberg