From: Zdenek Kabelac <zkabelac@redhat.com>
To: device-mapper development <dm-devel@redhat.com>
Subject: Re: I/O block when removing thin device on the same pool
Date: Fri, 22 Jan 2016 14:58:07 +0100
Message-ID: <56A2356F.2040705@redhat.com>
In-Reply-To: <20160122133828.GN26774@soda.linbit>

On 22.1.2016 at 14:38, Lars Ellenberg wrote:
> On Thu, Jan 21, 2016 at 02:44:06PM -0500, Mike Snitzer wrote:
>>>> On 20.1.2016 at 11:05, Dennis Yang wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have noticed that I/O requests to one thin device are blocked
>>>>> while another thin device is being deleted. The root cause is
>>>>> that deleting a thin device eventually calls dm_btree_del(),
>>>>> which is a slow function and can block. This means that the device
>>>>> deletion process has to hold the pool lock for a very long time
>>>>> while this function deletes the whole data mapping subtree.
>>>>> Since I/O to the devices on the same pool needs to hold the same pool
>>>>> lock to look up/insert/delete data mappings, all I/O is blocked
>>>>> until the delete process finishes.
>>>>>
>>>>> For now, I have to discard all the mappings of a thin device before
>>>>> deleting it to prevent I/O from being blocked. Since these discard
>>>>> requests not only take a long time to finish but also hurt the pool's
>>>>> I/O throughput, I am still looking for a better solution to this
>>>>> issue.
>>>>>
>>>>> I think the main problem is still the big pool lock in dm-thin, which
>>>>> hurts both scalability and performance. I am wondering if there
>>>>> is any plan to improve this, or any better fix for the I/O blocking
>>>>> problem.
>>
>> Just so I'm aware: which kernel are you using?
>>
>> dm_pool_delete_thin_device() takes pmd->root_lock so yes it is very
>> coarse-grained; especially when you consider concurrent IO to another
>> thin device from the same pool will call interfaces, like
>> dm_thin_find_block(), which also take the same pmd->root_lock.
>
> We have seen lvremove of thin snapshots sometimes take minutes,
> even ~20 minutes in the past.
> So that means blocking IO to other devices in that pool
> (e.g. the typically currently in-use "origin") for minutes.
>
> That was, iirc, with ~10 TB origin, mostly allocated,
> tens of "rotating" snapshots, 64k chunk size,
> and considerable random write change rate on the origin.
>
> I'd like to propose a different approach for lvremove of thin devices
> (using "made up terms" instead of the correct device mapper vocabulary,
> because I'm lazy):
> on lvremove of a thin device, take all the locks you need,
> even if that implies blocking IO to other devices,
> BUT
> then don't do all the "delete" right there while holding those
> locks, but convert the device into a "i-am-currently-removing-myself"
> target, and release all the locks. That should be fast (enough).
>
> Then this "i-am-currently-removing-myself" target would have its .open()
> return some error, so it cannot even be opened anymore (or something
> with similar effect), start some kernel thread that does the actual
> "wipe" and "unref/unmap" from the tree and all that stuff "in the
> background", using much finer granular temporary locking for each
> processed region.
>
> If that then takes 20 minutes, someone may still care, but at least it
> does not block IO to the other active devices in the pool.
>
> Or is something like this already going on?
>

Hi

Please always specify the kernel in use.
If possible, retry with the latest officially released one (e.g. 4.4).
There have been a number of improvements in the speed of discard.

Also, you may try using the thin-pool with '--discards nopassdown'
(or even 'ignore') in case TRIM is a major limiting factor
(note that 'ignore' has an impact on free space in the thin-pool).
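
For illustration only, a minimal sketch of how the discard mode could be
checked and switched with the LVM2 tools. The names vg0 and pool0 are made
up, and the run() helper is a hypothetical dry-run wrapper: by default it
only prints the commands, and RUN=1 (as root, with LVM2 installed) would
actually execute them.

```shell
#!/bin/sh
# Assumed names: volume group "vg0", thin-pool "pool0" (adjust to your setup).
POOL="${POOL:-vg0/pool0}"
# Discard mode to apply: nopassdown, passdown, or ignore.
MODE="${MODE:-nopassdown}"

# Dry-run wrapper: print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Show the current discards setting of the thin-pool.
run lvs -o+discards "$POOL"

# Switch the thin-pool to the chosen discard mode.
run lvchange --discards "$MODE" "$POOL"
```

Depending on the LVM2 version, some mode changes may require the pool to
be inactive, so check lvmthin(7) for your release before relying on this.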

Zdenek

