All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Mahoney <jeffm@suse.com>
To: fdmanana@gmail.com
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 5/7] btrfs: explictly delete unused block groups in close_ctree and ro-remount
Date: Wed, 17 Jun 2015 10:36:51 -0400	[thread overview]
Message-ID: <55818603.1080109@suse.com> (raw)
In-Reply-To: <558184E1.5040103@suse.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 6/17/15 10:32 AM, Jeff Mahoney wrote:
> On 6/17/15 9:24 AM, Filipe David Manana wrote:
>> On Wed, Jun 17, 2015 at 11:04 AM, Filipe David Manana 
>> <fdmanana@gmail.com> wrote:
>>> On Mon, Jun 15, 2015 at 2:41 PM,  <jeffm@suse.com> wrote:
>>>> From: Jeff Mahoney <jeffm@suse.com>
>>>> 
>>>> The cleaner thread may already be sleeping by the time we 
>>>> enter close_ctree.  If that's the case, we'll skip removing
>>>> any unused block groups queued for removal, even during a
>>>> normal umount. They'll be cleaned up automatically at next
>>>> mount, but users expect a umount to be a clean
>>>> synchronization point, especially when used on
>>>> thin-provisioned storage with -odiscard.  We also explicitly
>>>> remove unused block groups in the ro-remount path for the
>>>> same reason.
>>>> 
>>>> Signed-off-by: Jeff Mahoney <jeffm@suse.com>
>>> Reviewed-by: Filipe Manana <fdmanana@suse.com> Tested-by:
>>> Filipe Manana <fdmanana@suse.com>
>>> 
>>>> --- fs/btrfs/disk-io.c |  9 +++++++++ fs/btrfs/super.c   |
>>>> 11 +++++++++++ 2 files changed, 20 insertions(+)
>>>> 
>>>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 
>>>> 2ef9a4b..2e47fef 100644 --- a/fs/btrfs/disk-io.c +++ 
>>>> b/fs/btrfs/disk-io.c @@ -3710,6 +3710,15 @@ void 
>>>> close_ctree(struct btrfs_root *root) 
>>>> cancel_work_sync(&fs_info->async_reclaim_work);
>>>> 
>>>> if (!(fs_info->sb->s_flags & MS_RDONLY)) { +               /*
>>>> + * If the cleaner thread is stopped and there are + * block
>>>> groups queued for removal, the deletion will be + * skipped
>>>> when we quit the cleaner thread. +                */ +
>>>> mutex_lock(&root->fs_info->cleaner_mutex); + 
>>>> btrfs_delete_unused_bgs(root->fs_info); + 
>>>> mutex_unlock(&root->fs_info->cleaner_mutex); + ret = 
>>>> btrfs_commit_super(root); if (ret) btrfs_err(fs_info,
>>>> "commit super ret %d", ret); diff --git a/fs/btrfs/super.c 
>>>> b/fs/btrfs/super.c index 9e66f5e..2ccd8d4 100644 --- 
>>>> a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -1539,6
>>>> +1539,17 @@ static int btrfs_remount(struct super_block *sb,
>>>> int *flags, char *data)
>>>> 
>>>> sb->s_flags |= MS_RDONLY;
>>>> 
>>>> +               /* +                * Setting MS_RDONLY will 
>>>> put the cleaner thread to +                * sleep at the
>>>> next loop if it's already active. +                * If it's
>>>> already asleep, we'll leave unused block +                *
>>>> groups on disk until we're mounted read-write again +
>>>> * unless we clean them up here. +                */ + 
>>>> mutex_lock(&root->fs_info->cleaner_mutex); + 
>>>> btrfs_delete_unused_bgs(fs_info); + 
>>>> mutex_unlock(&root->fs_info->cleaner_mutex);
> 
>> So actually, this allows for a deadlock after the patch I sent
>> out last week:
> 
>> https://patchwork.kernel.org/patch/6586811/
> 
>> In that patch delete_unused_bgs is no longer called under the 
>> cleaner_mutex, and making it so, will cause a deadlock with/ru 
>> relocation.
> 
>> Even without that patch, I don't think you need using this mutex
>>  anyway - no 2 tasks running this function can get the same bg
>> from the fs_info->unused_bgs list.
> 
> I was hitting crashes during umount when xfstests would do
> remount-ro and umount in quick succession.  I can go back and
> confirm this, but I believe I was encountering a race between the
> cleaner thread and umount after being set read-only.  It didn't
> trigger all the time.  My hypothesis is that if the cleaner thread
> was running and had a lot of work to do, it could start before set
> MS_RDONLY and still be performing work through the remount and into
> the umount.  Ro-remount would have set MS_RDONLY so we skip the
> btrfs_super_commit in close_ctree and then blow up afterwards.
> 
> Taking the cleaner mutex means we either wait until the cleaner
> thread has finished or we put it to sleep on the next loop before
> it does anything.  In either case, it's safe.  It could just has
> easily been:
> 
> mutex_lock(&root->fs_info->cleaner_mutex); 
> mutex_unlock(&root->fs_info->cleaner_mutex);
> 
> btrfs_delete_unused_bgs(fs_info);
> 
> I think it actually was in a previous version I was testing.  It 
> probably should go back to that version so that we don't end up 
> confusing it with the new mutex you introduced in your patch.

It looks like your:
[PATCH] Btrfs: fix crash on close_ctree() if cleaner starts new
transaction

would also fix this in a more general case.  We can drop taking the
cleaner mutex here.

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJVgYYDAAoJEB57S2MheeWyxIcQAIGwFvP1bL4C8Oa3WyFL/tjE
QITNDQZGYXEKfFqRWdHEAeFJ8kv234xo/tx7Ml0Txd8DFrqzDwXSxv6deLzDiiTT
gymMdBKO3x7TLKZTxnyDXYEUDHM72IMOUS2el3wOOsc61rL1KajFEWySGtAA80pk
bIUH6uosRTXhpXBRe080mc9XPhtfIQyCC8nroJHYazNwT3VWrvbhDaZPM3npNttj
5glsCz7ieseiWKqFCIlYC5yCgpst79U7D8M75Jo0yslvtZNpZOMR3YhvyQakj5hG
p/CFRfbdFGnl3wKv+ACyu7XlewqoA9LwkB5Sbjzd4XbS3n7J4gch043b+BbIl2SA
VghNTTI+tm7KKvMa3fghtedooVYu6DjdhU58VEWOBtHaDiWntSmd0FqzUCqAotxC
fwEmMWCWCWR1E0etRUrnbO1DGltkR38ost7cvXOPXUUdvv3Hy22mTfWW73YwsWXW
kwmG2V+IdgOWHDMxQCnj55/NbYep+/TiVjDPJnOuCn8tD5Tw+zHxtRbXhVcyKpGj
jJXKb9uxDhKpsisz8HQJHf1uMLFJ3qzCqgYxysbc2PqlzylFfY2aefYWSPmrE6y4
6OJW7gTr75PzrGGm7gM1sPiPQLuFNEFBEi0Ak7ad6Q6SAAV339r+h00sg4Q1adVu
2JedYHUeFDUjAGAgft0G
=SQ/a
-----END PGP SIGNATURE-----

  reply	other threads:[~2015-06-17 14:36 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-15 13:45 [PATCH v5] btrfs: fix automatic blockgroup remove + discard Jeff Mahoney
2015-06-15 13:41 ` [PATCH 1/7] btrfs: make btrfs_issue_discard return bytes discarded jeffm
2015-06-17  9:53   ` Filipe David Manana
2015-06-15 13:41 ` [PATCH 2/7] btrfs: btrfs_issue_discard ensure offset/length are aligned to sector boundaries jeffm
2015-06-17  9:55   ` Filipe David Manana
2015-06-15 13:41 ` [PATCH 3/7] btrfs: skip superblocks during discard jeffm
2015-06-17  9:57   ` Filipe David Manana
2015-06-15 13:41 ` [PATCH 4/7] btrfs: iterate over unused chunk space in FITRIM jeffm
2015-06-17 10:03   ` Filipe David Manana
2015-06-15 13:41 ` [PATCH 5/7] btrfs: explictly delete unused block groups in close_ctree and ro-remount jeffm
2015-06-17 10:04   ` Filipe David Manana
2015-06-17 13:24     ` Filipe David Manana
2015-06-17 14:32       ` Jeff Mahoney
2015-06-17 14:36         ` Jeff Mahoney [this message]
2015-06-17 14:37           ` Filipe David Manana
2015-06-15 13:41 ` [PATCH 6/7] btrfs: add missing discards when unpinning extents with -o discard jeffm
2015-06-17 10:07   ` Filipe David Manana
2015-06-15 13:41 ` [PATCH 7/7] btrfs: cleanup, stop casting for extent_map->lookup everywhere jeffm
2015-06-17 10:08   ` Filipe David Manana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55818603.1080109@suse.com \
    --to=jeffm@suse.com \
    --cc=fdmanana@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.