linux-btrfs.vger.kernel.org archive mirror
From: Filipe Manana <fdmanana@gmail.com>
To: Justin Maggard <jmaggard10@gmail.com>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
	Justin Maggard <jmaggard@netgear.com>
Subject: Re: [PATCH v2] btrfs: qgroup: exit the rescan worker during umount
Date: Thu, 8 Oct 2015 10:25:13 +0100
Message-ID: <CAL3q7H6QCVt0=bR_mOFiFzz5uengnbYGbvYch3MW3AZUto8PqQ@mail.gmail.com>
In-Reply-To: <1441242317-16547-1-git-send-email-jmaggard@netgear.com>

On Thu, Sep 3, 2015 at 2:05 AM, Justin Maggard <jmaggard10@gmail.com> wrote:
> v2: Fix stupid error while making formatting changes...
>
> I was hitting a consistent NULL pointer dereference during shutdown that
> showed the trace running through end_workqueue_bio().  I traced it back to
> the endio_meta_workers workqueue being poked after it had already been
> destroyed.
>
> Eventually I found that the root cause was a qgroup rescan that was still
> in progress while we were stopping all the btrfs workers.
>
> Currently we explicitly pause balance and scrub operations in
> close_ctree(), but we do nothing to stop the qgroup rescan.  We should
> probably be doing the same for qgroup rescan, but that's a much larger
> change.  This small change is good enough to allow me to unmount without
> crashing.
>
> Signed-off-by: Justin Maggard <jmaggard@netgear.com>
> ---
>  fs/btrfs/qgroup.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index d904ee1..5bfcee9 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -2278,7 +2278,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
>                 goto out;
>
>         err = 0;
> -       while (!err) {
> +       while (!err && !btrfs_fs_closing(fs_info)) {
>                 trans = btrfs_start_transaction(fs_info->fs_root, 0);
>                 if (IS_ERR(trans)) {
>                         err = PTR_ERR(trans);
> @@ -2301,7 +2301,8 @@ out:
>         btrfs_free_path(path);
>
>         mutex_lock(&fs_info->qgroup_rescan_lock);
> -       fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
> +       if (!btrfs_fs_closing(fs_info))
> +               fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
>
>         if (err > 0 &&
>             fs_info->qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT) {
> @@ -2330,7 +2331,9 @@ out:
>         }
>         btrfs_end_transaction(trans, fs_info->quota_root);
>
> -       if (err >= 0) {
> +       if (btrfs_fs_closing(fs_info)) {
> +               btrfs_info(fs_info, "qgroup scan paused");
> +       } else if (err >= 0) {
>                 btrfs_info(fs_info, "qgroup scan completed%s",
>                         err > 0 ? " (inconsistency flag cleared)" : "");
>         } else {

Justin, this is still racy (though much less so than before).

Once we leave the loop because btrfs_fs_closing(fs_info) is true, we
still start a transaction and do a write operation on the quota btree.
While or before we do that write, close_ctree() might have completed,
or be at a point where the write results in another NULL pointer
dereference, an access through a dangling pointer, or a leaked
transaction that never gets committed (because close_ctree() already
stopped the transaction kthread), and so on.

So in addition to what you did, you need to call
btrfs_qgroup_wait_for_completion(fs_info) in close_ctree() (disk-io.c),
right after fs_info->closing is set to 1.
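
To illustrate the ordering suggested above, here is a minimal userspace
sketch using pthreads. It is not kernel code: struct fs_model,
rescan_worker(), wait_for_rescan(), close_model() and simulate_umount()
are hypothetical names invented for this model. It only shows the
sequence "set closing, wait for the worker to finish, then tear down",
with the worker still doing one final write after it leaves its loop.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the few fs_info fields involved. */
struct fs_model {
	atomic_bool closing;        /* models fs_info->closing */
	pthread_mutex_t lock;       /* models the rescan lock */
	pthread_cond_t done_cond;   /* signalled when the worker finishes */
	bool rescan_running;        /* set before the worker is queued */
	bool resources_alive;       /* models workqueues, transaction kthread */
	int writes_after_teardown;  /* must stay 0 if the ordering is right */
	long steps;                 /* touched only by the worker */
};

static void *rescan_worker(void *arg)
{
	struct fs_model *fs = arg;

	/* Main loop: bail out early once shutdown has started. */
	while (fs->steps < 1000000 && !atomic_load(&fs->closing))
		fs->steps++;

	/* Final status update: this still runs after the loop exits. */
	pthread_mutex_lock(&fs->lock);
	if (!fs->resources_alive)
		fs->writes_after_teardown++;
	fs->rescan_running = false;
	pthread_cond_broadcast(&fs->done_cond);
	pthread_mutex_unlock(&fs->lock);
	return NULL;
}

/* Block until the worker has finished its final update. */
static void wait_for_rescan(struct fs_model *fs)
{
	pthread_mutex_lock(&fs->lock);
	while (fs->rescan_running)
		pthread_cond_wait(&fs->done_cond, &fs->lock);
	pthread_mutex_unlock(&fs->lock);
}

/* Shutdown path: set closing, wait for the worker, only then tear down. */
static void close_model(struct fs_model *fs)
{
	atomic_store(&fs->closing, true);
	wait_for_rescan(fs);

	pthread_mutex_lock(&fs->lock);
	assert(!fs->rescan_running);
	fs->resources_alive = false;
	pthread_mutex_unlock(&fs->lock);
}

/* Returns the number of worker writes seen after teardown, or -1 on error. */
int simulate_umount(void)
{
	struct fs_model fs = { .rescan_running = true, .resources_alive = true };
	pthread_t worker;
	int late_writes;

	atomic_init(&fs.closing, false);
	pthread_mutex_init(&fs.lock, NULL);
	pthread_cond_init(&fs.done_cond, NULL);

	if (pthread_create(&worker, NULL, rescan_worker, &fs) != 0)
		return -1;

	close_model(&fs);
	pthread_join(worker, NULL);

	late_writes = fs.writes_after_teardown;
	pthread_cond_destroy(&fs.done_cond);
	pthread_mutex_destroy(&fs.lock);
	return late_writes;
}
```

In this model simulate_umount() returns 0 because teardown cannot begin
until the worker has signalled completion. Dropping the wait_for_rescan()
call from close_model() allows the worker's final write to land after
teardown, which is the window described above.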

Otherwise it looks good.
Thanks.


> --
> 2.5.1
>



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."

