All of lore.kernel.org
 help / color / mirror / Atom feed
From: Omar Sandoval <osandov@osandov.com>
To: Nikolay Borisov <nborisov@suse.com>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com,
	Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH] Btrfs: fix workqueue deadlock on dependent filesystems
Date: Wed, 7 Aug 2019 10:08:57 -0700	[thread overview]
Message-ID: <20190807170857.GA18948@vader> (raw)
In-Reply-To: <70f6b4fa-26b1-9225-7509-aa89bb7e067c@suse.com>

On Wed, Aug 07, 2019 at 10:17:26AM +0300, Nikolay Borisov wrote:
> 
> 
> On 6.08.19 г. 20:34 ч., Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> > 
> > We hit a the following very strange deadlock on a system with Btrfs on a
> > loop device backed by another Btrfs filesystem:
> > 
> > 1. The top (loop device) filesystem queues an async_cow work item from
> >    cow_file_range_async(). We'll call this work X.
> > 2. Worker thread A starts work X (normal_work_helper()).
> > 3. Worker thread A executes the ordered work for the top filesystem
> >    (run_ordered_work()).
> > 4. Worker thread A finishes the ordered work for work X and frees X
> >    (work->ordered_free()).
> > 5. Worker thread A executes another ordered work and gets blocked on I/O
> >    to the bottom filesystem (still in run_ordered_work()).
> > 6. Meanwhile, the bottom filesystem allocates and queues an async_cow
> >    work item which happens to be the recently-freed X.
> > 7. The workqueue code sees that X is already being executed by worker
> >    thread A, so it schedules X to be executed _after_ worker thread A
> >    finishes (see the find_worker_executing_work() call in
> >    process_one_work()).
> 
> Isn't the bigger problem  that a single run_ordered_work could
> potentially run the ordered work for more than one normal work? E.g.
> what if btrfs' code is reworked such that run_ordered_work executes
> ordered_func for just one work item (the one which called the function
> in the first place) ? Wouldn't that also resolve the issue? Correct me
> if I'm wrong but it seems silly to have one work item outlive
> ordered_free which is what currently happens, right?

We can't always run the ordered work for the normal work because then it
wouldn't be ordered :) If work item N completes before item N-1, then we
can't run the ordered work for N yet. Then, when N-1 completes, we need
to do the ordered work for N-1 _and_ N, which is how we get in this
situation.

  reply	other threads:[~2019-08-07 17:09 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-06 17:34 [PATCH] Btrfs: fix workqueue deadlock on dependent filesystems Omar Sandoval
2019-08-07  7:17 ` Nikolay Borisov
2019-08-07 17:08   ` Omar Sandoval [this message]
2019-08-12 11:38 ` Filipe Manana
2019-08-12 18:48   ` Omar Sandoval
2019-08-12 18:53     ` Filipe Manana
2019-08-19 16:37 ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190807170857.GA18948@vader \
    --to=osandov@osandov.com \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=nborisov@suse.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.