public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Tejun Heo <tj@kernel.org>
Cc: Eric Sandeen <sandeen@redhat.com>,
	Eric Sandeen <sandeen@sandeen.net>, xfs-oss <xfs@oss.sgi.com>
Subject: Re: [PATCH 2/2] xfs: mark the xfs-alloc workqueue as high priority
Date: Mon, 12 Jan 2015 14:30:15 +1100	[thread overview]
Message-ID: <20150112033015.GB29484@destitution> (raw)
In-Reply-To: <20150110174133.GC25319@htj.dyndns.org>

On Sat, Jan 10, 2015 at 12:41:33PM -0500, Tejun Heo wrote:
> Hello, Dave.
> 
> On Sat, Jan 10, 2015 at 10:28:15AM +1100, Dave Chinner wrote:
> > process A			kworker (1..N)
> > ilock(excl)
> > alloc
> >   queue work(allocwq)
> >     (work queued as no kworker
> >      threads available)
> >      				execute work from xfsbuf-wq
> >      				xfs_end_io
> > 				  ilock(excl)
> > 				    (blocks waiting on queued work)
> > 
> > No new kworkers are started, so the queue never makes progress,
> > we deadlock.
> 
> But allocwq is a separate workqueue from xfsbuf-wq and should have its
> own rescuer.  The work item queued by process on A is guaranteed to
> make forward progress no matter what work items on xfsbuf-wq are
> doing.  The deadlock as depicted above cannot happen.  A workqueue
> with WQ_MEM_RECLAIM can deadlock iff an executing work item on the
> workqueue deadlocks.

Eric will have to confirm, but I recall asking Eric to check the
recuer threads and that they were idle...

....

> > before the end-io processing of the xfsbuf-wq and unwritten-wq
> > because of this lock inversion, just like we we always want the
> > xfsbufd to run before the unwritten-wq because unwritten extent
> > conversion may block waiting for metadata buffer IO to complete, and
> > we always want the the xfslog-wq works to run before all of them
> > because metadata buffer IO may get blocked waiting for buffers
> > pinned by the log to be unpinned for log Io completion...
> 
> I'm not really following your logic here.  Are you saying that xfs is
> trying to work around cyclic dependency by manipulating execution
> order of specific work items?

No, it's not cyclic. They are different dependencies.

Data IO completion can take the XFS inode i_lock. i.e. in the
mp->m_data_workqueue and the mp->m_unwritten_workqueue.

mp->m_data_workqueue has no other dependencies.

mp->m_unwritten_workqueue reads buffers, so is dependent on
mp->m_buf_workqueue for buffer IO completion.

mp->m_unwritten_workqueue can cause btree splits, which can defer
work to the xfs_alloc_wq.

xfs_alloc_wq reads buffers, so it dependent on the
mp->m_buf_workqueue for buffer IO completion.

So lock/wq ordering dependencies are:

m_data_workqueue -> i_lock
m_unwritten_workqueue -> i_lock -> xfs_alloc_wq -> m_buf_workqueue
syscall -> i_lock -> xfs_alloc_wq -> m_buf_workqueue

The issue we see is:

process A:	write(2) -> i_lock -> xfs_allow_wq
kworkers:	m_data_workqueue -> i_lock
		(blocked on process A work completion)

Queued work:  m_data_workqueue work, xfs_allow_wq work

Queued work does not appear to be dispatched for some reason, wq
concurrency depth does not appear to be exhausted and rescuer
threads do not appear to be active. Something has gone wrong for
the queued work to be stalled like this.

> There no reason to play with priorities to avoid deadlock.  That
> doesn't make any sense to me.  Priority or chained queueing, which is

Prioritised work queuing is what I suggested, not modifying kworker
scheduler priorities...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2015-01-13 20:27 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-09 18:08 [PATCH 0/2] xfs: make xfs allocation workqueue per-mount, and high priority Eric Sandeen
2015-01-09 18:10 ` [PATCH 1/2] xfs: make global xfsalloc workqueue per-mount Eric Sandeen
2015-01-12 15:35   ` Brian Foster
2015-01-09 18:12 ` [PATCH 2/2] xfs: mark the xfs-alloc workqueue as high priority Eric Sandeen
2015-01-09 18:23   ` Tejun Heo
2015-01-09 20:36     ` Eric Sandeen
2015-01-10 19:28       ` Tejun Heo
2015-01-11  0:04         ` Eric Sandeen
2015-01-11  6:33           ` Tejun Heo
2015-01-12 20:09         ` Eric Sandeen
2015-01-12 22:53           ` Tejun Heo
2015-01-12 23:12             ` Eric Sandeen
2015-01-12 23:37               ` Tejun Heo
2015-01-13 19:08                 ` Eric Sandeen
2015-01-13 20:19                   ` Tejun Heo
2015-01-13 20:29                     ` Eric Sandeen
2015-01-13 20:46                       ` Tejun Heo
2015-01-13 22:58                         ` Eric Sandeen
2015-01-13 23:35                           ` [PATCH wq/for-3.19] workqueue: fix subtle pool management issue which can stall whole worker_pool Tejun Heo
2015-01-16 19:32                             ` [PATCH workqueue wq/for-3.19-fixes] " Tejun Heo
2015-01-19  2:15                               ` Lai Jiangshan
2015-01-09 23:28     ` [PATCH 2/2] xfs: mark the xfs-alloc workqueue as high priority Dave Chinner
2015-01-10 17:41       ` Tejun Heo
2015-01-12  3:30         ` Dave Chinner [this message]
2015-01-13 20:50           ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150112033015.GB29484@destitution \
    --to=david@fromorbit.com \
    --cc=sandeen@redhat.com \
    --cc=sandeen@sandeen.net \
    --cc=tj@kernel.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox