From: Dave Chinner <david@fromorbit.com>
To: Ming Lei <ming.lei@canonical.com>
Cc: Michal Hocko <mhocko@suse.cz>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
xfs@oss.sgi.com
Subject: Re: [regression 4.2-rc3] loop: xfstests xfs/073 deadlocked in low memory conditions
Date: Tue, 21 Jul 2015 15:46:12 +1000
Message-ID: <20150721054612.GZ7943@dastard>
In-Reply-To: <CACVXFVMyW8SmuxPoZznemwQTnvXZLbxyoi9iYh7wK-3BdW=jbQ@mail.gmail.com>
On Tue, Jul 21, 2015 at 12:05:56AM -0400, Ming Lei wrote:
> On Mon, Jul 20, 2015 at 9:59 PM, Dave Chinner <david@fromorbit.com> wrote:
> > Hi Ming,
> >
> > With the recent merge of the loop device changes, I'm now seeing
> > XFS deadlock on my single CPU, 1GB RAM VM running xfs/073.
> >
> > The deadlock is as follows:
> >
> > kloopd1: loop_queue_read_work
> >   xfs_file_read_iter
> >   lock XFS inode XFS_IOLOCK_SHARED (on image file)
> >   page cache read (GFP_KERNEL)
> >   radix tree alloc
> >   memory reclaim
> >   reclaim XFS inodes
> >   log force to unpin inodes
> >   <wait for log IO completion>
> >
> > xfs-cil/loop1: <does log force IO work>
> >   xlog_cil_push
> >   xlog_write
> >   <loop issuing log writes>
> >   xlog_state_get_iclog_space()
> >   <blocks due to all log buffers under write io>
> >   <waits for IO completion>
> >
> > kloopd1: loop_queue_write_work
> >   xfs_file_write_iter
> >   lock XFS inode XFS_IOLOCK_EXCL (on image file)
> >   <wait for inode to be unlocked>
> >
> > [The full stack traces are below].
> >
> > i.e. the kloopd, with its split read and write work queues, has
> > introduced a dependency through memory reclaim. i.e. that writes
> > need to be able to progress for reads to make progress.
>
> This kind of change just makes READ vs READ OR WRITE submitted
> to fs concurrently, and the use case should have been simulated from
> user space on one regular XFS file too?
Assuming the "regular XFS file" is on a normal block device (i.e.
not a loop device) then this will not deadlock, as there is no
dependency on VFS-level locking for log writes.

i.e. the normal userspace IO path is:
userspace read
 vfs_read
  xfs_read
   page cache alloc (GFP_KERNEL)
    direct reclaim
     xfs_inode reclaim
      log force
       CIL push
        <workqueue>
         xlog_write
          submit_bio
           -> hardware.
And then the log IO completes, and everything continues onward.
What the loop device used to do:
userspace read
 vfs_read
  xfs_read
   page cache alloc (GFP_KERNEL)
    submit_bio
     loop device
      splice_read (on image file)
       xfs_splice_read
        page cache alloc (GFP_NOFS)
         direct reclaim
          <skip filesystem reclaim>
           submit_bio
            -> hardware.
And when the read IO completes, everything moves onwards.
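
The "<skip filesystem reclaim>" step in the old path comes down to a single
allocation flag. The following is a minimal userspace sketch of that idea, not
kernel code: the bit values and the helper names may_reclaim_fs() and
constrain() are made up for illustration. It only assumes that GFP_KERNEL
permits reclaim to call back into the filesystem, that GFP_NOFS does not, and
that an allocation can be restricted to whatever the mapping's mask allows.

```python
from enum import IntFlag


class Gfp(IntFlag):
    # Illustrative bit values only; these are NOT the kernel's constants.
    WAIT = 0x1  # allocation may sleep and enter direct reclaim
    IO = 0x2    # reclaim may start block IO
    FS = 0x4    # reclaim may call back into filesystem code


GFP_KERNEL = Gfp.WAIT | Gfp.IO | Gfp.FS
GFP_NOFS = Gfp.WAIT | Gfp.IO


def may_reclaim_fs(gfp):
    """True if direct reclaim under this mask may recurse into the filesystem."""
    return bool(gfp & Gfp.FS)


def constrain(requested, mapping_mask):
    """Hypothetical helper: limit a requested mask to what the mapping allows."""
    return requested & mapping_mask


# The image file's mapping only allows GFP_NOFS, so a page cache allocation
# that honours it cannot end up in XFS inode reclaim.
image_file_mapping_mask = GFP_NOFS
effective = constrain(GFP_KERNEL, image_file_mapping_mask)
```

With the mapping's mask applied, a GFP_KERNEL request degrades to GFP_NOFS and
reclaim stays out of the filesystem; without it, the allocation is free to
recurse into inode reclaim.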
What the loop device now does:
userspace read
 vfs_read
  xfs_read
   page cache alloc (GFP_KERNEL)
    submit_bio
     loop device
      <workqueue>
       vfs_read (on image file)
        xfs_read
         page cache alloc (GFP_KERNEL)
          direct reclaim
           xfs_inode reclaim
            log force
             CIL push
              <workqueue>
               xlog_write
                submit_bio
                 loop device
                  <workqueue>
                   vfs_write (on image file)
                    xfs_write
                     <deadlock on image file lock>
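
One way to see why the new path hangs while the normal path does not is to
write each trace down as a wait-for graph and look for a cycle. This is only a
sketch: the node labels are shorthand for the steps in the traces above, and
find_cycle() is a generic depth-first search written for this example, not
anything taken from the kernel.

```python
def find_cycle(waits_for):
    """Return one cycle in a wait-for graph as a list of nodes, or None."""
    visiting = []   # nodes on the current DFS path, in order
    done = set()    # nodes already proven to be cycle-free

    def visit(node):
        if node in visiting:
            # Back edge: the path from the first visit of node is a cycle.
            return visiting[visiting.index(node):] + [node]
        if node in done:
            return None
        visiting.append(node)
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in waits_for:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Edges read "X cannot make progress until Y does".
loop_now = {
    "loop read worker": ["log force"],     # holds IOLOCK_SHARED, stuck in reclaim
    "log force": ["CIL push"],
    "CIL push": ["log write IO"],          # needs log buffers, i.e. IO completion
    "log write IO": ["loop write worker"],
    "loop write worker": ["loop read worker"],  # wants IOLOCK_EXCL
}

normal_path = {
    "userspace read": ["log force"],
    "log force": ["CIL push"],
    "CIL push": ["log write IO"],
    "log write IO": [],                    # goes to hardware and completes
}

deadlock = find_cycle(loop_now)
no_deadlock = find_cycle(normal_path)
```

The loop device graph closes on itself through the image file's IO lock; the
normal block device graph terminates at the hardware, so nothing ever waits on
the original reader.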
> > The problem, fundamentally, is that mpage_readpages() does a
> > GFP_KERNEL allocation, rather than paying attention to the inode's
> > mapping gfp mask, which is set to GFP_NOFS.
>
> That looks like the root cause, and I guess the issue is just triggered
> after commit aa4d86163e4 (block: loop: switch to VFS ITER_BVEC)
> which changes splice to bvec iterator.
Yup - you are the unfortunate person who has wandered into the
minefield I'd been telling people about for quite some time. :(
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com