From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from userp1040.oracle.com ([156.151.31.81]:48483 "EHLO userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751078AbdGZXHK (ORCPT ); Wed, 26 Jul 2017 19:07:10 -0400 Reply-To: ashish.samant@oracle.com Subject: Re: [PATCH V2] fuse: Dont call set_page_dirty_lock() for ITER_BVEC pages for async_dio References: <1499912818-16882-1-git-send-email-ashish.samant@oracle.com> To: linux-fsdevel@vger.kernel.org, fuse-devel@lists.sourceforge.net Cc: miklos@szeredi.hu From: Ashish Samant Message-ID: <5979208F.7080203@oracle.com> Date: Wed, 26 Jul 2017 16:06:55 -0700 MIME-Version: 1.0 In-Reply-To: <1499912818-16882-1-git-send-email-ashish.samant@oracle.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-fsdevel-owner@vger.kernel.org List-ID: Hi Miklos, Is this version ok? Thanks, Ashish On 07/12/2017 07:26 PM, Ashish Samant wrote: > 'Commit 8fba54aebbdf ("fuse: direct-io: don't dirty ITER_BVEC pages")' > fixes the ITER_BVEC page deadlock for direct io in fuse by checking in > fuse_direct_io(), whether the page is a bvec page or not, before locking > it. However, this check is missed when the "async_dio" mount option is > enabled. In this case, set_page_dirty_lock() is called from the req->end > callback in request_end(), when the fuse thread is returning from > userspace to respond to the read request. This will cause the same > deadlock because the bvec condition is not checked in this path. > > Here is the stack of the deadlocked thread, while returning from userspace: > > [13706.656686] INFO: task glusterfs:3006 blocked for more than 120 seconds. > [13706.657808] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [13706.658788] glusterfs D ffffffff816c80f0 0 3006 1 > 0x00000080 > [13706.658797] ffff8800d6713a58 0000000000000086 ffff8800d9ad7000 > ffff8800d9ad5400 > [13706.658799] ffff88011ffd5cc0 ffff8800d6710008 ffff88011fd176c0 > 7fffffffffffffff > [13706.658801] 0000000000000002 ffffffff816c80f0 ffff8800d6713a78 > ffffffff816c790e > [13706.658803] Call Trace: > [13706.658809] [] ? bit_wait_io_timeout+0x80/0x80 > [13706.658811] [] schedule+0x3e/0x90 > [13706.658813] [] schedule_timeout+0x1b5/0x210 > [13706.658816] [] ? gup_pud_range+0x1db/0x1f0 > [13706.658817] [] ? kvm_clock_read+0x1e/0x20 > [13706.658819] [] ? kvm_clock_get_cycles+0x9/0x10 > [13706.658822] [] ? ktime_get+0x52/0xc0 > [13706.658824] [] io_schedule_timeout+0xa4/0x110 > [13706.658826] [] bit_wait_io+0x36/0x50 > [13706.658828] [] __wait_on_bit_lock+0x76/0xb0 > [13706.658831] [] ? lock_request+0x46/0x70 [fuse] > [13706.658834] [] __lock_page+0xaa/0xb0 > [13706.658836] [] ? wake_atomic_t_function+0x40/0x40 > [13706.658838] [] set_page_dirty_lock+0x58/0x60 > [13706.658841] [] fuse_release_user_pages+0x58/0x70 [fuse] > [13706.658844] [] ? fuse_aio_complete+0x190/0x190 [fuse] > [13706.658847] [] fuse_aio_complete_req+0x29/0x90 [fuse] > [13706.658849] [] request_end+0xd9/0x190 [fuse] > [13706.658852] [] fuse_dev_do_write+0x336/0x490 [fuse] > [13706.658854] [] fuse_dev_write+0x6e/0xa0 [fuse] > [13706.658857] [] ? security_file_permission+0x23/0x90 > [13706.658859] [] do_iter_readv_writev+0x60/0x90 > [13706.658862] [] ? fuse_dev_splice_write+0x350/0x350 > [fuse] > [13706.658863] [] do_readv_writev+0x171/0x1f0 > [13706.658866] [] ? try_to_wake_up+0x210/0x210 > [13706.658868] [] vfs_writev+0x41/0x50 > [13706.658870] [] SyS_writev+0x56/0xf0 > [13706.658872] [] ? syscall_trace_leave+0xf1/0x160 > [13706.658874] [] system_call_fastpath+0x12/0x71 > > Fix this by making should_dirty a fuse_io_priv parametero that it can be > checked in fuse_aio_complete_req(). > > Reported-by: Tiger Yang > Signed-off-by: Ashish Samant > --- > fs/fuse/file.c | 8 +++++--- > fs/fuse/fuse_i.h | 2 ++ > 2 files changed, 7 insertions(+), 3 deletions(-) > > diff --git a/fs/fuse/file.c b/fs/fuse/file.c > index 3ee4fdc..e865ac7 100644 > --- a/fs/fuse/file.c > +++ b/fs/fuse/file.c > @@ -608,8 +608,9 @@ static void fuse_aio_complete_req(struct fuse_conn *fc, struct fuse_req *req) > { > struct fuse_io_priv *io = req->io; > ssize_t pos = -1; > + bool should_dirty = io->should_dirty && !io->write; > > - fuse_release_user_pages(req, !io->write); > + fuse_release_user_pages(req, should_dirty); > > if (io->write) { > if (req->misc.write.in.size != req->misc.write.out.size) > @@ -1316,7 +1317,6 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, > loff_t *ppos, int flags) > { > int write = flags & FUSE_DIO_WRITE; > - bool should_dirty = !write && iter_is_iovec(iter); > int cuse = flags & FUSE_DIO_CUSE; > struct file *file = io->file; > struct inode *inode = file->f_mapping->host; > @@ -1346,6 +1346,7 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, > inode_unlock(inode); > } > > + io->should_dirty = !write && iter_is_iovec(iter); > while (count) { > size_t nres; > fl_owner_t owner = current->files; > @@ -1360,7 +1361,7 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, > nres = fuse_send_read(req, io, pos, nbytes, owner); > > if (!io->async) > - fuse_release_user_pages(req, should_dirty); > + fuse_release_user_pages(req, io->should_dirty); > if (req->out.h.error) { > err = req->out.h.error; > break; > @@ -2872,6 +2873,7 @@ static inline loff_t fuse_round_up(loff_t off) > io->size = 0; > io->offset = offset; > io->write = (iov_iter_rw(iter) == WRITE); > + io->should_dirty = true; > io->err = 0; > io->file = file; > /* > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h > index 1bd7ffd..2ef205e 100644 > --- a/fs/fuse/fuse_i.h > +++ b/fs/fuse/fuse_i.h > @@ -249,6 +249,7 @@ struct fuse_io_priv { > size_t size; > __u64 offset; > bool write; > + bool should_dirty; > int err; > struct kiocb *iocb; > struct file *file; > @@ -261,6 +262,7 @@ struct fuse_io_priv { > .refcnt = KREF_INIT(1), \ > .async = 0, \ > .file = f, \ > + .should_dirty = true, \ > } > > /**