public inbox for linux-xfs@vger.kernel.org
* splice vs truncate lockdep splat
@ 2016-03-20 13:43 Sage Weil
  2016-03-20 23:16 ` Dave Chinner
  0 siblings, 1 reply; 3+ messages in thread
From: Sage Weil @ 2016-03-20 13:43 UTC (permalink / raw)
  To: xfs

Hi all,

For the last few kernel releases we've been seeing this pretty regularly:

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 4.5.0-rc3-ceph-00790-g2364ac5 #1 Not tainted
 -------------------------------------------------------
 tp_fstore_op/25647 is trying to acquire lock:
 (&sb->s_type->i_mutex_key#19){+.+.+.}, at: [<ffffffffa0574bad>] xfs_file_buffered_aio_write+0x6d/0x300 [xfs]
 but task is already holding lock:
 (&pipe->mutex/1){+.+.+.}, at: [<ffffffff8126347d>] pipe_lock+0x1d/0x20
 which lock already depends on the new lock.
 the existing dependency chain (in reverse order) is:
 -> #2 (&pipe->mutex/1){+.+.+.}:
       [<ffffffff81100842>] lock_acquire+0x162/0x1b0
       [<ffffffff817d77c9>] mutex_lock_nested+0x79/0x380
       [<ffffffff8126347d>] pipe_lock+0x1d/0x20
       [<ffffffff81290d20>] splice_to_pipe+0x30/0x250
       [<ffffffff81292379>] __generic_file_splice_read+0x4f9/0x620
       [<ffffffff812924de>] generic_file_splice_read+0x3e/0x70
       [<ffffffffa05738d4>] xfs_file_splice_read+0x194/0x220 [xfs]
       [<ffffffff812912da>] do_splice_to+0x6a/0x80
       [<ffffffff812934e1>] SyS_splice+0x6c1/0x720
       [<ffffffff817da4b2>] entry_SYSCALL_64_fastpath+0x12/0x76
 -> #1 (&(&ip->i_iolock)->mr_lock){++++++}:
       [<ffffffff81100842>] lock_acquire+0x162/0x1b0
       [<ffffffff810fa37f>] down_write_nested+0x4f/0x80
       [<ffffffffa058106d>] xfs_ilock+0xed/0x1b0 [xfs]
       [<ffffffffa05801c0>] xfs_vn_setattr+0x30/0x70 [xfs]
       [<ffffffff8127a09d>] notify_change+0x1ed/0x320
       [<ffffffff8125843c>] do_truncate+0x7c/0xc0
       [<ffffffff81258843>] do_sys_ftruncate.constprop.12+0x153/0x170
       [<ffffffff8125889e>] SyS_ftruncate+0xe/0x10
       [<ffffffff817da4b2>] entry_SYSCALL_64_fastpath+0x12/0x76
 -> #0 (&sb->s_type->i_mutex_key#19){+.+.+.}:
       [<ffffffff810ff57e>] __lock_acquire+0x13be/0x1e30
       [<ffffffff81100842>] lock_acquire+0x162/0x1b0
       [<ffffffff817d77c9>] mutex_lock_nested+0x79/0x380
       [<ffffffffa0574bad>] xfs_file_buffered_aio_write+0x6d/0x300 [xfs]
       [<ffffffffa0574ed5>] xfs_file_write_iter+0x95/0x140 [xfs]
       [<ffffffff81259ee5>] vfs_iter_write+0x75/0xb0
       [<ffffffff81291a51>] iter_file_splice_write+0x281/0x3a0
       [<ffffffff812932fa>] SyS_splice+0x4da/0x720
       [<ffffffff817da4b2>] entry_SYSCALL_64_fastpath+0x12/0x76
 other info that might help us debug this:
 Chain exists of:
   &sb->s_type->i_mutex_key#19 --> &(&ip->i_iolock)->mr_lock --> &pipe->mutex/1
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&pipe->mutex/1);
                               lock(&(&ip->i_iolock)->mr_lock);
                               lock(&pipe->mutex/1);
  lock(&sb->s_type->i_mutex_key#19);
  *** DEADLOCK ***
 2 locks held by tp_fstore_op/25647:
 #0:  (sb_writers#15){.+.+.+}, at: [<ffffffff8125db7d>] __sb_start_write+0xad/0xe0
 #1:  (&pipe->mutex/1){+.+.+.}, at: [<ffffffff8126347d>] pipe_lock+0x1d/0x20
 stack backtrace:
 CPU: 4 PID: 25647 Comm: tp_fstore_op Not tainted 4.5.0-rc3-ceph-00790-g2364ac5 #1
 Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 1.0c 09/07/2015
 0000000000000086 00000000547c0aa5 ffff880858c0bad8 ffffffff813f8661
 ffffffff827127a0 ffffffff82711870 ffff880858c0bb18 ffffffff811c5d39
 ffff880858c0bb70 ffff8808530288d8 0000000000000001 0000000000000002
 Call Trace:
 [<ffffffff813f8661>] dump_stack+0x85/0xc4
 [<ffffffff811c5d39>] print_circular_bug+0x200/0x20e
 [<ffffffff810ff57e>] __lock_acquire+0x13be/0x1e30
 [<ffffffff81100842>] lock_acquire+0x162/0x1b0
 [<ffffffffa0574bad>] ? xfs_file_buffered_aio_write+0x6d/0x300 [xfs]
 [<ffffffff817d77c9>] mutex_lock_nested+0x79/0x380
 [<ffffffffa0574bad>] ? xfs_file_buffered_aio_write+0x6d/0x300 [xfs]
 [<ffffffff810d338d>] ? ttwu_stat+0x19d/0x200
 [<ffffffffa0574bad>] xfs_file_buffered_aio_write+0x6d/0x300 [xfs]
 [<ffffffff810fdc6f>] ? mark_held_locks+0x6f/0x80
 [<ffffffff810fdde3>] ? trace_hardirqs_on_caller+0x163/0x1d0
 [<ffffffffa0574ed5>] xfs_file_write_iter+0x95/0x140 [xfs]
 [<ffffffff81259ee5>] vfs_iter_write+0x75/0xb0
 [<ffffffff81291a51>] iter_file_splice_write+0x281/0x3a0
 [<ffffffff812932fa>] SyS_splice+0x4da/0x720
 [<ffffffff810031d6>] ? do_audit_syscall_entry+0x66/0x70
 [<ffffffff817da4b2>] entry_SYSCALL_64_fastpath+0x12/0x76

We're not actually doing racing truncate and splice (or any racing 
operations for that matter) to the same files, so we don't actually hit 
this, but the lockdep warning is enough to make our test runs fail, and 
we'd rather not whitelist this if it's something that should get fixed 
anyway.

My apologies for not raising this sooner; I thought I'd already emailed 
the list about this a few months ago but it looks like I'd only pinged 
#xfs.

sage

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: splice vs truncate lockdep splat
  2016-03-20 13:43 splice vs truncate lockdep splat Sage Weil
@ 2016-03-20 23:16 ` Dave Chinner
  2016-03-20 23:29   ` Dave Chinner
  0 siblings, 1 reply; 3+ messages in thread
From: Dave Chinner @ 2016-03-20 23:16 UTC (permalink / raw)
  To: Sage Weil; +Cc: xfs

On Sun, Mar 20, 2016 at 09:43:01AM -0400, Sage Weil wrote:
> Hi all,
> 
> For the last few kernel releases we've been seeing this pretty regularly:
> 

[snip]

Yup, splice still has locking problems. Always has, and the
write_iter rework a few releases ago just moved the warnings
elsewhere.

>  -> #2 (&pipe->mutex/1){+.+.+.}:

xfs_file_splice_read
  xfs_iolock(XFS_IOLOCK_SHARED)
    ....
    pipe_lock(pipe)

XFS takes the XFS_IOLOCK_SHARED here to serialise splice read
against truncate.

>  -> #1 (&(&ip->i_iolock)->mr_lock){++++++}:

mutex_lock(inode->i_mutex)
....
  xfs_vn_setattr
    xfs_iolock(XFS_IOLOCK_EXCL)

Truncate lock order for IO serialisation (no pipe lock).

>  -> #0 (&sb->s_type->i_mutex_key#19){+.+.+.}:

iter_file_splice_write
  pipe_lock(pipe)
    xfs_file_buffered_aio_write
      mutex_lock(inode->i_mutex)
        xfs_iolock(XFS_IOLOCK_EXCL)

XFS takes i_mutex (required for buffered IO path) and
XFS_IOLOCK_EXCL (for atomic write semantics and truncate exclusion),
but the splice write path has put this under the pipe lock....

i.e. because the read and write paths take the IOLOCK in different
ways, lockdep seems to think it's OK. Hence IO testing doesn't report
issues, but the moment you throw in another operation that takes
both the i_mutex and XFS_IOLOCK_EXCL, lockdep will throw a lock
order violation report.

> We're not actually doing racing truncate and splice (or any racing 
> operations for that matter) to the same files, so we don't actually hit 
> this, but the lockdep warning is enough to make our test runs fail, and 
> we'd rather not whitelist this if it's something that should get fixed 
> anyway.

We can't fix it in XFS - the high level splice infrastructure assumes
that read IO serialisation is done entirely based on page cache page
locking and so can hold the pipe lock across ->splice_read callouts.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: splice vs truncate lockdep splat
  2016-03-20 23:16 ` Dave Chinner
@ 2016-03-20 23:29   ` Dave Chinner
  0 siblings, 0 replies; 3+ messages in thread
From: Dave Chinner @ 2016-03-20 23:29 UTC (permalink / raw)
  To: Sage Weil; +Cc: xfs

On Mon, Mar 21, 2016 at 10:16:38AM +1100, Dave Chinner wrote:
> On Sun, Mar 20, 2016 at 09:43:01AM -0400, Sage Weil wrote:
> > Hi all,
> > 
> > For the last few kernel releases we've been seeing this pretty regularly:
> > 
> 
> [snip]
> 
> Yup, splice still has locking problems. Always has, and the
> write_iter rework a few releases ago just moved the warnings
> elsewhere.
> 
> >  -> #2 (&pipe->mutex/1){+.+.+.}:
> 
> xfs_file_splice_read
>   xfs_iolock(XFS_IOLOCK_SHARED)
>     ....
>     pipe_lock(pipe)
> 
> XFS takes the XFS_IOLOCK_SHARED here to serialise splice read
> against truncate.
> 
> >  -> #1 (&(&ip->i_iolock)->mr_lock){++++++}:
> 
> mutex_lock(inode->i_mutex)
> ....
>   xfs_vn_setattr
>     xfs_iolock(XFS_IOLOCK_EXCL)
> 
> Truncate lock order for IO serialisation (no pipe lock).
> 
> >  -> #0 (&sb->s_type->i_mutex_key#19){+.+.+.}:
> 
> iter_file_splice_write
>   pipe_lock(pipe)
>     xfs_file_buffered_aio_write
>       mutex_lock(inode->i_mutex)
>         xfs_iolock(XFS_IOLOCK_EXCL)
> 
> XFS takes i_mutex (required for buffered IO path) and
> XFS_IOLOCK_EXCL (for atomic write semantics and truncate exclusion),
> but the splice write path has put this under the pipe lock....
> 
> i.e. because the read and write paths take the IOLOCK in different
> ways, lockdep seems to think it's OK. Hence IO testing doesn't report
> issues, but the moment you throw in another operation that takes
> both the i_mutex and XFS_IOLOCK_EXCL, lockdep will throw a lock
> order violation report.
> 
> > We're not actually doing racing truncate and splice (or any racing 
> > operations for that matter) to the same files, so we don't actually hit 
> > this, but the lockdep warning is enough to make our test runs fail, and 
> > we'd rather not whitelist this if it's something that should get fixed 
> > anyway.
> 
> We can't fix it in XFS - the high level splice infrastructure assumes
> that read IO serialisation is done entirely based on page cache page
> locking and so can hold the pipe lock across ->splice_read callouts.
                                                 ^^^^^^^^^^^
						vfs_iter_write()

Got my read and write callouts mixed up there. (i.e. splice_read is
assumed to have no filesystem locking above the pipe lock, so it is
assumed to be safe to take inode locks in vfs_iter_write() whilst
under the pipe lock).

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


