* pipe/cred lockdep warning
@ 2013-10-01 14:57 Dave Jones
2013-10-03 18:56 ` Al Viro
0 siblings, 1 reply; 5+ messages in thread
From: Dave Jones @ 2013-10-01 14:57 UTC (permalink / raw)
To: Linux Kernel; +Cc: ebiederm, Al Viro, Linus Torvalds
I seem to recall hitting this quite a while ago. Does it look familiar ?
Either it didn't get fixed, or it's back..
Dave
[ 2836.628351] ======================================================
[ 2836.628392] [ INFO: possible circular locking dependency detected ]
[ 2836.628436] 3.12.0-rc3+ #90 Not tainted
[ 2836.628464] -------------------------------------------------------
[ 2836.628506] trinity-child0/30759 is trying to acquire lock:
[ 2836.628544] (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff81239045>] proc_pid_attr_write+0xf5/0x140
[ 2836.628624]
but task is already holding lock:
[ 2836.628664] (&pipe->mutex/1){+.+.+.}, at: [<ffffffff811ccf66>] pipe_lock+0x26/0x30
[ 2836.628734]
which lock already depends on the new lock.
[ 2836.628787]
the existing dependency chain (in reverse order) is:
[ 2836.628837]
-> #2 (&pipe->mutex/1){+.+.+.}:
[ 2836.628888] [<ffffffff810cc1e3>] lock_acquire+0x93/0x200
[ 2836.628933] [<ffffffff8172de5c>] mutex_lock_nested+0x7c/0x420
[ 2836.628981] [<ffffffff811ccf66>] pipe_lock+0x26/0x30
[ 2836.629023] [<ffffffff811f6e94>] generic_file_splice_write+0x64/0x170
[ 2836.629076] [<ffffffffa02dc710>] xfs_file_splice_write+0xb0/0x230 [xfs]
[ 2836.629152] [<ffffffff811f8dca>] SyS_splice+0x24a/0x7e0
[ 2836.629196] [<ffffffff8173ce24>] tracesys+0xdd/0xe2
[ 2836.629238]
-> #1 (&(&ip->i_iolock)->mr_lock){++++++}:
[ 2836.629291] [<ffffffff810cc1e3>] lock_acquire+0x93/0x200
[ 2836.629335] [<ffffffff81089562>] down_read_nested+0x52/0xa0
[ 2836.630980] [<ffffffffa033b7a0>] xfs_ilock+0x1d0/0x280 [xfs]
[ 2836.632649] [<ffffffffa02dc161>] xfs_file_aio_read+0x111/0x3e0 [xfs]
[ 2836.634313] [<ffffffff811c3f1a>] do_sync_read+0x5a/0x90
[ 2836.635957] [<ffffffff811c4551>] vfs_read+0xa1/0x170
[ 2836.637581] [<ffffffff811ca911>] kernel_read+0x41/0x60
[ 2836.639184] [<ffffffff811cab33>] prepare_binprm+0xb3/0x170
[ 2836.640768] [<ffffffff811cc733>] do_execve_common.isra.28+0x5a3/0x990
[ 2836.642344] [<ffffffff811cce06>] SyS_execve+0x36/0x50
[ 2836.643904] [<ffffffff8173d209>] stub_execve+0x69/0xa0
[ 2836.645442]
-> #0 (&sig->cred_guard_mutex){+.+.+.}:
[ 2836.648483] [<ffffffff810cb76a>] __lock_acquire+0x191a/0x1be0
[ 2836.650025] [<ffffffff810cc1e3>] lock_acquire+0x93/0x200
[ 2836.651559] [<ffffffff8172eba7>] mutex_lock_interruptible_nested+0x77/0x510
[ 2836.653110] [<ffffffff81239045>] proc_pid_attr_write+0xf5/0x140
[ 2836.654659] [<ffffffff811c4f92>] __kernel_write+0x72/0x150
[ 2836.656208] [<ffffffff811f7269>] write_pipe_buf+0x49/0x70
[ 2836.657750] [<ffffffff811f6894>] splice_from_pipe_feed+0x84/0x140
[ 2836.659310] [<ffffffff811f6c3e>] __splice_from_pipe+0x6e/0x90
[ 2836.660868] [<ffffffff811f8731>] splice_from_pipe+0x51/0x70
[ 2836.662423] [<ffffffff811f8789>] default_file_splice_write+0x19/0x30
[ 2836.663979] [<ffffffff811f8dca>] SyS_splice+0x24a/0x7e0
[ 2836.665518] [<ffffffff8173ce24>] tracesys+0xdd/0xe2
[ 2836.666999]
other info that might help us debug this:
[ 2836.671264] Chain exists of:
&sig->cred_guard_mutex --> &(&ip->i_iolock)->mr_lock --> &pipe->mutex/1
[ 2836.675587] Possible unsafe locking scenario:
[ 2836.678443] CPU0 CPU1
[ 2836.679832] ---- ----
[ 2836.681187] lock(&pipe->mutex/1);
[ 2836.682548] lock(&(&ip->i_iolock)->mr_lock);
[ 2836.683938] lock(&pipe->mutex/1);
[ 2836.685309] lock(&sig->cred_guard_mutex);
[ 2836.686670]
*** DEADLOCK ***
[ 2836.690616] 2 locks held by trinity-child0/30759:
[ 2836.691921] #0: (sb_writers#4){.+.+.+}, at: [<ffffffff811f92b1>] SyS_splice+0x731/0x7e0
[ 2836.693264] #1: (&pipe->mutex/1){+.+.+.}, at: [<ffffffff811ccf66>] pipe_lock+0x26/0x30
[ 2836.694597]
stack backtrace:
[ 2836.697165] CPU: 0 PID: 30759 Comm: trinity-child0 Not tainted 3.12.0-rc3+ #90
[ 2836.699886] ffffffff825329d0 ffff88009cb61b88 ffffffff81729d13 ffffffff82504b80
[ 2836.701295] ffff88009cb61bc8 ffffffff817260f1 ffff88009cb61c20 ffff88009caa5cb8
[ 2836.702723] ffff88009caa5580 0000000000000001 ffff88009caa5c80 ffff88009caa5cb8
[ 2836.704149] Call Trace:
[ 2836.705547] [<ffffffff81729d13>] dump_stack+0x4e/0x82
[ 2836.706958] [<ffffffff817260f1>] print_circular_bug+0x200/0x20f
[ 2836.708365] [<ffffffff810cb76a>] __lock_acquire+0x191a/0x1be0
[ 2836.709778] [<ffffffff810cc1e3>] lock_acquire+0x93/0x200
[ 2836.711184] [<ffffffff81239045>] ? proc_pid_attr_write+0xf5/0x140
[ 2836.712596] [<ffffffff81239045>] ? proc_pid_attr_write+0xf5/0x140
[ 2836.713983] [<ffffffff8172eba7>] mutex_lock_interruptible_nested+0x77/0x510
[ 2836.715382] [<ffffffff81239045>] ? proc_pid_attr_write+0xf5/0x140
[ 2836.716785] [<ffffffff81239045>] ? proc_pid_attr_write+0xf5/0x140
[ 2836.718171] [<ffffffff811a5b99>] ? alloc_pages_current+0xa9/0x170
[ 2836.719554] [<ffffffff81239045>] proc_pid_attr_write+0xf5/0x140
[ 2836.720926] [<ffffffff811f7220>] ? splice_direct_to_actor+0x1f0/0x1f0
[ 2836.722303] [<ffffffff811c4f92>] __kernel_write+0x72/0x150
[ 2836.723680] [<ffffffff811f7220>] ? splice_direct_to_actor+0x1f0/0x1f0
[ 2836.725052] [<ffffffff811f7269>] write_pipe_buf+0x49/0x70
[ 2836.726420] [<ffffffff811f6894>] splice_from_pipe_feed+0x84/0x140
[ 2836.727805] [<ffffffff811f7220>] ? splice_direct_to_actor+0x1f0/0x1f0
[ 2836.729199] [<ffffffff811f6c3e>] __splice_from_pipe+0x6e/0x90
[ 2836.730589] [<ffffffff811f7220>] ? splice_direct_to_actor+0x1f0/0x1f0
[ 2836.731983] [<ffffffff811f8731>] splice_from_pipe+0x51/0x70
[ 2836.733373] [<ffffffff811f8789>] default_file_splice_write+0x19/0x30
[ 2836.734771] [<ffffffff811f8dca>] SyS_splice+0x24a/0x7e0
[ 2836.736161] [<ffffffff8173ce24>] tracesys+0xdd/0xe2
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: pipe/cred lockdep warning
2013-10-01 14:57 pipe/cred lockdep warning Dave Jones
@ 2013-10-03 18:56 ` Al Viro
2013-10-04 23:27 ` Linus Torvalds
0 siblings, 1 reply; 5+ messages in thread
From: Al Viro @ 2013-10-03 18:56 UTC (permalink / raw)
To: Dave Jones, Linux Kernel, ebiederm, Linus Torvalds
On Tue, Oct 01, 2013 at 10:57:55AM -0400, Dave Jones wrote:
> I seem to recall hitting this quite a while ago. Does it look familiar ?
> Either it didn't get fixed, or it's back..
> [ 2836.628506] trinity-child0/30759 is trying to acquire lock:
> [ 2836.628544] (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff81239045>] proc_pid_attr_write+0xf5/0x140
> [ 2836.628624]
> but task is already holding lock:
> [ 2836.628664] (&pipe->mutex/1){+.+.+.}, at: [<ffffffff811ccf66>] pipe_lock+0x26/0x30
> [ 2836.628734]
splice to /proc/<pid>/attr/<something> [cred_guard shite under pipe->mutex]
vs.
splice from XFS [pipe->mutex under exclusive iolock]
vs.
read from XFS in execve() [shared iolock under cred_guard]
Note, BTW, that splice to /proc/<pid>/attr/<something> is broken.
proc_pid_attr_write() is *not* supposed to allow partial writes at all.
Frankly, I'd consider adding a ->splice_write() instance that would
simply return -EINVAL there...
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: pipe/cred lockdep warning
2013-10-03 18:56 ` Al Viro
@ 2013-10-04 23:27 ` Linus Torvalds
2013-10-05 0:46 ` Al Viro
0 siblings, 1 reply; 5+ messages in thread
From: Linus Torvalds @ 2013-10-04 23:27 UTC (permalink / raw)
To: Al Viro; +Cc: Dave Jones, Linux Kernel, Eric W. Biederman
On Thu, Oct 3, 2013 at 11:56 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Note, BTW, that splice to /proc/<pid>/attr/<something> is broken.
> proc_pid_attr_write() is *not* supposed to allow partial writes at all.
> Frankly, I'd consider adding a ->splice_write() instance that would
> simply return -EINVAL there...
That sounds like the right thing to do.
Or possibly go even further, and say that the default is to return
-EINVAL, and files and filesystems that actually want the
"default_file_splice_write()" semantics have to say so in their d_op
structure.
Hmm?
Linus
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: pipe/cred lockdep warning
2013-10-04 23:27 ` Linus Torvalds
@ 2013-10-05 0:46 ` Al Viro
[not found] ` <CA+55aFzrJ4JpfXxGA6LTd_fe-dEjTHhPEY2bkWEJJ=AF-FfT4Q@mail.gmail.com>
0 siblings, 1 reply; 5+ messages in thread
From: Al Viro @ 2013-10-05 0:46 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Dave Jones, Linux Kernel, Eric W. Biederman
On Fri, Oct 04, 2013 at 04:27:35PM -0700, Linus Torvalds wrote:
> On Thu, Oct 3, 2013 at 11:56 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Note, BTW, that splice to /proc/<pid>/attr/<something> is broken.
> > proc_pid_attr_write() is *not* supposed to allow partial writes at all.
> > Frankly, I'd consider adding a ->splice_write() instance that would
> > simply return -EINVAL there...
>
> That sounds like the right thing to do.
>
> Or possibly go even further, and say that the default is to return
> -EINVAL, and files and filesystems that actually want the
> "default_file_splice_write()" semantics have to say so in their d_op
> structure.
>
> Hmm?
f_op, unfortunately... That's going to be a _lot_ of churn, I'm afraid ;-/
We have 2002 instances of file_operations in the tree. 774 of those
have ->write. 60 have both ->aio_write and ->write, 4 - only ->aio_write.
->splice_write is present only for 25. Which leaves 753 instances of
file_operations to review and decide whether we keep the current behaviour
or start giving -EINVAL. In a lot of those the answer will be EINVAL,
but which ones those are?
We could, I suppose, generate a commit that would add default_file_splice_write
to those 753 instances and then get rid of them one by one (e.g. provide
the same thing under different name and use that name in already reviewed
cases if behaviour is right, so that we could keep track of what's left
to do), but...
a) if we go that way, I would suggest doing the first commit as
right-after-rc1 special
b) I'd expect that review and removal to take at least a full cycle
and contain quite a few mistaken decisions ;-/
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2013-10-05 1:42 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-01 14:57 pipe/cred lockdep warning Dave Jones
2013-10-03 18:56 ` Al Viro
2013-10-04 23:27 ` Linus Torvalds
2013-10-05 0:46 ` Al Viro
[not found] ` <CA+55aFzrJ4JpfXxGA6LTd_fe-dEjTHhPEY2bkWEJJ=AF-FfT4Q@mail.gmail.com>
2013-10-05 1:42 ` Al Viro
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.