* Re: [PATCH v2 4/7] fs/proc/task_mmu.c: shift mm_access() from m_start() to proc_maps_open()
[not found] ` <20140805194655.GA30728@redhat.com>
@ 2014-12-03 14:14 ` Kirill A. Shutemov
2014-12-03 16:59 ` Eric W. Biederman
2014-12-03 17:34 ` Oleg Nesterov
0 siblings, 2 replies; 4+ messages in thread
From: Kirill A. Shutemov @ 2014-12-03 14:14 UTC (permalink / raw)
To: Oleg Nesterov, David S. Miller, Linus Torvalds
Cc: Andrew Morton, Alexander Viro, Cyrill Gorcunov, David Howells,
Eric W. Biederman, Kirill A. Shutemov, Peter Zijlstra,
Sasha Levin, linux-fsdevel, linux-kernel, Alexey Dobriyan, netdev
On Tue, Aug 05, 2014 at 09:46:55PM +0200, Oleg Nesterov wrote:
> A simple test-case from Kirill Shutemov
>
> cat /proc/self/maps >/dev/null
> chmod +x /proc/self/net/packet
> exec /proc/self/net/packet
>
> makes lockdep unhappy: cat and exec take seq_file->lock and
> cred_guard_mutex in the opposite order.
Oleg, I see it again with almost the same test-case:
cat /proc/self/stack >/dev/null
chmod +x /proc/self/net/packet
exec /proc/self/net/packet
Looks like a bunch of proc files were converted to use seq_file by Alexey
Dobriyan around the same time you fixed the issue for /proc/pid/maps.
More generic test-case:
find /proc/self/ -type f -exec dd if='{}' of=/dev/null bs=1 count=1 ';' 2>/dev/null
chmod +x /proc/self/net/packet
exec /proc/self/net/packet
David, any justification for allowing chmod +x for files under
/proc/pid/net?
[ 2.042212] ======================================================
[ 2.042930] [ INFO: possible circular locking dependency detected ]
[ 2.043648] 3.18.0-rc7-00003-g3a18ca061311-dirty #237 Not tainted
[ 2.044350] -------------------------------------------------------
[ 2.045054] sh/94 is trying to acquire lock:
[ 2.045546] (&p->lock){+.+.+.}, at: [<ffffffff811e12fd>] seq_read+0x3d/0x3e0
[ 2.045781]
[ 2.045781] but task is already holding lock:
[ 2.045781] (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff811c0e3d>] prepare_bprm_creds+0x2d/0x90
[ 2.045781]
[ 2.045781] which lock already depends on the new lock.
[ 2.045781]
[ 2.045781]
[ 2.045781] the existing dependency chain (in reverse order) is:
[ 2.045781]
-> #1 (&sig->cred_guard_mutex){+.+.+.}:
[ 2.045781] [<ffffffff810a6e99>] __lock_acquire+0x4d9/0xd40
[ 2.045781] [<ffffffff810a7ff2>] lock_acquire+0xd2/0x2a0
[ 2.045781] [<ffffffff81849da6>] mutex_lock_killable_nested+0x66/0x460
[ 2.045781] [<ffffffff81229de4>] lock_trace+0x24/0x70
[ 2.045781] [<ffffffff81229e8f>] proc_pid_stack+0x5f/0xe0
[ 2.045781] [<ffffffff81227244>] proc_single_show+0x54/0xa0
[ 2.045781] [<ffffffff811e13a0>] seq_read+0xe0/0x3e0
[ 2.045781] [<ffffffff811b9377>] vfs_read+0x97/0x180
[ 2.045781] [<ffffffff811b9f5d>] SyS_read+0x4d/0xc0
[ 2.045781] [<ffffffff8184e492>] system_call_fastpath+0x12/0x17
[ 2.045781]
-> #0 (&p->lock){+.+.+.}:
[ 2.045781] [<ffffffff810a389f>] validate_chain.isra.36+0xfff/0x1400
[ 2.045781] [<ffffffff810a6e99>] __lock_acquire+0x4d9/0xd40
[ 2.045781] [<ffffffff810a7ff2>] lock_acquire+0xd2/0x2a0
[ 2.045781] [<ffffffff81849629>] mutex_lock_nested+0x69/0x3c0
[ 2.045781] [<ffffffff811e12fd>] seq_read+0x3d/0x3e0
[ 2.045781] [<ffffffff81226428>] proc_reg_read+0x48/0x70
[ 2.045781] [<ffffffff811b9377>] vfs_read+0x97/0x180
[ 2.045781] [<ffffffff811bf1a8>] kernel_read+0x48/0x60
[ 2.045781] [<ffffffff811bfb2c>] prepare_binprm+0xdc/0x180
[ 2.045781] [<ffffffff811c139a>] do_execve_common.isra.29+0x4fa/0x960
[ 2.045781] [<ffffffff811c1818>] do_execve+0x18/0x20
[ 2.045781] [<ffffffff811c1b05>] SyS_execve+0x25/0x30
[ 2.045781] [<ffffffff8184ea49>] stub_execve+0x69/0xa0
[ 2.045781]
[ 2.045781] other info that might help us debug this:
[ 2.045781]
[ 2.045781] Possible unsafe locking scenario:
[ 2.045781]
[ 2.045781] CPU0 CPU1
[ 2.045781] ---- ----
[ 2.045781] lock(&sig->cred_guard_mutex);
[ 2.045781] lock(&p->lock);
[ 2.045781] lock(&sig->cred_guard_mutex);
[ 2.045781] lock(&p->lock);
[ 2.045781]
[ 2.045781] *** DEADLOCK ***
[ 2.045781]
[ 2.045781] 1 lock held by sh/94:
[ 2.045781] #0: (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff811c0e3d>] prepare_bprm_creds+0x2d/0x90
[ 2.045781]
[ 2.045781] stack backtrace:
[ 2.045781] CPU: 0 PID: 94 Comm: sh Not tainted 3.18.0-rc7-00003-g3a18ca061311-dirty #237
[ 2.045781] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[ 2.045781] ffffffff82a48d50 ffff88085427bad8 ffffffff81844a85 0000000000000cac
[ 2.045781] ffffffff82a654a0 ffff88085427bb28 ffffffff810a1b03 0000000000000000
[ 2.045781] ffff88085427bb68 ffff88085427bb28 ffff8808547f1500 ffff8808547f1c40
[ 2.045781] Call Trace:
[ 2.045781] [<ffffffff81844a85>] dump_stack+0x4e/0x68
[ 2.045781] [<ffffffff810a1b03>] print_circular_bug+0x203/0x310
[ 2.045781] [<ffffffff810a389f>] validate_chain.isra.36+0xfff/0x1400
[ 2.045781] [<ffffffff8108fa76>] ? local_clock+0x16/0x30
[ 2.045781] [<ffffffff810a6e99>] __lock_acquire+0x4d9/0xd40
[ 2.045781] [<ffffffff810a7ff2>] lock_acquire+0xd2/0x2a0
[ 2.045781] [<ffffffff811e12fd>] ? seq_read+0x3d/0x3e0
[ 2.045781] [<ffffffff81849629>] mutex_lock_nested+0x69/0x3c0
[ 2.045781] [<ffffffff811e12fd>] ? seq_read+0x3d/0x3e0
[ 2.045781] [<ffffffff8108f9f8>] ? sched_clock_cpu+0x98/0xc0
[ 2.045781] [<ffffffff811e12fd>] ? seq_read+0x3d/0x3e0
[ 2.045781] [<ffffffff814050b9>] ? lockref_put_or_lock+0x29/0x40
[ 2.045781] [<ffffffff811e12fd>] seq_read+0x3d/0x3e0
[ 2.045781] [<ffffffff814050b9>] ? lockref_put_or_lock+0x29/0x40
[ 2.045781] [<ffffffff81226428>] proc_reg_read+0x48/0x70
[ 2.045781] [<ffffffff811b9377>] vfs_read+0x97/0x180
[ 2.045781] [<ffffffff811bf1a8>] kernel_read+0x48/0x60
[ 2.045781] [<ffffffff811bfb2c>] prepare_binprm+0xdc/0x180
[ 2.045781] [<ffffffff811c139a>] do_execve_common.isra.29+0x4fa/0x960
[ 2.092142] tsc: Refined TSC clocksource calibration: 2693.484 MHz
[ 2.045781] [<ffffffff811c0fd3>] ? do_execve_common.isra.29+0x133/0x960
[ 2.045781] [<ffffffff8184f04d>] ? retint_swapgs+0xe/0x13
[ 2.045781] [<ffffffff811c1818>] do_execve+0x18/0x20
[ 2.045781] [<ffffffff811c1b05>] SyS_execve+0x25/0x30
[ 2.045781] [<ffffffff8184ea49>] stub_execve+0x69/0xa0
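The report above reduces to two acquisition orders: the read path takes seq_file's p->lock and then cred_guard_mutex, while the exec path takes cred_guard_mutex and then p->lock. A minimal sketch of that kind of lock-order check follows. It is a toy user-space model, not lockdep's actual algorithm; the class name `LockOrderGraph` and its methods are hypothetical and exist only for illustration.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order tracker (hypothetical, not lockdep itself).

    An edge A -> B means "B was acquired while A was held".
    """

    def __init__(self):
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search over the recorded dependencies.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record taking `new` while holding `held`.

        Returns True if `held` is already reachable from `new`, meaning
        the new edge closes a cycle (a possible ABBA deadlock).
        """
        inversion = self._reachable(new, held)
        self.edges[held].add(new)
        return inversion


if __name__ == "__main__":
    graph = LockOrderGraph()
    # read(/proc/self/stack): seq_read() holds p->lock, then lock_trace()
    # takes cred_guard_mutex.
    print(graph.acquire("p->lock", "cred_guard_mutex"))
    # exec(/proc/self/net/packet): prepare_bprm_creds() holds
    # cred_guard_mutex, then kernel_read() -> seq_read() takes p->lock.
    print(graph.acquire("cred_guard_mutex", "p->lock"))
```

The second call is the one that reports the inversion, which matches the "#1" and "#0" chains in the log: neither path is wrong on its own, only the combination is.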
--
Kirill A. Shutemov
* Re: [PATCH v2 4/7] fs/proc/task_mmu.c: shift mm_access() from m_start() to proc_maps_open()
2014-12-03 14:14 ` [PATCH v2 4/7] fs/proc/task_mmu.c: shift mm_access() from m_start() to proc_maps_open() Kirill A. Shutemov
@ 2014-12-03 16:59 ` Eric W. Biederman
2014-12-04 16:17 ` Kirill A. Shutemov
2014-12-03 17:34 ` Oleg Nesterov
1 sibling, 1 reply; 4+ messages in thread
From: Eric W. Biederman @ 2014-12-03 16:59 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Oleg Nesterov, David S. Miller, Linus Torvalds, Andrew Morton,
Alexander Viro, Cyrill Gorcunov, David Howells,
Kirill A. Shutemov, Peter Zijlstra, Sasha Levin, linux-fsdevel,
linux-kernel, Alexey Dobriyan, netdev
"Kirill A. Shutemov" <kirill@shutemov.name> writes:
> On Tue, Aug 05, 2014 at 09:46:55PM +0200, Oleg Nesterov wrote:
>> A simple test-case from Kirill Shutemov
>>
>> cat /proc/self/maps >/dev/null
>> chmod +x /proc/self/net/packet
>> exec /proc/self/net/packet
>>
>> makes lockdep unhappy: cat and exec take seq_file->lock and
>> cred_guard_mutex in the opposite order.
>
> Oleg, I see it again with almost the same test-case:
>
> cat /proc/self/stack >/dev/null
> chmod +x /proc/self/net/packet
> exec /proc/self/net/packet
>
> Looks like a bunch of proc files were converted to use seq_file by Alexey
> Dobriyan around the same time you fixed the issue for /proc/pid/maps.
>
> More generic test-case:
>
> find /proc/self/ -type f -exec dd if='{}' of=/dev/null bs=1 count=1 ';' 2>/dev/null
> chmod +x /proc/self/net/packet
> exec /proc/self/net/packet
>
> David, any justification for allowing chmod +x for files under
> /proc/pid/net?
I don't think there are any good reasons for allowing chmod +x for the
proc generic files. Certainly executing any of them is nonsense.
I do recall some weird corner cases existing. I think they resulted
in a need to preserve chmod if not chmod +x. This is just me saying
tread carefully before you change anything.
It really should be safe to tweak proc_notify_change to not allow
messing with the executable bits of proc files.
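The rule suggested above (allow chmod on proc files, but refuse any change to the execute bits) can be sketched as a simple mode comparison. This is an illustrative user-space model under stated assumptions, not the kernel's proc_notify_change(); the helper name `mode_change_allowed` is hypothetical.

```python
import stat

# All three execute bits (user, group, other), mirroring the kernel's S_IXUGO.
S_IXUGO = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def mode_change_allowed(old_mode: int, new_mode: int) -> bool:
    """Hypothetical policy check: permit a chmod only if it leaves
    every execute bit exactly as it was."""
    changed_bits = old_mode ^ new_mode
    return (changed_bits & S_IXUGO) == 0


if __name__ == "__main__":
    # chmod u+w on a 0444 file: execute bits untouched.
    print(mode_change_allowed(0o444, 0o644))
    # chmod +x on a 0444 file: execute bits would change.
    print(mode_change_allowed(0o444, 0o555))
```

Comparing with XOR rejects both setting and clearing execute bits, which is the stricter reading of "not allow messing with the executable bits"; a variant that only rejects newly added bits would use `new_mode & ~old_mode` instead.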
> [ 2.042212] ======================================================
> [ 2.042930] [ INFO: possible circular locking dependency detected ]
> [ 2.043648] 3.18.0-rc7-00003-g3a18ca061311-dirty #237 Not tainted
> [ 2.044350] -------------------------------------------------------
> [ 2.045054] sh/94 is trying to acquire lock:
> [ 2.045546] (&p->lock){+.+.+.}, at: [<ffffffff811e12fd>] seq_read+0x3d/0x3e0
> [ 2.045781]
> [ 2.045781] but task is already holding lock:
> [ 2.045781] (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff811c0e3d>] prepare_bprm_creds+0x2d/0x90
> [ 2.045781]
> [ 2.045781] which lock already depends on the new lock.
> [ 2.045781]
> [ 2.045781]
> [ 2.045781] the existing dependency chain (in reverse order) is:
> [ 2.045781]
> -> #1 (&sig->cred_guard_mutex){+.+.+.}:
> [ 2.045781] [<ffffffff810a6e99>] __lock_acquire+0x4d9/0xd40
> [ 2.045781] [<ffffffff810a7ff2>] lock_acquire+0xd2/0x2a0
> [ 2.045781] [<ffffffff81849da6>] mutex_lock_killable_nested+0x66/0x460
> [ 2.045781] [<ffffffff81229de4>] lock_trace+0x24/0x70
> [ 2.045781] [<ffffffff81229e8f>] proc_pid_stack+0x5f/0xe0
> [ 2.045781] [<ffffffff81227244>] proc_single_show+0x54/0xa0
> [ 2.045781] [<ffffffff811e13a0>] seq_read+0xe0/0x3e0
> [ 2.045781] [<ffffffff811b9377>] vfs_read+0x97/0x180
> [ 2.045781] [<ffffffff811b9f5d>] SyS_read+0x4d/0xc0
> [ 2.045781] [<ffffffff8184e492>] system_call_fastpath+0x12/0x17
> [ 2.045781]
> -> #0 (&p->lock){+.+.+.}:
> [ 2.045781] [<ffffffff810a389f>] validate_chain.isra.36+0xfff/0x1400
> [ 2.045781] [<ffffffff810a6e99>] __lock_acquire+0x4d9/0xd40
> [ 2.045781] [<ffffffff810a7ff2>] lock_acquire+0xd2/0x2a0
> [ 2.045781] [<ffffffff81849629>] mutex_lock_nested+0x69/0x3c0
> [ 2.045781] [<ffffffff811e12fd>] seq_read+0x3d/0x3e0
> [ 2.045781] [<ffffffff81226428>] proc_reg_read+0x48/0x70
> [ 2.045781] [<ffffffff811b9377>] vfs_read+0x97/0x180
> [ 2.045781] [<ffffffff811bf1a8>] kernel_read+0x48/0x60
> [ 2.045781] [<ffffffff811bfb2c>] prepare_binprm+0xdc/0x180
> [ 2.045781] [<ffffffff811c139a>] do_execve_common.isra.29+0x4fa/0x960
> [ 2.045781] [<ffffffff811c1818>] do_execve+0x18/0x20
> [ 2.045781] [<ffffffff811c1b05>] SyS_execve+0x25/0x30
> [ 2.045781] [<ffffffff8184ea49>] stub_execve+0x69/0xa0
> [ 2.045781]
> [ 2.045781] other info that might help us debug this:
> [ 2.045781]
> [ 2.045781] Possible unsafe locking scenario:
> [ 2.045781]
> [ 2.045781] CPU0 CPU1
> [ 2.045781] ---- ----
> [ 2.045781] lock(&sig->cred_guard_mutex);
> [ 2.045781] lock(&p->lock);
> [ 2.045781] lock(&sig->cred_guard_mutex);
> [ 2.045781] lock(&p->lock);
> [ 2.045781]
> [ 2.045781] *** DEADLOCK ***
> [ 2.045781]
> [ 2.045781] 1 lock held by sh/94:
> [ 2.045781] #0: (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff811c0e3d>] prepare_bprm_creds+0x2d/0x90
> [ 2.045781]
> [ 2.045781] stack backtrace:
> [ 2.045781] CPU: 0 PID: 94 Comm: sh Not tainted 3.18.0-rc7-00003-g3a18ca061311-dirty #237
> [ 2.045781] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
> [ 2.045781] ffffffff82a48d50 ffff88085427bad8 ffffffff81844a85 0000000000000cac
> [ 2.045781] ffffffff82a654a0 ffff88085427bb28 ffffffff810a1b03 0000000000000000
> [ 2.045781] ffff88085427bb68 ffff88085427bb28 ffff8808547f1500 ffff8808547f1c40
> [ 2.045781] Call Trace:
> [ 2.045781] [<ffffffff81844a85>] dump_stack+0x4e/0x68
> [ 2.045781] [<ffffffff810a1b03>] print_circular_bug+0x203/0x310
> [ 2.045781] [<ffffffff810a389f>] validate_chain.isra.36+0xfff/0x1400
> [ 2.045781] [<ffffffff8108fa76>] ? local_clock+0x16/0x30
> [ 2.045781] [<ffffffff810a6e99>] __lock_acquire+0x4d9/0xd40
> [ 2.045781] [<ffffffff810a7ff2>] lock_acquire+0xd2/0x2a0
> [ 2.045781] [<ffffffff811e12fd>] ? seq_read+0x3d/0x3e0
> [ 2.045781] [<ffffffff81849629>] mutex_lock_nested+0x69/0x3c0
> [ 2.045781] [<ffffffff811e12fd>] ? seq_read+0x3d/0x3e0
> [ 2.045781] [<ffffffff8108f9f8>] ? sched_clock_cpu+0x98/0xc0
> [ 2.045781] [<ffffffff811e12fd>] ? seq_read+0x3d/0x3e0
> [ 2.045781] [<ffffffff814050b9>] ? lockref_put_or_lock+0x29/0x40
> [ 2.045781] [<ffffffff811e12fd>] seq_read+0x3d/0x3e0
> [ 2.045781] [<ffffffff814050b9>] ? lockref_put_or_lock+0x29/0x40
> [ 2.045781] [<ffffffff81226428>] proc_reg_read+0x48/0x70
> [ 2.045781] [<ffffffff811b9377>] vfs_read+0x97/0x180
> [ 2.045781] [<ffffffff811bf1a8>] kernel_read+0x48/0x60
> [ 2.045781] [<ffffffff811bfb2c>] prepare_binprm+0xdc/0x180
> [ 2.045781] [<ffffffff811c139a>] do_execve_common.isra.29+0x4fa/0x960
> [ 2.092142] tsc: Refined TSC clocksource calibration: 2693.484 MHz
> [ 2.045781] [<ffffffff811c0fd3>] ? do_execve_common.isra.29+0x133/0x960
> [ 2.045781] [<ffffffff8184f04d>] ? retint_swapgs+0xe/0x13
> [ 2.045781] [<ffffffff811c1818>] do_execve+0x18/0x20
> [ 2.045781] [<ffffffff811c1b05>] SyS_execve+0x25/0x30
> [ 2.045781] [<ffffffff8184ea49>] stub_execve+0x69/0xa0
* Re: [PATCH v2 4/7] fs/proc/task_mmu.c: shift mm_access() from m_start() to proc_maps_open()
2014-12-03 14:14 ` [PATCH v2 4/7] fs/proc/task_mmu.c: shift mm_access() from m_start() to proc_maps_open() Kirill A. Shutemov
2014-12-03 16:59 ` Eric W. Biederman
@ 2014-12-03 17:34 ` Oleg Nesterov
1 sibling, 0 replies; 4+ messages in thread
From: Oleg Nesterov @ 2014-12-03 17:34 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: David S. Miller, Linus Torvalds, Andrew Morton, Alexander Viro,
Cyrill Gorcunov, David Howells, Eric W. Biederman,
Kirill A. Shutemov, Peter Zijlstra, Sasha Levin, linux-fsdevel,
linux-kernel, Alexey Dobriyan, netdev
On 12/03, Kirill A. Shutemov wrote:
>
> On Tue, Aug 05, 2014 at 09:46:55PM +0200, Oleg Nesterov wrote:
> > A simple test-case from Kirill Shutemov
> >
> > cat /proc/self/maps >/dev/null
> > chmod +x /proc/self/net/packet
> > exec /proc/self/net/packet
> >
> > makes lockdep unhappy: cat and exec take seq_file->lock and
> > cred_guard_mutex in the opposite order.
>
> Oleg, I see it again with almost the same test-case:
>
> cat /proc/self/stack >/dev/null
> chmod +x /proc/self/net/packet
> exec /proc/self/net/packet
Yes, there are more lock_trace/mm_access (ab)users. Fortunately, they
are much simpler than proc/pid/maps (which also asked for other cleanups
and fixes).
I'll try to take a look, thanks for the reminder.
And I agree with Eric, chmod +x probably makes no sense. Still, I think
this code deserves some cleanups regardless, to the point that I think
lock_trace() should probably die.
Thanks!
Oleg.
* Re: [PATCH v2 4/7] fs/proc/task_mmu.c: shift mm_access() from m_start() to proc_maps_open()
2014-12-03 16:59 ` Eric W. Biederman
@ 2014-12-04 16:17 ` Kirill A. Shutemov
0 siblings, 0 replies; 4+ messages in thread
From: Kirill A. Shutemov @ 2014-12-04 16:17 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Oleg Nesterov, David S. Miller, Linus Torvalds, Andrew Morton,
Alexander Viro, Cyrill Gorcunov, David Howells,
Kirill A. Shutemov, Peter Zijlstra, Sasha Levin, linux-fsdevel,
linux-kernel, Alexey Dobriyan, netdev
On Wed, Dec 03, 2014 at 10:59:57AM -0600, Eric W. Biederman wrote:
> "Kirill A. Shutemov" <kirill@shutemov.name> writes:
>
> > On Tue, Aug 05, 2014 at 09:46:55PM +0200, Oleg Nesterov wrote:
> >> A simple test-case from Kirill Shutemov
> >>
> >> cat /proc/self/maps >/dev/null
> >> chmod +x /proc/self/net/packet
> >> exec /proc/self/net/packet
> >>
> >> makes lockdep unhappy: cat and exec take seq_file->lock and
> >> cred_guard_mutex in the opposite order.
> >
> > Oleg, I see it again with almost the same test-case:
> >
> > cat /proc/self/stack >/dev/null
> > chmod +x /proc/self/net/packet
> > exec /proc/self/net/packet
> >
> > Looks like a bunch of proc files were converted to use seq_file by Alexey
> > Dobriyan around the same time you fixed the issue for /proc/pid/maps.
> >
> > More generic test-case:
> >
> > find /proc/self/ -type f -exec dd if='{}' of=/dev/null bs=1 count=1 ';' 2>/dev/null
> > chmod +x /proc/self/net/packet
> > exec /proc/self/net/packet
> >
> > David, any justification for allowing chmod +x for files under
> > /proc/pid/net?
>
> I don't think there are any good reasons for allowing chmod +x for the
> proc generic files. Certainly executing any of them is nonsense.
>
> I do recall some weird corner cases existing. I think they resulted
> in a need to preserve chmod if not chmod +x. This is just me saying
> tread carefully before you change anything.
>
> It really should be safe to tweak proc_notify_change to not allow
> messing with the executable bits of proc files.
BTW, we've had MS_NOSUID and MS_NOEXEC set in ->s_flags for procfs since
2006; see 92d032855e64.
But there's no code that translates them into vfsmount->mnt_flags |=
MNT_NOSUID/MNT_NOEXEC, so we bypass the nosuid/noexec checks on the exec path.
Hm?..
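To make the gap concrete, here is a small model of the translation that is described as missing. It is only a sketch: the helper names `sb_flags_to_mnt_flags` and `exec_allowed` are hypothetical, the flag values are copied from the Linux headers as I understand them and should be treated as assumptions, and the real exec path checks more than this.

```python
# Superblock flags, values as in the Linux UAPI fs.h (assumed).
MS_NOSUID = 0x02
MS_NOEXEC = 0x08

# Per-mount flags, values as in include/linux/mount.h (assumed).
MNT_NOSUID = 0x01
MNT_NOEXEC = 0x04


def sb_flags_to_mnt_flags(s_flags: int) -> int:
    """Hypothetical translation of superblock flags into mount flags."""
    mnt_flags = 0
    if s_flags & MS_NOSUID:
        mnt_flags |= MNT_NOSUID
    if s_flags & MS_NOEXEC:
        mnt_flags |= MNT_NOEXEC
    return mnt_flags


def exec_allowed(mnt_flags: int) -> bool:
    """Simplified stand-in for a noexec check that looks only at mount flags."""
    return not (mnt_flags & MNT_NOEXEC)


if __name__ == "__main__":
    procfs_s_flags = MS_NOSUID | MS_NOEXEC
    # Without translation the mount flags stay 0 and the check passes.
    print(exec_allowed(0))
    # With translation the same check refuses the exec.
    print(exec_allowed(sb_flags_to_mnt_flags(procfs_s_flags)))
```

The point of the model is that a check keyed on mount flags never sees flags that live only in ->s_flags, so setting them on the superblock alone has no effect on exec.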
--
Kirill A. Shutemov