netlink: GPF in netlink

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* netlink: GPF in netlink_unicast
@ 2017-03-06 10:54 Dmitry Vyukov
  2017-03-06 18:10 ` Cong Wang
  0 siblings, 1 reply; 10+ messages in thread
From: Dmitry Vyukov @ 2017-03-06 10:54 UTC (permalink / raw)
  To: David Miller, Eric Dumazet, Herbert Xu, Cong Wang,
	Alexei Starovoitov, netdev, LKML
  Cc: syzkaller

Hello,

I've got the following crash while running syzkaller fuzzer on
net-next/8d70eeb84ab277377c017af6a21d0a337025dede:

kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 883 Comm: kauditd Not tainted 4.10.0+ #6
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
task: ffff8801d79f0240 task.stack: ffff8801d7a20000
RIP: 0010:sock_sndtimeo include/net/sock.h:2162 [inline]
RIP: 0010:netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249
RSP: 0018:ffff8801d7a27c38 EFLAGS: 00010206
RAX: 0000000000000056 RBX: ffff8801d7a27cd0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002b0
RBP: ffff8801d7a27cf8 R08: ffffed00385cf286 R09: ffffed00385cf286
R10: 0000000000000006 R11: ffffed00385cf285 R12: 0000000000000000
R13: dffffc0000000000 R14: ffff8801c2fc3c80 R15: 00000000014000c0
FS:  0000000000000000(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020cfd000 CR3: 00000001c758f000 CR4: 00000000001406f0
Call Trace:
 kauditd_send_unicast_skb+0x3c/0x70 kernel/audit.c:482
 kauditd_thread+0x174/0xb00 kernel/audit.c:599
 kthread+0x326/0x3f0 kernel/kthread.c:229
 ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
Code: 44 89 fe e8 56 15 ff ff 8b 8d 70 ff ff ff 49 89 c6 31 c0 85 c9
75 27 e8 b2 b2 f4 fd 49 8d bc 24 b0 02 00 00 48 89 f8 48 c1 e8 03 <42>
80 3c 28 00 0f 85 37 06 00 00 49 8b 84 24 b0 02 00 00 4c 8d
RIP: sock_sndtimeo include/net/sock.h:2162 [inline] RSP: ffff8801d7a27c38
RIP: netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249 RSP:
ffff8801d7a27c38
---[ end trace ad1bba9d457430b6 ]---
Kernel panic - not syncing: Fatal exception


This is not reproducible and seems to be caused by an elusive race.
However, looking at the code I don't see any proper protection of
audit_sock (other than the if (!audit_pid) which is obviously not
enough to protect against races).

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: netlink: GPF in netlink_unicast
  2017-03-06 10:54 netlink: GPF in netlink_unicast Dmitry Vyukov
@ 2017-03-06 18:10 ` Cong Wang
  2017-03-07  4:03   ` Richard Guy Briggs
  0 siblings, 1 reply; 10+ messages in thread
From: Cong Wang @ 2017-03-06 18:10 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: David Miller, Eric Dumazet, Herbert Xu, Alexei Starovoitov,
	netdev, LKML, syzkaller, Richard Guy Briggs

On Mon, Mar 6, 2017 at 2:54 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> Hello,
>
> I've got the following crash while running syzkaller fuzzer on
> net-next/8d70eeb84ab277377c017af6a21d0a337025dede:
>
> kasan: GPF could be caused by NULL-ptr deref or user memory access
> general protection fault: 0000 [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 883 Comm: kauditd Not tainted 4.10.0+ #6
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> task: ffff8801d79f0240 task.stack: ffff8801d7a20000
> RIP: 0010:sock_sndtimeo include/net/sock.h:2162 [inline]
> RIP: 0010:netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249
> RSP: 0018:ffff8801d7a27c38 EFLAGS: 00010206
> RAX: 0000000000000056 RBX: ffff8801d7a27cd0 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002b0
> RBP: ffff8801d7a27cf8 R08: ffffed00385cf286 R09: ffffed00385cf286
> R10: 0000000000000006 R11: ffffed00385cf285 R12: 0000000000000000
> R13: dffffc0000000000 R14: ffff8801c2fc3c80 R15: 00000000014000c0
> FS:  0000000000000000(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000020cfd000 CR3: 00000001c758f000 CR4: 00000000001406f0
> Call Trace:
>  kauditd_send_unicast_skb+0x3c/0x70 kernel/audit.c:482
>  kauditd_thread+0x174/0xb00 kernel/audit.c:599
>  kthread+0x326/0x3f0 kernel/kthread.c:229
>  ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
> Code: 44 89 fe e8 56 15 ff ff 8b 8d 70 ff ff ff 49 89 c6 31 c0 85 c9
> 75 27 e8 b2 b2 f4 fd 49 8d bc 24 b0 02 00 00 48 89 f8 48 c1 e8 03 <42>
> 80 3c 28 00 0f 85 37 06 00 00 49 8b 84 24 b0 02 00 00 4c 8d
> RIP: sock_sndtimeo include/net/sock.h:2162 [inline] RSP: ffff8801d7a27c38
> RIP: netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249 RSP:
> ffff8801d7a27c38
> ---[ end trace ad1bba9d457430b6 ]---
> Kernel panic - not syncing: Fatal exception
>
>
> This is not reproducible and seems to be caused by an elusive race.
> However, looking at the code I don't see any proper protection of
> audit_sock (other than the if (!audit_pid) which is obviously not
> enough to protect against races).

audit_cmd_mutex is supposed to protect it, I think.
But kauditd_send_unicast_skb() seems not holding this mutex.

Richard?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: netlink: GPF in netlink_unicast
  2017-03-06 18:10 ` Cong Wang
@ 2017-03-07  4:03   ` Richard Guy Briggs
  2017-03-07 14:29     ` Paul Moore
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Guy Briggs @ 2017-03-07  4:03 UTC (permalink / raw)
  To: Cong Wang
  Cc: Dmitry Vyukov, David Miller, Eric Dumazet, Herbert Xu,
	Alexei Starovoitov, netdev, LKML, syzkaller, pmoore, linux-audit

On 2017-03-06 10:10, Cong Wang wrote:
> On Mon, Mar 6, 2017 at 2:54 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> > Hello,
> >
> > I've got the following crash while running syzkaller fuzzer on
> > net-next/8d70eeb84ab277377c017af6a21d0a337025dede:
> >
> > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > general protection fault: 0000 [#1] SMP KASAN
> > Dumping ftrace buffer:
> >    (ftrace buffer empty)
> > Modules linked in:
> > CPU: 0 PID: 883 Comm: kauditd Not tainted 4.10.0+ #6
> > Hardware name: Google Google Compute Engine/Google Compute Engine,
> > BIOS Google 01/01/2011
> > task: ffff8801d79f0240 task.stack: ffff8801d7a20000
> > RIP: 0010:sock_sndtimeo include/net/sock.h:2162 [inline]
> > RIP: 0010:netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249
> > RSP: 0018:ffff8801d7a27c38 EFLAGS: 00010206
> > RAX: 0000000000000056 RBX: ffff8801d7a27cd0 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002b0
> > RBP: ffff8801d7a27cf8 R08: ffffed00385cf286 R09: ffffed00385cf286
> > R10: 0000000000000006 R11: ffffed00385cf285 R12: 0000000000000000
> > R13: dffffc0000000000 R14: ffff8801c2fc3c80 R15: 00000000014000c0
> > FS:  0000000000000000(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000020cfd000 CR3: 00000001c758f000 CR4: 00000000001406f0
> > Call Trace:
> >  kauditd_send_unicast_skb+0x3c/0x70 kernel/audit.c:482
> >  kauditd_thread+0x174/0xb00 kernel/audit.c:599
> >  kthread+0x326/0x3f0 kernel/kthread.c:229
> >  ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
> > Code: 44 89 fe e8 56 15 ff ff 8b 8d 70 ff ff ff 49 89 c6 31 c0 85 c9
> > 75 27 e8 b2 b2 f4 fd 49 8d bc 24 b0 02 00 00 48 89 f8 48 c1 e8 03 <42>
> > 80 3c 28 00 0f 85 37 06 00 00 49 8b 84 24 b0 02 00 00 4c 8d
> > RIP: sock_sndtimeo include/net/sock.h:2162 [inline] RSP: ffff8801d7a27c38
> > RIP: netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249 RSP:
> > ffff8801d7a27c38
> > ---[ end trace ad1bba9d457430b6 ]---
> > Kernel panic - not syncing: Fatal exception
> >
> >
> > This is not reproducible and seems to be caused by an elusive race.
> > However, looking at the code I don't see any proper protection of
> > audit_sock (other than the if (!audit_pid) which is obviously not
> > enough to protect against races).
> 
> audit_cmd_mutex is supposed to protect it, I think.
> But kauditd_send_unicast_skb() seems not holding this mutex.

Hmmmm, I wonder if it makes sense to wrap most of the contents of the
outer while loop in kauditd_thread in the audit_cmd_mutex, or around the
first two innter while loops and the "if (auditd)" condition after the
"quick_loop:" label.  The condition on auditd is supposed to catch that
case.  We don't want it locked while playing with the scheduler at the
bottom of that function.

> Richard?

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: netlink: GPF in netlink_unicast
  2017-03-07  4:03   ` Richard Guy Briggs
@ 2017-03-07 14:29     ` Paul Moore
  2017-03-07 15:55       ` Richard Guy Briggs
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Moore @ 2017-03-07 14:29 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: Cong Wang, Herbert Xu, netdev, Alexei Starovoitov, LKML,
	Eric Dumazet, syzkaller, linux-audit, David Miller, Dmitry Vyukov

On Mon, Mar 6, 2017 at 11:03 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
> On 2017-03-06 10:10, Cong Wang wrote:
>> On Mon, Mar 6, 2017 at 2:54 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> > Hello,
>> >
>> > I've got the following crash while running syzkaller fuzzer on
>> > net-next/8d70eeb84ab277377c017af6a21d0a337025dede:
>> >
>> > kasan: GPF could be caused by NULL-ptr deref or user memory access
>> > general protection fault: 0000 [#1] SMP KASAN
>> > Dumping ftrace buffer:
>> >    (ftrace buffer empty)
>> > Modules linked in:
>> > CPU: 0 PID: 883 Comm: kauditd Not tainted 4.10.0+ #6
>> > Hardware name: Google Google Compute Engine/Google Compute Engine,
>> > BIOS Google 01/01/2011
>> > task: ffff8801d79f0240 task.stack: ffff8801d7a20000
>> > RIP: 0010:sock_sndtimeo include/net/sock.h:2162 [inline]
>> > RIP: 0010:netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249
>> > RSP: 0018:ffff8801d7a27c38 EFLAGS: 00010206
>> > RAX: 0000000000000056 RBX: ffff8801d7a27cd0 RCX: 0000000000000000
>> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002b0
>> > RBP: ffff8801d7a27cf8 R08: ffffed00385cf286 R09: ffffed00385cf286
>> > R10: 0000000000000006 R11: ffffed00385cf285 R12: 0000000000000000
>> > R13: dffffc0000000000 R14: ffff8801c2fc3c80 R15: 00000000014000c0
>> > FS:  0000000000000000(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
>> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > CR2: 0000000020cfd000 CR3: 00000001c758f000 CR4: 00000000001406f0
>> > Call Trace:
>> >  kauditd_send_unicast_skb+0x3c/0x70 kernel/audit.c:482
>> >  kauditd_thread+0x174/0xb00 kernel/audit.c:599
>> >  kthread+0x326/0x3f0 kernel/kthread.c:229
>> >  ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
>> > Code: 44 89 fe e8 56 15 ff ff 8b 8d 70 ff ff ff 49 89 c6 31 c0 85 c9
>> > 75 27 e8 b2 b2 f4 fd 49 8d bc 24 b0 02 00 00 48 89 f8 48 c1 e8 03 <42>
>> > 80 3c 28 00 0f 85 37 06 00 00 49 8b 84 24 b0 02 00 00 4c 8d
>> > RIP: sock_sndtimeo include/net/sock.h:2162 [inline] RSP: ffff8801d7a27c38
>> > RIP: netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249 RSP:
>> > ffff8801d7a27c38
>> > ---[ end trace ad1bba9d457430b6 ]---
>> > Kernel panic - not syncing: Fatal exception
>> >
>> >
>> > This is not reproducible and seems to be caused by an elusive race.
>> > However, looking at the code I don't see any proper protection of
>> > audit_sock (other than the if (!audit_pid) which is obviously not
>> > enough to protect against races).
>>
>> audit_cmd_mutex is supposed to protect it, I think.
>> But kauditd_send_unicast_skb() seems not holding this mutex.
>
> Hmmmm, I wonder if it makes sense to wrap most of the contents of the
> outer while loop in kauditd_thread in the audit_cmd_mutex, or around the
> first two innter while loops and the "if (auditd)" condition after the
> "quick_loop:" label.  The condition on auditd is supposed to catch that
> case.  We don't want it locked while playing with the scheduler at the
> bottom of that function.

Let me look into this and play around with a few things.  I suspected
there might be a problem here, so I've got thoughts on how we might
resolve it; I just need to see code them up and see what option sucks
the least.

FWIW Richard, yes wrapping most of kauditd_thread *should* resolve
this but it's pretty heavy handed and not my first choice.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: netlink: GPF in netlink_unicast
  2017-03-07 14:29     ` Paul Moore
@ 2017-03-07 15:55       ` Richard Guy Briggs
  2017-03-07 18:44         ` Paul Moore
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Guy Briggs @ 2017-03-07 15:55 UTC (permalink / raw)
  To: Paul Moore
  Cc: Cong Wang, Herbert Xu, netdev, Alexei Starovoitov, LKML,
	Eric Dumazet, syzkaller, linux-audit, David Miller, Dmitry Vyukov

On 2017-03-07 09:29, Paul Moore wrote:
> On Mon, Mar 6, 2017 at 11:03 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > On 2017-03-06 10:10, Cong Wang wrote:
> >> On Mon, Mar 6, 2017 at 2:54 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> >> > Hello,
> >> >
> >> > I've got the following crash while running syzkaller fuzzer on
> >> > net-next/8d70eeb84ab277377c017af6a21d0a337025dede:
> >> >
> >> > kasan: GPF could be caused by NULL-ptr deref or user memory access
> >> > general protection fault: 0000 [#1] SMP KASAN
> >> > Dumping ftrace buffer:
> >> >    (ftrace buffer empty)
> >> > Modules linked in:
> >> > CPU: 0 PID: 883 Comm: kauditd Not tainted 4.10.0+ #6
> >> > Hardware name: Google Google Compute Engine/Google Compute Engine,
> >> > BIOS Google 01/01/2011
> >> > task: ffff8801d79f0240 task.stack: ffff8801d7a20000
> >> > RIP: 0010:sock_sndtimeo include/net/sock.h:2162 [inline]
> >> > RIP: 0010:netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249
> >> > RSP: 0018:ffff8801d7a27c38 EFLAGS: 00010206
> >> > RAX: 0000000000000056 RBX: ffff8801d7a27cd0 RCX: 0000000000000000
> >> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002b0
> >> > RBP: ffff8801d7a27cf8 R08: ffffed00385cf286 R09: ffffed00385cf286
> >> > R10: 0000000000000006 R11: ffffed00385cf285 R12: 0000000000000000
> >> > R13: dffffc0000000000 R14: ffff8801c2fc3c80 R15: 00000000014000c0
> >> > FS:  0000000000000000(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
> >> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> > CR2: 0000000020cfd000 CR3: 00000001c758f000 CR4: 00000000001406f0
> >> > Call Trace:
> >> >  kauditd_send_unicast_skb+0x3c/0x70 kernel/audit.c:482
> >> >  kauditd_thread+0x174/0xb00 kernel/audit.c:599
> >> >  kthread+0x326/0x3f0 kernel/kthread.c:229
> >> >  ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
> >> > Code: 44 89 fe e8 56 15 ff ff 8b 8d 70 ff ff ff 49 89 c6 31 c0 85 c9
> >> > 75 27 e8 b2 b2 f4 fd 49 8d bc 24 b0 02 00 00 48 89 f8 48 c1 e8 03 <42>
> >> > 80 3c 28 00 0f 85 37 06 00 00 49 8b 84 24 b0 02 00 00 4c 8d
> >> > RIP: sock_sndtimeo include/net/sock.h:2162 [inline] RSP: ffff8801d7a27c38
> >> > RIP: netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249 RSP:
> >> > ffff8801d7a27c38
> >> > ---[ end trace ad1bba9d457430b6 ]---
> >> > Kernel panic - not syncing: Fatal exception
> >> >
> >> >
> >> > This is not reproducible and seems to be caused by an elusive race.
> >> > However, looking at the code I don't see any proper protection of
> >> > audit_sock (other than the if (!audit_pid) which is obviously not
> >> > enough to protect against races).
> >>
> >> audit_cmd_mutex is supposed to protect it, I think.
> >> But kauditd_send_unicast_skb() seems not holding this mutex.
> >
> > Hmmmm, I wonder if it makes sense to wrap most of the contents of the
> > outer while loop in kauditd_thread in the audit_cmd_mutex, or around the
> > first two innter while loops and the "if (auditd)" condition after the
> > "quick_loop:" label.  The condition on auditd is supposed to catch that
> > case.  We don't want it locked while playing with the scheduler at the
> > bottom of that function.
> 
> Let me look into this and play around with a few things.  I suspected
> there might be a problem here, so I've got thoughts on how we might
> resolve it; I just need to see code them up and see what option sucks
> the least.
> 
> FWIW Richard, yes wrapping most of kauditd_thread *should* resolve
> this but it's pretty heavy handed and not my first choice.

That's why the inner loops made a bit more sense since it wasn't really
necessary and ran afoul of the scheduler anyways.

> paul moore

- RGB

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: netlink: GPF in netlink_unicast
  2017-03-07 15:55       ` Richard Guy Briggs
@ 2017-03-07 18:44         ` Paul Moore
  2017-03-07 19:23           ` Paul Moore
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Moore @ 2017-03-07 18:44 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: Cong Wang, Herbert Xu, netdev, Alexei Starovoitov, LKML,
	Eric Dumazet, syzkaller, linux-audit, David Miller, Dmitry Vyukov

On Tue, Mar 7, 2017 at 10:55 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> On 2017-03-07 09:29, Paul Moore wrote:
>> On Mon, Mar 6, 2017 at 11:03 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> > On 2017-03-06 10:10, Cong Wang wrote:
>> >> On Mon, Mar 6, 2017 at 2:54 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> >> > Hello,
>> >> >
>> >> > I've got the following crash while running syzkaller fuzzer on
>> >> > net-next/8d70eeb84ab277377c017af6a21d0a337025dede:
>> >> >
>> >> > kasan: GPF could be caused by NULL-ptr deref or user memory access
>> >> > general protection fault: 0000 [#1] SMP KASAN
>> >> > Dumping ftrace buffer:
>> >> >    (ftrace buffer empty)
>> >> > Modules linked in:
>> >> > CPU: 0 PID: 883 Comm: kauditd Not tainted 4.10.0+ #6
>> >> > Hardware name: Google Google Compute Engine/Google Compute Engine,
>> >> > BIOS Google 01/01/2011
>> >> > task: ffff8801d79f0240 task.stack: ffff8801d7a20000
>> >> > RIP: 0010:sock_sndtimeo include/net/sock.h:2162 [inline]
>> >> > RIP: 0010:netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249
>> >> > RSP: 0018:ffff8801d7a27c38 EFLAGS: 00010206
>> >> > RAX: 0000000000000056 RBX: ffff8801d7a27cd0 RCX: 0000000000000000
>> >> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002b0
>> >> > RBP: ffff8801d7a27cf8 R08: ffffed00385cf286 R09: ffffed00385cf286
>> >> > R10: 0000000000000006 R11: ffffed00385cf285 R12: 0000000000000000
>> >> > R13: dffffc0000000000 R14: ffff8801c2fc3c80 R15: 00000000014000c0
>> >> > FS:  0000000000000000(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
>> >> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> >> > CR2: 0000000020cfd000 CR3: 00000001c758f000 CR4: 00000000001406f0
>> >> > Call Trace:
>> >> >  kauditd_send_unicast_skb+0x3c/0x70 kernel/audit.c:482
>> >> >  kauditd_thread+0x174/0xb00 kernel/audit.c:599
>> >> >  kthread+0x326/0x3f0 kernel/kthread.c:229
>> >> >  ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
>> >> > Code: 44 89 fe e8 56 15 ff ff 8b 8d 70 ff ff ff 49 89 c6 31 c0 85 c9
>> >> > 75 27 e8 b2 b2 f4 fd 49 8d bc 24 b0 02 00 00 48 89 f8 48 c1 e8 03 <42>
>> >> > 80 3c 28 00 0f 85 37 06 00 00 49 8b 84 24 b0 02 00 00 4c 8d
>> >> > RIP: sock_sndtimeo include/net/sock.h:2162 [inline] RSP: ffff8801d7a27c38
>> >> > RIP: netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249 RSP:
>> >> > ffff8801d7a27c38
>> >> > ---[ end trace ad1bba9d457430b6 ]---
>> >> > Kernel panic - not syncing: Fatal exception
>> >> >
>> >> >
>> >> > This is not reproducible and seems to be caused by an elusive race.
>> >> > However, looking at the code I don't see any proper protection of
>> >> > audit_sock (other than the if (!audit_pid) which is obviously not
>> >> > enough to protect against races).
>> >>
>> >> audit_cmd_mutex is supposed to protect it, I think.
>> >> But kauditd_send_unicast_skb() seems not holding this mutex.
>> >
>> > Hmmmm, I wonder if it makes sense to wrap most of the contents of the
>> > outer while loop in kauditd_thread in the audit_cmd_mutex, or around the
>> > first two innter while loops and the "if (auditd)" condition after the
>> > "quick_loop:" label.  The condition on auditd is supposed to catch that
>> > case.  We don't want it locked while playing with the scheduler at the
>> > bottom of that function.
>>
>> Let me look into this and play around with a few things.  I suspected
>> there might be a problem here, so I've got thoughts on how we might
>> resolve it; I just need to see code them up and see what option sucks
>> the least.
>>
>> FWIW Richard, yes wrapping most of kauditd_thread *should* resolve
>> this but it's pretty heavy handed and not my first choice.
>
> That's why the inner loops made a bit more sense since it wasn't really
> necessary and ran afoul of the scheduler anyways.

One of my preferred options was to get us away from protecting
everything with the audit_cmd_mutex by creating a new locking approach
for the auditd connection state (using RCU/spinlocks since it rarely
changes in practice) and leaving the audit_cmd_mutex for it's
traditional role.  This should minimize the performance impact of the
lock and clean things up a bit.  I'm also moving all the auditd
connection state into a single struct (instead of several variables
associated only by convention) which moves us oh so slightly closer to
allowing multiple auditd connections (hey, it's something).

It's taking a bit longer than expected as I'm dealing with a bit of a
head cold (or something) and my mind is far less than 100% at the
moment ...

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: netlink: GPF in netlink_unicast
  2017-03-07 18:44         ` Paul Moore
@ 2017-03-07 19:23           ` Paul Moore
  2017-03-08 13:25             ` Richard Guy Briggs
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Moore @ 2017-03-07 19:23 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: Cong Wang, Herbert Xu, netdev, Alexei Starovoitov, LKML,
	Eric Dumazet, syzkaller, linux-audit, David Miller, Dmitry Vyukov

On Tue, Mar 7, 2017 at 1:44 PM, Paul Moore <paul@paul-moore.com> wrote:
> On Tue, Mar 7, 2017 at 10:55 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> On 2017-03-07 09:29, Paul Moore wrote:
>>> On Mon, Mar 6, 2017 at 11:03 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
>>> > On 2017-03-06 10:10, Cong Wang wrote:
>>> >> On Mon, Mar 6, 2017 at 2:54 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> >> > Hello,
>>> >> >
>>> >> > I've got the following crash while running syzkaller fuzzer on
>>> >> > net-next/8d70eeb84ab277377c017af6a21d0a337025dede:
>>> >> >
>>> >> > kasan: GPF could be caused by NULL-ptr deref or user memory access
>>> >> > general protection fault: 0000 [#1] SMP KASAN
>>> >> > Dumping ftrace buffer:
>>> >> >    (ftrace buffer empty)
>>> >> > Modules linked in:
>>> >> > CPU: 0 PID: 883 Comm: kauditd Not tainted 4.10.0+ #6
>>> >> > Hardware name: Google Google Compute Engine/Google Compute Engine,
>>> >> > BIOS Google 01/01/2011
>>> >> > task: ffff8801d79f0240 task.stack: ffff8801d7a20000
>>> >> > RIP: 0010:sock_sndtimeo include/net/sock.h:2162 [inline]
>>> >> > RIP: 0010:netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249
>>> >> > RSP: 0018:ffff8801d7a27c38 EFLAGS: 00010206
>>> >> > RAX: 0000000000000056 RBX: ffff8801d7a27cd0 RCX: 0000000000000000
>>> >> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002b0
>>> >> > RBP: ffff8801d7a27cf8 R08: ffffed00385cf286 R09: ffffed00385cf286
>>> >> > R10: 0000000000000006 R11: ffffed00385cf285 R12: 0000000000000000
>>> >> > R13: dffffc0000000000 R14: ffff8801c2fc3c80 R15: 00000000014000c0
>>> >> > FS:  0000000000000000(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
>>> >> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> >> > CR2: 0000000020cfd000 CR3: 00000001c758f000 CR4: 00000000001406f0
>>> >> > Call Trace:
>>> >> >  kauditd_send_unicast_skb+0x3c/0x70 kernel/audit.c:482
>>> >> >  kauditd_thread+0x174/0xb00 kernel/audit.c:599
>>> >> >  kthread+0x326/0x3f0 kernel/kthread.c:229
>>> >> >  ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
>>> >> > Code: 44 89 fe e8 56 15 ff ff 8b 8d 70 ff ff ff 49 89 c6 31 c0 85 c9
>>> >> > 75 27 e8 b2 b2 f4 fd 49 8d bc 24 b0 02 00 00 48 89 f8 48 c1 e8 03 <42>
>>> >> > 80 3c 28 00 0f 85 37 06 00 00 49 8b 84 24 b0 02 00 00 4c 8d
>>> >> > RIP: sock_sndtimeo include/net/sock.h:2162 [inline] RSP: ffff8801d7a27c38
>>> >> > RIP: netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249 RSP:
>>> >> > ffff8801d7a27c38
>>> >> > ---[ end trace ad1bba9d457430b6 ]---
>>> >> > Kernel panic - not syncing: Fatal exception
>>> >> >
>>> >> >
>>> >> > This is not reproducible and seems to be caused by an elusive race.
>>> >> > However, looking at the code I don't see any proper protection of
>>> >> > audit_sock (other than the if (!audit_pid) which is obviously not
>>> >> > enough to protect against races).
>>> >>
>>> >> audit_cmd_mutex is supposed to protect it, I think.
>>> >> But kauditd_send_unicast_skb() seems not holding this mutex.
>>> >
>>> > Hmmmm, I wonder if it makes sense to wrap most of the contents of the
>>> > outer while loop in kauditd_thread in the audit_cmd_mutex, or around the
>>> > first two innter while loops and the "if (auditd)" condition after the
>>> > "quick_loop:" label.  The condition on auditd is supposed to catch that
>>> > case.  We don't want it locked while playing with the scheduler at the
>>> > bottom of that function.
>>>
>>> Let me look into this and play around with a few things.  I suspected
>>> there might be a problem here, so I've got thoughts on how we might
>>> resolve it; I just need to see code them up and see what option sucks
>>> the least.
>>>
>>> FWIW Richard, yes wrapping most of kauditd_thread *should* resolve
>>> this but it's pretty heavy handed and not my first choice.
>>
>> That's why the inner loops made a bit more sense since it wasn't really
>> necessary and ran afoul of the scheduler anyways.
>
> One of my preferred options was to get us away from protecting
> everything with the audit_cmd_mutex by creating a new locking approach
> for the auditd connection state (using RCU/spinlocks since it rarely
> changes in practice) and leaving the audit_cmd_mutex for it's
> traditional role.  This should minimize the performance impact of the
> lock and clean things up a bit.  I'm also moving all the auditd
> connection state into a single struct (instead of several variables
> associated only by convention) which moves us oh so slightly closer to
> allowing multiple auditd connections (hey, it's something).
>
> It's taking a bit longer than expected as I'm dealing with a bit of a
> head cold (or something) and my mind is far less than 100% at the
> moment ...

Ooof.  I just noticed something, and maybe this is the fever talking,
but why do we ever NULL out audit_sock and why are we bothering with
those holds/puts?  We create the audit netlink socket in
audit_net_init() and it should remain valid until we kill it in
audit_next_exit(); we sorta cheat on this now because we track the
socket both in the per-netns audit_net struct as well as audit_sock,
but that doesn't make our audit_sock manipulations right ...

Man I hate this code.  I *really* hate this code.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: netlink: GPF in netlink_unicast
  2017-03-07 19:23           ` Paul Moore
@ 2017-03-08 13:25             ` Richard Guy Briggs
  2017-03-08 13:51               ` Richard Guy Briggs
  2017-03-08 13:51               ` Paul Moore
  0 siblings, 2 replies; 10+ messages in thread
From: Richard Guy Briggs @ 2017-03-08 13:25 UTC (permalink / raw)
  To: Paul Moore
  Cc: Cong Wang, Herbert Xu, netdev, Alexei Starovoitov, LKML,
	Eric Dumazet, syzkaller, linux-audit, David Miller, Dmitry Vyukov

On 2017-03-07 14:23, Paul Moore wrote:
> On Tue, Mar 7, 2017 at 1:44 PM, Paul Moore <paul@paul-moore.com> wrote:
> > On Tue, Mar 7, 2017 at 10:55 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> >> On 2017-03-07 09:29, Paul Moore wrote:
> >>> On Mon, Mar 6, 2017 at 11:03 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
> >>> > On 2017-03-06 10:10, Cong Wang wrote:
> >>> >> On Mon, Mar 6, 2017 at 2:54 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> >>> >> > Hello,
> >>> >> >
> >>> >> > I've got the following crash while running syzkaller fuzzer on
> >>> >> > net-next/8d70eeb84ab277377c017af6a21d0a337025dede:
> >>> >> >
> >>> >> > kasan: GPF could be caused by NULL-ptr deref or user memory access
> >>> >> > general protection fault: 0000 [#1] SMP KASAN
> >>> >> > Dumping ftrace buffer:
> >>> >> >    (ftrace buffer empty)
> >>> >> > Modules linked in:
> >>> >> > CPU: 0 PID: 883 Comm: kauditd Not tainted 4.10.0+ #6
> >>> >> > Hardware name: Google Google Compute Engine/Google Compute Engine,
> >>> >> > BIOS Google 01/01/2011
> >>> >> > task: ffff8801d79f0240 task.stack: ffff8801d7a20000
> >>> >> > RIP: 0010:sock_sndtimeo include/net/sock.h:2162 [inline]
> >>> >> > RIP: 0010:netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249
> >>> >> > RSP: 0018:ffff8801d7a27c38 EFLAGS: 00010206
> >>> >> > RAX: 0000000000000056 RBX: ffff8801d7a27cd0 RCX: 0000000000000000
> >>> >> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002b0
> >>> >> > RBP: ffff8801d7a27cf8 R08: ffffed00385cf286 R09: ffffed00385cf286
> >>> >> > R10: 0000000000000006 R11: ffffed00385cf285 R12: 0000000000000000
> >>> >> > R13: dffffc0000000000 R14: ffff8801c2fc3c80 R15: 00000000014000c0
> >>> >> > FS:  0000000000000000(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
> >>> >> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>> >> > CR2: 0000000020cfd000 CR3: 00000001c758f000 CR4: 00000000001406f0
> >>> >> > Call Trace:
> >>> >> >  kauditd_send_unicast_skb+0x3c/0x70 kernel/audit.c:482
> >>> >> >  kauditd_thread+0x174/0xb00 kernel/audit.c:599
> >>> >> >  kthread+0x326/0x3f0 kernel/kthread.c:229
> >>> >> >  ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
> >>> >> > Code: 44 89 fe e8 56 15 ff ff 8b 8d 70 ff ff ff 49 89 c6 31 c0 85 c9
> >>> >> > 75 27 e8 b2 b2 f4 fd 49 8d bc 24 b0 02 00 00 48 89 f8 48 c1 e8 03 <42>
> >>> >> > 80 3c 28 00 0f 85 37 06 00 00 49 8b 84 24 b0 02 00 00 4c 8d
> >>> >> > RIP: sock_sndtimeo include/net/sock.h:2162 [inline] RSP: ffff8801d7a27c38
> >>> >> > RIP: netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249 RSP:
> >>> >> > ffff8801d7a27c38
> >>> >> > ---[ end trace ad1bba9d457430b6 ]---
> >>> >> > Kernel panic - not syncing: Fatal exception
> >>> >> >
> >>> >> >
> >>> >> > This is not reproducible and seems to be caused by an elusive race.
> >>> >> > However, looking at the code I don't see any proper protection of
> >>> >> > audit_sock (other than the if (!audit_pid) which is obviously not
> >>> >> > enough to protect against races).
> >>> >>
> >>> >> audit_cmd_mutex is supposed to protect it, I think.
> >>> >> But kauditd_send_unicast_skb() seems not holding this mutex.
> >>> >
> >>> > Hmmmm, I wonder if it makes sense to wrap most of the contents of the
> >>> > outer while loop in kauditd_thread in the audit_cmd_mutex, or around the
> >>> > first two innter while loops and the "if (auditd)" condition after the
> >>> > "quick_loop:" label.  The condition on auditd is supposed to catch that
> >>> > case.  We don't want it locked while playing with the scheduler at the
> >>> > bottom of that function.
> >>>
> >>> Let me look into this and play around with a few things.  I suspected
> >>> there might be a problem here, so I've got thoughts on how we might
> >>> resolve it; I just need to see code them up and see what option sucks
> >>> the least.
> >>>
> >>> FWIW Richard, yes wrapping most of kauditd_thread *should* resolve
> >>> this but it's pretty heavy handed and not my first choice.
> >>
> >> That's why the inner loops made a bit more sense since it wasn't really
> >> necessary and ran afoul of the scheduler anyways.
> >
> > One of my preferred options was to get us away from protecting
> > everything with the audit_cmd_mutex by creating a new locking approach
> > for the auditd connection state (using RCU/spinlocks since it rarely
> > changes in practice) and leaving the audit_cmd_mutex for it's
> > traditional role.  This should minimize the performance impact of the
> > lock and clean things up a bit.  I'm also moving all the auditd
> > connection state into a single struct (instead of several variables
> > associated only by convention) which moves us oh so slightly closer to
> > allowing multiple auditd connections (hey, it's something).
> >
> > It's taking a bit longer than expected as I'm dealing with a bit of a
> > head cold (or something) and my mind is far less than 100% at the
> > moment ...
> 
> Ooof.  I just noticed something, and maybe this is the fever talking,
> but why do we ever NULL out audit_sock and why are we bothering with
> those holds/puts?  We create the audit netlink socket in
> audit_net_init() and it should remain valid until we kill it in
> audit_next_exit(); we sorta cheat on this now because we track the
> socket both in the per-netns audit_net struct as well as audit_sock,
> but that doesn't make our audit_sock manipulations right ...

At the moment, you are right, there is no reason to null audit_sock, and
not like auditd will appear on a different sock yet.

The only excuse I can give is that this was anticipating audit daemons
in more than one user namespace necessarily with their own network
namespaces.  The AUDIT_GET, AUDIT_LIST_RULES commands are treated
properly since they use the per-netns audit_net struct and don't use the
primary queue.  The AUDIT_USER_* messages are converted from their
originating namespaces ok, but will need to be tracked what network
namespace they came from for multiple audit daemons in the future.

> Man I hate this code.  I *really* hate this code.
> 
> paul moore

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: netlink: GPF in netlink_unicast
  2017-03-08 13:25             ` Richard Guy Briggs
@ 2017-03-08 13:51               ` Richard Guy Briggs
  2017-03-08 13:51               ` Paul Moore
  1 sibling, 0 replies; 10+ messages in thread
From: Richard Guy Briggs @ 2017-03-08 13:51 UTC (permalink / raw)
  To: Paul Moore
  Cc: Cong Wang, Herbert Xu, netdev, Alexei Starovoitov, LKML,
	Eric Dumazet, syzkaller, linux-audit, David Miller, Dmitry Vyukov

On 2017-03-08 08:25, Richard Guy Briggs wrote:
> On 2017-03-07 14:23, Paul Moore wrote:
> > On Tue, Mar 7, 2017 at 1:44 PM, Paul Moore <paul@paul-moore.com> wrote:
> > > On Tue, Mar 7, 2017 at 10:55 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > >> On 2017-03-07 09:29, Paul Moore wrote:
> > >>> On Mon, Mar 6, 2017 at 11:03 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > >>> > On 2017-03-06 10:10, Cong Wang wrote:
> > >>> >> On Mon, Mar 6, 2017 at 2:54 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> > >>> >> > Hello,
> > >>> >> >
> > >>> >> > I've got the following crash while running syzkaller fuzzer on
> > >>> >> > net-next/8d70eeb84ab277377c017af6a21d0a337025dede:
> > >>> >> >
> > >>> >> > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > >>> >> > general protection fault: 0000 [#1] SMP KASAN
> > >>> >> > Dumping ftrace buffer:
> > >>> >> >    (ftrace buffer empty)
> > >>> >> > Modules linked in:
> > >>> >> > CPU: 0 PID: 883 Comm: kauditd Not tainted 4.10.0+ #6
> > >>> >> > Hardware name: Google Google Compute Engine/Google Compute Engine,
> > >>> >> > BIOS Google 01/01/2011
> > >>> >> > task: ffff8801d79f0240 task.stack: ffff8801d7a20000
> > >>> >> > RIP: 0010:sock_sndtimeo include/net/sock.h:2162 [inline]
> > >>> >> > RIP: 0010:netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249
> > >>> >> > RSP: 0018:ffff8801d7a27c38 EFLAGS: 00010206
> > >>> >> > RAX: 0000000000000056 RBX: ffff8801d7a27cd0 RCX: 0000000000000000
> > >>> >> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002b0
> > >>> >> > RBP: ffff8801d7a27cf8 R08: ffffed00385cf286 R09: ffffed00385cf286
> > >>> >> > R10: 0000000000000006 R11: ffffed00385cf285 R12: 0000000000000000
> > >>> >> > R13: dffffc0000000000 R14: ffff8801c2fc3c80 R15: 00000000014000c0
> > >>> >> > FS:  0000000000000000(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
> > >>> >> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >>> >> > CR2: 0000000020cfd000 CR3: 00000001c758f000 CR4: 00000000001406f0
> > >>> >> > Call Trace:
> > >>> >> >  kauditd_send_unicast_skb+0x3c/0x70 kernel/audit.c:482
> > >>> >> >  kauditd_thread+0x174/0xb00 kernel/audit.c:599
> > >>> >> >  kthread+0x326/0x3f0 kernel/kthread.c:229
> > >>> >> >  ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
> > >>> >> > Code: 44 89 fe e8 56 15 ff ff 8b 8d 70 ff ff ff 49 89 c6 31 c0 85 c9
> > >>> >> > 75 27 e8 b2 b2 f4 fd 49 8d bc 24 b0 02 00 00 48 89 f8 48 c1 e8 03 <42>
> > >>> >> > 80 3c 28 00 0f 85 37 06 00 00 49 8b 84 24 b0 02 00 00 4c 8d
> > >>> >> > RIP: sock_sndtimeo include/net/sock.h:2162 [inline] RSP: ffff8801d7a27c38
> > >>> >> > RIP: netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249 RSP:
> > >>> >> > ffff8801d7a27c38
> > >>> >> > ---[ end trace ad1bba9d457430b6 ]---
> > >>> >> > Kernel panic - not syncing: Fatal exception
> > >>> >> >
> > >>> >> >
> > >>> >> > This is not reproducible and seems to be caused by an elusive race.
> > >>> >> > However, looking at the code I don't see any proper protection of
> > >>> >> > audit_sock (other than the if (!audit_pid) which is obviously not
> > >>> >> > enough to protect against races).
> > >>> >>
> > >>> >> audit_cmd_mutex is supposed to protect it, I think.
> > >>> >> But kauditd_send_unicast_skb() seems not holding this mutex.
> > >>> >
> > >>> > Hmmmm, I wonder if it makes sense to wrap most of the contents of the
> > >>> > outer while loop in kauditd_thread in the audit_cmd_mutex, or around the
> > >>> > first two innter while loops and the "if (auditd)" condition after the
> > >>> > "quick_loop:" label.  The condition on auditd is supposed to catch that
> > >>> > case.  We don't want it locked while playing with the scheduler at the
> > >>> > bottom of that function.
> > >>>
> > >>> Let me look into this and play around with a few things.  I suspected
> > >>> there might be a problem here, so I've got thoughts on how we might
> > >>> resolve it; I just need to see code them up and see what option sucks
> > >>> the least.
> > >>>
> > >>> FWIW Richard, yes wrapping most of kauditd_thread *should* resolve
> > >>> this but it's pretty heavy handed and not my first choice.
> > >>
> > >> That's why the inner loops made a bit more sense since it wasn't really
> > >> necessary and ran afoul of the scheduler anyways.
> > >
> > > One of my preferred options was to get us away from protecting
> > > everything with the audit_cmd_mutex by creating a new locking approach
> > > for the auditd connection state (using RCU/spinlocks since it rarely
> > > changes in practice) and leaving the audit_cmd_mutex for it's
> > > traditional role.  This should minimize the performance impact of the
> > > lock and clean things up a bit.  I'm also moving all the auditd
> > > connection state into a single struct (instead of several variables
> > > associated only by convention) which moves us oh so slightly closer to
> > > allowing multiple auditd connections (hey, it's something).
> > >
> > > It's taking a bit longer than expected as I'm dealing with a bit of a
> > > head cold (or something) and my mind is far less than 100% at the
> > > moment ...
> > 
> > Ooof.  I just noticed something, and maybe this is the fever talking,
> > but why do we ever NULL out audit_sock and why are we bothering with
> > those holds/puts?  We create the audit netlink socket in
> > audit_net_init() and it should remain valid until we kill it in
> > audit_next_exit(); we sorta cheat on this now because we track the
> > socket both in the per-netns audit_net struct as well as audit_sock,
> > but that doesn't make our audit_sock manipulations right ...
> 
> At the moment, you are right, there is no reason to null audit_sock, and
> not like auditd will appear on a different sock yet.

Ok, I pushed send too fast and didn't think this through enough.
Currently, the audit daemon *could* re-appear on a different socket.
While it is still in the same user and pid namespace, it could be
started from a different network namespace and it will set audit_sock to
the socket from that network namespace.

> The only excuse I can give is that this was anticipating audit daemons
> in more than one user namespace necessarily with their own network
> namespaces.  The AUDIT_GET, AUDIT_LIST_RULES commands are treated
> properly since they use the per-netns audit_net struct and don't use the
> primary queue.  The AUDIT_USER_* messages are converted from their
> originating namespaces ok, but will need to be tracked what network
> namespace they came from for multiple audit daemons in the future.
> 
> > Man I hate this code.  I *really* hate this code.
> > 
> > paul moore
> 
> - RGB
> 
> --
> Richard Guy Briggs <rgb@redhat.com>
> Kernel Security Engineering, Base Operating Systems, Red Hat
> Remote, Ottawa, Canada
> Voice: +1.647.777.2635, Internal: (81) 32635

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: netlink: GPF in netlink_unicast
  2017-03-08 13:25             ` Richard Guy Briggs
  2017-03-08 13:51               ` Richard Guy Briggs
@ 2017-03-08 13:51               ` Paul Moore
  1 sibling, 0 replies; 10+ messages in thread
From: Paul Moore @ 2017-03-08 13:51 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: Cong Wang, Herbert Xu, netdev, Alexei Starovoitov, LKML,
	Eric Dumazet, syzkaller, linux-audit, David Miller, Dmitry Vyukov

On Wed, Mar 8, 2017 at 8:25 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> On 2017-03-07 14:23, Paul Moore wrote:
>> On Tue, Mar 7, 2017 at 1:44 PM, Paul Moore <paul@paul-moore.com> wrote:
>> > On Tue, Mar 7, 2017 at 10:55 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> >> On 2017-03-07 09:29, Paul Moore wrote:
>> >>> On Mon, Mar 6, 2017 at 11:03 PM, Richard Guy Briggs <rgb@redhat.com> wrote:
>> >>> > On 2017-03-06 10:10, Cong Wang wrote:
>> >>> >> On Mon, Mar 6, 2017 at 2:54 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> >>> >> > Hello,
>> >>> >> >
>> >>> >> > I've got the following crash while running syzkaller fuzzer on
>> >>> >> > net-next/8d70eeb84ab277377c017af6a21d0a337025dede:
>> >>> >> >
>> >>> >> > kasan: GPF could be caused by NULL-ptr deref or user memory access
>> >>> >> > general protection fault: 0000 [#1] SMP KASAN
>> >>> >> > Dumping ftrace buffer:
>> >>> >> >    (ftrace buffer empty)
>> >>> >> > Modules linked in:
>> >>> >> > CPU: 0 PID: 883 Comm: kauditd Not tainted 4.10.0+ #6
>> >>> >> > Hardware name: Google Google Compute Engine/Google Compute Engine,
>> >>> >> > BIOS Google 01/01/2011
>> >>> >> > task: ffff8801d79f0240 task.stack: ffff8801d7a20000
>> >>> >> > RIP: 0010:sock_sndtimeo include/net/sock.h:2162 [inline]
>> >>> >> > RIP: 0010:netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249
>> >>> >> > RSP: 0018:ffff8801d7a27c38 EFLAGS: 00010206
>> >>> >> > RAX: 0000000000000056 RBX: ffff8801d7a27cd0 RCX: 0000000000000000
>> >>> >> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002b0
>> >>> >> > RBP: ffff8801d7a27cf8 R08: ffffed00385cf286 R09: ffffed00385cf286
>> >>> >> > R10: 0000000000000006 R11: ffffed00385cf285 R12: 0000000000000000
>> >>> >> > R13: dffffc0000000000 R14: ffff8801c2fc3c80 R15: 00000000014000c0
>> >>> >> > FS:  0000000000000000(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
>> >>> >> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> >>> >> > CR2: 0000000020cfd000 CR3: 00000001c758f000 CR4: 00000000001406f0
>> >>> >> > Call Trace:
>> >>> >> >  kauditd_send_unicast_skb+0x3c/0x70 kernel/audit.c:482
>> >>> >> >  kauditd_thread+0x174/0xb00 kernel/audit.c:599
>> >>> >> >  kthread+0x326/0x3f0 kernel/kthread.c:229
>> >>> >> >  ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
>> >>> >> > Code: 44 89 fe e8 56 15 ff ff 8b 8d 70 ff ff ff 49 89 c6 31 c0 85 c9
>> >>> >> > 75 27 e8 b2 b2 f4 fd 49 8d bc 24 b0 02 00 00 48 89 f8 48 c1 e8 03 <42>
>> >>> >> > 80 3c 28 00 0f 85 37 06 00 00 49 8b 84 24 b0 02 00 00 4c 8d
>> >>> >> > RIP: sock_sndtimeo include/net/sock.h:2162 [inline] RSP: ffff8801d7a27c38
>> >>> >> > RIP: netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249 RSP:
>> >>> >> > ffff8801d7a27c38
>> >>> >> > ---[ end trace ad1bba9d457430b6 ]---
>> >>> >> > Kernel panic - not syncing: Fatal exception
>> >>> >> >
>> >>> >> >
>> >>> >> > This is not reproducible and seems to be caused by an elusive race.
>> >>> >> > However, looking at the code I don't see any proper protection of
>> >>> >> > audit_sock (other than the if (!audit_pid) which is obviously not
>> >>> >> > enough to protect against races).
>> >>> >>
>> >>> >> audit_cmd_mutex is supposed to protect it, I think.
>> >>> >> But kauditd_send_unicast_skb() seems not holding this mutex.
>> >>> >
>> >>> > Hmmmm, I wonder if it makes sense to wrap most of the contents of the
>> >>> > outer while loop in kauditd_thread in the audit_cmd_mutex, or around the
>> >>> > first two innter while loops and the "if (auditd)" condition after the
>> >>> > "quick_loop:" label.  The condition on auditd is supposed to catch that
>> >>> > case.  We don't want it locked while playing with the scheduler at the
>> >>> > bottom of that function.
>> >>>
>> >>> Let me look into this and play around with a few things.  I suspected
>> >>> there might be a problem here, so I've got thoughts on how we might
>> >>> resolve it; I just need to see code them up and see what option sucks
>> >>> the least.
>> >>>
>> >>> FWIW Richard, yes wrapping most of kauditd_thread *should* resolve
>> >>> this but it's pretty heavy handed and not my first choice.
>> >>
>> >> That's why the inner loops made a bit more sense since it wasn't really
>> >> necessary and ran afoul of the scheduler anyways.
>> >
>> > One of my preferred options was to get us away from protecting
>> > everything with the audit_cmd_mutex by creating a new locking approach
>> > for the auditd connection state (using RCU/spinlocks since it rarely
>> > changes in practice) and leaving the audit_cmd_mutex for it's
>> > traditional role.  This should minimize the performance impact of the
>> > lock and clean things up a bit.  I'm also moving all the auditd
>> > connection state into a single struct (instead of several variables
>> > associated only by convention) which moves us oh so slightly closer to
>> > allowing multiple auditd connections (hey, it's something).
>> >
>> > It's taking a bit longer than expected as I'm dealing with a bit of a
>> > head cold (or something) and my mind is far less than 100% at the
>> > moment ...
>>
>> Ooof.  I just noticed something, and maybe this is the fever talking,
>> but why do we ever NULL out audit_sock and why are we bothering with
>> those holds/puts?  We create the audit netlink socket in
>> audit_net_init() and it should remain valid until we kill it in
>> audit_next_exit(); we sorta cheat on this now because we track the
>> socket both in the per-netns audit_net struct as well as audit_sock,
>> but that doesn't make our audit_sock manipulations right ...
>
> At the moment, you are right, there is no reason to null audit_sock, and
> not like auditd will appear on a different sock yet.
>
> The only excuse I can give is that this was anticipating audit daemons
> in more than one user namespace necessarily with their own network
> namespaces.

It doesn't really matter at this point, but that argument still
doesn't make sense.  Regardless of how many audit daemons we support
on the system, there is no reason to NULLify the audit netlink socket
or get additional reference counts.  The audit_sock variable should
disappear and we should simply use the sock we create in the in the
network namespace callback; when an audit daemon connects to the
kernel we should simply record its network namespace and go from
there.

I'm testing an initial patchset that does this today, I hope to post
an initial RFC sometime this week.

Also, to clarify things a bit, my questions above were a bit
rhetorical, and mostly me venting some frustration and trying to
explain that there wasn't going to be an immediate, or small, fix for
this problem.  In general, don't feel you need to make excuses for
audit; it's full of problems, we all know this, and I'm more concerned
with fixing them than I am with trying to assign blame.  Assigning
blame doesn't fix anything, it just upsets people.

> The AUDIT_GET, AUDIT_LIST_RULES commands are treated
> properly since they use the per-netns audit_net struct and don't use the
> primary queue.  The AUDIT_USER_* messages are converted from their
> originating namespaces ok, but will need to be tracked what network
> namespace they came from for multiple audit daemons in the future.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-03-08 13:51 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-06 10:54 netlink: GPF in netlink_unicast Dmitry Vyukov
2017-03-06 18:10 ` Cong Wang
2017-03-07  4:03   ` Richard Guy Briggs
2017-03-07 14:29     ` Paul Moore
2017-03-07 15:55       ` Richard Guy Briggs
2017-03-07 18:44         ` Paul Moore
2017-03-07 19:23           ` Paul Moore
2017-03-08 13:25             ` Richard Guy Briggs
2017-03-08 13:51               ` Richard Guy Briggs
2017-03-08 13:51               ` Paul Moore

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).