* NULL pointer deref, selinux_socket_unix_may_send+0x34/0x90
@ 2013-03-21 22:19 Ján Stanček
2013-03-22 15:24 ` Paul Moore
0 siblings, 1 reply; 6+ messages in thread
From: Ján Stanček @ 2013-03-21 22:19 UTC (permalink / raw)
To: netdev
[-- Attachment #1: Type: text/plain, Size: 3388 bytes --]
Hi,
I'm occasionally seeing a panic shortly after the system has booted, while
systemd is starting other services.
I made a reproducer which is quite reliable on my system (32-CPU Intel)
and can usually trigger the issue within a minute or two. I can reproduce
it with 3.9.0-rc3 as root or as an unprivileged user (see the call trace below).
I'm attaching my reproducer and an (experimental) patch, which fixes the
issue for me.
Regards,
Jan
[ 307.419660] BUG: unable to handle kernel NULL pointer dereference at 0000000000000250
[ 307.428453] IP: [<ffffffff812a2d04>] selinux_socket_unix_may_send+0x34/0x90
[ 307.436258] PGD 422cd8067 PUD 4081b1067 PMD 0
[ 307.441266] Oops: 0000 [#1] SMP
[ 307.558800] CPU 25
[ 307.560953] Pid: 7412, comm: a.out Tainted: GF 3.9.0-rc3 #1 Intel Corporation W2600CR/W2600CR
[ 307.571736] RIP: 0010:[<ffffffff812a2d04>] [<ffffffff812a2d04>] selinux_socket_unix_may_send+0x34/0x90
[ 307.582240] RSP: 0018:ffff880423c67ab8 EFLAGS: 00010246
[ 307.588171] RAX: ffff8808243a0680 RBX: ffff880423c67be8 RCX: 0000000000000007
[ 307.596139] RDX: 0000000000000000 RSI: ffff88042ef1c380 RDI: ffff880423c67ad8
[ 307.604100] RBP: ffff880423c67b18 R08: ffff88042511e180 R09: 0000000000000000
[ 307.612067] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8808243a0680
[ 307.620034] R13: 7fffffffffffffff R14: ffff88042511e180 R15: ffff88042511e470
[ 307.628001] FS: 00007f19d1bb2740(0000) GS:ffff88082ef20000(0000) knlGS:0000000000000000
[ 307.637028] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 307.643437] CR2: 0000000000000250 CR3: 0000000404ff4000 CR4: 00000000000407e0
[ 307.651397] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 307.659358] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 307.667331] Process a.out (pid: 7412, threadinfo ffff880423c66000, task ffff880423600000)
[ 307.676453] Stack:
[ 307.678694] ffff88042511e102 ffff88042749f600 ffff8808243a0680 ffff88042511e180
[ 307.686990] ffff88042511e180 000000000000000a ffff880423c67af8 ffffffff8129ef36
[ 307.695284] ffff880423c67b28 ffffffff81529747 ffff880423c67be8 00000000fdbc1448
[ 307.703584] Call Trace:
[ 307.706332] [<ffffffff8129ef36>] ? security_sock_rcv_skb+0x16/0x20
[ 307.713339] [<ffffffff81529747>] ? sk_filter+0x37/0xd0
[ 307.719168] [<ffffffff8129ef16>] security_unix_may_send+0x16/0x20
[ 307.726075] [<ffffffff815b694d>] unix_dgram_sendmsg+0x48d/0x640
[ 307.732802] [<ffffffff814fd9c0>] sock_sendmsg+0xb0/0xe0
[ 307.738732] [<ffffffff8103edde>] ? physflat_send_IPI_mask+0xe/0x10
[ 307.745726] [<ffffffff814ff55c>] __sys_sendmsg+0x3ac/0x3c0
[ 307.751961] [<ffffffff811a3357>] ? do_sync_write+0xa7/0xe0
[ 307.758186] [<ffffffff811e31fb>] ? fsnotify+0x24b/0x340
[ 307.764120] [<ffffffff815013c9>] sys_sendmsg+0x49/0x90
[ 307.769966] [<ffffffff81630b99>] system_call_fastpath+0x16/0x1b
[ 307.776665] Code: 00 00 45 31 c9 48 89 e5 48 83 ec 60 48 8b 56 20 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31 c0 48 8b 47 20 48 8d 7d c0 c6 45 a0 02 <48> 8b b2 50 02 00 00 4c 8b 80 50 02 00 00 31 c0 f3 48 ab 48 89
[ 307.798450] RIP [<ffffffff812a2d04>] selinux_socket_unix_may_send+0x34/0x90
[ 307.806334] RSP <ffff880423c67ab8>
[ 307.810223] CR2: 0000000000000250
[ 307.813957] ---[ end trace 0829e3985976c28a ]---
[-- Attachment #2: 0001-af_unix-fix-race-in-unix_release-unix_dgram_sendmsg.patch --]
[-- Type: application/octet-stream, Size: 4419 bytes --]
From 7bed585f37d10f62fe7389fdd704d3684d61cdf9 Mon Sep 17 00:00:00 2001
Message-Id: <7bed585f37d10f62fe7389fdd704d3684d61cdf9.1363863447.git.jan@stancek.eu>
From: Jan Stancek <jan@stancek.eu>
Date: Thu, 21 Mar 2013 11:20:18 +0100
Subject: [PATCH] af_unix: fix race in unix_release/unix_dgram_sendmsg
unix_release() will set sock->sk to NULL for server side,
while client may still be at unix_dgram_sendmsg(). When client
now calls security_unix_may_send(), other->sk->sk_security will
cause NULL pointer dereference.
Check that 'other' is not in process of being released.
[ 307.419660] BUG: unable to handle kernel NULL pointer dereference at 0000000000000250
[ 307.428453] IP: [<ffffffff812a2d04>] selinux_socket_unix_may_send+0x34/0x90
[ 307.436258] PGD 422cd8067 PUD 4081b1067 PMD 0
[ 307.441266] Oops: 0000 [#1] SMP
[ 307.558800] CPU 25
[ 307.560953] Pid: 7412, comm: a.out Tainted: GF 3.9.0-rc3 #1 Intel Corporation W2600CR/W2600CR
[ 307.571736] RIP: 0010:[<ffffffff812a2d04>] [<ffffffff812a2d04>] selinux_socket_unix_may_send+0x34/0x90
[ 307.582240] RSP: 0018:ffff880423c67ab8 EFLAGS: 00010246
[ 307.588171] RAX: ffff8808243a0680 RBX: ffff880423c67be8 RCX: 0000000000000007
[ 307.596139] RDX: 0000000000000000 RSI: ffff88042ef1c380 RDI: ffff880423c67ad8
[ 307.604100] RBP: ffff880423c67b18 R08: ffff88042511e180 R09: 0000000000000000
[ 307.612067] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8808243a0680
[ 307.620034] R13: 7fffffffffffffff R14: ffff88042511e180 R15: ffff88042511e470
[ 307.628001] FS: 00007f19d1bb2740(0000) GS:ffff88082ef20000(0000) knlGS:0000000000000000
[ 307.637028] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 307.643437] CR2: 0000000000000250 CR3: 0000000404ff4000 CR4: 00000000000407e0
[ 307.651397] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 307.659358] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 307.667331] Process a.out (pid: 7412, threadinfo ffff880423c66000, task ffff880423600000)
[ 307.676453] Stack:
[ 307.678694] ffff88042511e102 ffff88042749f600 ffff8808243a0680 ffff88042511e180
[ 307.686990] ffff88042511e180 000000000000000a ffff880423c67af8 ffffffff8129ef36
[ 307.695284] ffff880423c67b28 ffffffff81529747 ffff880423c67be8 00000000fdbc1448
[ 307.703584] Call Trace:
[ 307.706332] [<ffffffff8129ef36>] ? security_sock_rcv_skb+0x16/0x20
[ 307.713339] [<ffffffff81529747>] ? sk_filter+0x37/0xd0
[ 307.719168] [<ffffffff8129ef16>] security_unix_may_send+0x16/0x20
[ 307.726075] [<ffffffff815b694d>] unix_dgram_sendmsg+0x48d/0x640
[ 307.732802] [<ffffffff814fd9c0>] sock_sendmsg+0xb0/0xe0
[ 307.738732] [<ffffffff8103edde>] ? physflat_send_IPI_mask+0xe/0x10
[ 307.745726] [<ffffffff814ff55c>] __sys_sendmsg+0x3ac/0x3c0
[ 307.751961] [<ffffffff811a3357>] ? do_sync_write+0xa7/0xe0
[ 307.758186] [<ffffffff811e31fb>] ? fsnotify+0x24b/0x340
[ 307.764120] [<ffffffff815013c9>] sys_sendmsg+0x49/0x90
[ 307.769966] [<ffffffff81630b99>] system_call_fastpath+0x16/0x1b
[ 307.776665] Code: 00 00 45 31 c9 48 89 e5 48 83 ec 60 48 8b 56 20 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31 c0 48 8b 47 20 48 8d 7d c0 c6 45 a0 02 <48> 8b b2 50 02 00 00 4c 8b 80 50 02 00 00 31 c0 f3 48 ab 48 89
[ 307.798450] RIP [<ffffffff812a2d04>] selinux_socket_unix_may_send+0x34/0x90
[ 307.806334] RSP <ffff880423c67ab8>
[ 307.810223] CR2: 0000000000000250
[ 307.813957] ---[ end trace 0829e3985976c28a ]---
Signed-off-by: Jan Stancek <jan@stancek.eu>
---
net/unix/af_unix.c | 10 ++++++++++
1 files changed, 10 insertions(+), 0 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 51be64f..433952c 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -993,6 +993,11 @@ restart:
if (!unix_may_send(sk, other))
goto out_unlock;
+ err = -ECONNREFUSED;
+ /* other is being released */
+ if (!other->sk_socket->sk)
+ goto out_unlock;
+
err = security_unix_may_send(sk->sk_socket, other->sk_socket);
if (err)
goto out_unlock;
@@ -1554,6 +1559,11 @@ restart:
goto out_unlock;
if (sk->sk_type != SOCK_SEQPACKET) {
+ err = -ECONNREFUSED;
+ /* other is being released */
+ if (!other->sk_socket->sk)
+ goto out_unlock;
+
err = security_unix_may_send(sk->sk_socket, other->sk_socket);
if (err)
goto out_unlock;
--
1.7.1
[-- Attachment #3: selinux_socket_unix_may_send.c --]
[-- Type: text/x-csrc, Size: 3730 bytes --]
/*
 * reproducer for Bug 887683
 * unable to handle kernel NULL in selinux_socket_unix_may_send
 *
 * This reproducer is based on systemd/journald sources.
 * It may be possible to simplify it further, but since it
 * triggers the bug I kept it as it is.
 *
 * jstancek@redhat.com
 *
 * gcc selinux_socket_unix_may_send.c; ./a.out
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/un.h>
#include <sys/wait.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <signal.h>
#include <net/if.h>
#include <netinet/in.h>
#include <linux/netlink.h>
#include <unistd.h>

#ifndef offsetof
#define offsetof(type, member) __builtin_offsetof (type, member)
#endif
#define SNDBUF_SIZE (8*1024*1024)

int pipefd[2];

union sockaddr_union {
	struct sockaddr sa;
	struct sockaddr_in in4;
	struct sockaddr_in6 in6;
	struct sockaddr_un un;
	struct sockaddr_nl nl;
	struct sockaddr_storage storage;
};

int fd_inc_sndbuf(int fd, size_t n) {
	int r, value;
	socklen_t l = sizeof(value);

	r = getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &value, &l);
	if (r >= 0 &&
	    l == sizeof(value) &&
	    (size_t) value >= n*2)
		return 0;

	value = (int) n;
	r = setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &value, sizeof(value));
	if (r < 0)
		return -errno;

	return 1;
}

int child()
{
	int fd = socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0);
	char data[1024] = "123456789";
	struct iovec *w;
	int j = 0;
	int n = 1;
	int ret;
	int dummy = 0;
	struct sockaddr_un sa;
	struct msghdr mh;
	union {
		struct cmsghdr cmsghdr;
		uint8_t buf[CMSG_SPACE(sizeof(int))];
	} control;

	if (fd < 0)
		return -errno;

	ret = fd_inc_sndbuf(fd, SNDBUF_SIZE);
	if (ret < 0)
		perror("fd_inc_sndbuf");

	memset(&sa, 0, sizeof(sa));
	sa.sun_family = AF_UNIX;
	strncpy(sa.sun_path, "/run/systemd/journal/socket_test", sizeof(sa.sun_path));

	w = malloc(sizeof(struct iovec) * n * 5 + 3);
	w[j].iov_base = data;
	w[j].iov_len = 10;
	j++;

	memset(&control, 0, sizeof(control));
	mh.msg_control = &control;
	mh.msg_controllen = sizeof(control);
	/* note: this memset also clears the msg_control fields set just above */
	memset(&mh, 0, sizeof(mh));
	mh.msg_name = &sa;
	mh.msg_namelen = offsetof(struct sockaddr_un, sun_path) + strlen(sa.sun_path);
	mh.msg_iov = w;
	mh.msg_iovlen = 1;

	/* wake the parent so that its close() runs alongside our sendmsg() */
	write(pipefd[1], &dummy, 1);
	ret = sendmsg(fd, &mh, MSG_NOSIGNAL);
	/*if (ret < 0)
		perror("ret:");
	else
		printf("ret: %d\n", ret);*/
	close(fd);
	return 0;
}

int parent()
{
	int fd, ret, status;
	union sockaddr_union sa;
	int dummy;

	fd = socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0);
	if (fd < 0)
		perror("socket");

	sa.un.sun_family = AF_UNIX;
	strncpy(sa.un.sun_path, "/run/systemd/journal/socket_test", sizeof(sa.un.sun_path));
	unlink(sa.un.sun_path);

	int one = 1;
	ret = setsockopt(fd, SOL_SOCKET, SO_PASSCRED, &one, sizeof(one));
	if (ret < 0)
		perror("setsockopt 1");
	ret = setsockopt(fd, SOL_SOCKET, SO_PASSSEC, &one, sizeof(one));
	if (ret < 0)
		perror("setsockopt 2");
	ret = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMP, &one, sizeof(one));
	if (ret < 0)
		perror("setsockopt 3");

	ret = bind(fd, &sa.sa, offsetof(union sockaddr_union, un.sun_path) + strlen(sa.un.sun_path));
	if (ret < 0)
		perror("bind");

	/* wait until the child is about to send, then release the socket */
	read(pipefd[0], &dummy, 1);
	close(fd);
	wait(&status);
	return 0;
}

int main()
{
	int busypids[8192];
	int childpid;
	int i = 0;
	int cpus = sysconf(_SC_NPROCESSORS_ONLN)*2;

	if (pipe(pipefd) < 0) {
		perror("pipe");
		return 1;
	}

	/* keep every CPU busy to widen the race window */
	for (i = 0; i < cpus; i++) {
		childpid = fork();
		if (childpid == 0) {
			while(1);
		}
		busypids[i] = childpid;
	}

	i = 0;
	while (i < 100000) {
		childpid = fork();
		if (childpid == 0) {
			child();
			exit(0);
		}
		parent();
		i++;
		if (i % 10 == 0) {
			printf("loop: %d\n", i);
		}
	}

	for (i = 0; i < cpus; i++) {
		kill(busypids[i], SIGKILL);
	}
	printf("Done.\n");
	return 0;
}
* Re: NULL pointer deref, selinux_socket_unix_may_send+0x34/0x90
2013-03-21 22:19 NULL pointer deref, selinux_socket_unix_may_send+0x34/0x90 Ján Stanček
@ 2013-03-22 15:24 ` Paul Moore
2013-03-22 15:48 ` Ján Stanček
0 siblings, 1 reply; 6+ messages in thread
From: Paul Moore @ 2013-03-22 15:24 UTC (permalink / raw)
To: Ján Stanček; +Cc: netdev, eparis, sds
[-- Attachment #1: Type: text/plain, Size: 2995 bytes --]
On Thursday, March 21, 2013 11:19:22 PM Ján Stanček wrote:
> Hi,
>
> I'm occasionally seeing a panic shortly after the system has booted, while
> systemd is starting other services.
>
> I made a reproducer which is quite reliable on my system (32-CPU Intel)
> and can usually trigger the issue within a minute or two. I can reproduce
> it with 3.9.0-rc3 as root or as an unprivileged user (see the call trace
> below).
>
> I'm attaching my reproducer and an (experimental) patch, which fixes the
> issue for me.
Hi Jan,
I've heard some similar reports over the past few years but I've never been
able to reproduce the problem and the reporters have never shown enough
interest to help me diagnose it. Your information about the size of the
machine and the reproducer may help, thank you!
I'll try your reproducer, but since I don't happen to have a machine handy
that is the same size as yours, would you mind trying the attached patch (also
pasted inline for others to comment on)? I can't promise it will solve your
problem, but it was the best idea I could come up with a few years ago when I
first became aware of the problem. I think you are right that there is a
race condition somewhere in the AF_UNIX socket shutdown path; I'm just not
yet certain where it is ...
Thanks again.
net: fix some potential race issues in the AF_UNIX code
From: Paul Moore <pmoore@redhat.com>
At least one user had reported some odd behavior with UNIX sockets which
could be attributed to some _possible_ race conditions in
unix_release_sock().
Reported-by: Konstantin Boyandin <konstantin@boyandin.com>
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
---
net/unix/af_unix.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 51be64f..886b8da 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -408,8 +408,10 @@ static int unix_release_sock(struct sock *sk, int embrion)
skpair = unix_peer(sk);
if (skpair != NULL) {
- if (sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET) {
- unix_state_lock(skpair);
+ unix_state_lock(skpair);
+ if (unix_our_peer(sk, skpair) &&
+ (sk->sk_type == SOCK_STREAM ||
+ sk->sk_type == SOCK_SEQPACKET)) {
/* No more writes */
skpair->sk_shutdown = SHUTDOWN_MASK;
if (!skb_queue_empty(&sk->sk_receive_queue) || embrion)
@@ -417,9 +419,10 @@ static int unix_release_sock(struct sock *sk, int embrion)
unix_state_unlock(skpair);
skpair->sk_state_change(skpair);
sk_wake_async(skpair, SOCK_WAKE_WAITD, POLL_HUP);
- }
- sock_put(skpair); /* It may now die */
+ } else
+ unix_state_unlock(skpair);
unix_peer(sk) = NULL;
+ sock_put(skpair); /* It may now die */
}
/* Try to flush out this socket. Throw out buffers at least */
--
paul moore
www.paul-moore.com
[-- Attachment #2: unix-race_fix.patch --]
[-- Type: text/x-patch, Size: 1555 bytes --]
net: fix some potential race issues in the AF_UNIX code
From: Paul Moore <pmoore@redhat.com>
At least one user had reported some odd behavior with UNIX sockets which
could be attributed to some _possible_ race conditions in
unix_release_sock().
Reported-by: Konstantin Boyandin <konstantin@boyandin.com>
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
---
net/unix/af_unix.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 51be64f..886b8da 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -408,8 +408,10 @@ static int unix_release_sock(struct sock *sk, int embrion)
skpair = unix_peer(sk);
if (skpair != NULL) {
- if (sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET) {
- unix_state_lock(skpair);
+ unix_state_lock(skpair);
+ if (unix_our_peer(sk, skpair) &&
+ (sk->sk_type == SOCK_STREAM ||
+ sk->sk_type == SOCK_SEQPACKET)) {
/* No more writes */
skpair->sk_shutdown = SHUTDOWN_MASK;
if (!skb_queue_empty(&sk->sk_receive_queue) || embrion)
@@ -417,9 +419,10 @@ static int unix_release_sock(struct sock *sk, int embrion)
unix_state_unlock(skpair);
skpair->sk_state_change(skpair);
sk_wake_async(skpair, SOCK_WAKE_WAITD, POLL_HUP);
- }
- sock_put(skpair); /* It may now die */
+ } else
+ unix_state_unlock(skpair);
unix_peer(sk) = NULL;
+ sock_put(skpair); /* It may now die */
}
/* Try to flush out this socket. Throw out buffers at least */
* Re: NULL pointer deref, selinux_socket_unix_may_send+0x34/0x90
2013-03-22 15:24 ` Paul Moore
@ 2013-03-22 15:48 ` Ján Stanček
2013-03-22 16:24 ` Paul Moore
0 siblings, 1 reply; 6+ messages in thread
From: Ján Stanček @ 2013-03-22 15:48 UTC (permalink / raw)
To: Paul Moore; +Cc: netdev, eparis, sds
[-- Attachment #1: Type: text/plain, Size: 1809 bytes --]
On Fri, Mar 22, 2013 at 4:24 PM, Paul Moore <paul@paul-moore.com> wrote:
> On Thursday, March 21, 2013 11:19:22 PM Ján Stanček wrote:
>> Hi,
>>
>> I'm occasionally seeing a panic shortly after the system has booted, while
>> systemd is starting other services.
>>
>> I made a reproducer which is quite reliable on my system (32-CPU Intel)
>> and can usually trigger the issue within a minute or two. I can reproduce
>> it with 3.9.0-rc3 as root or as an unprivileged user (see the call trace
>> below).
>>
>> I'm attaching my reproducer and an (experimental) patch, which fixes the
>> issue for me.
>
> Hi Jan,
>
> I've heard some similar reports over the past few years but I've never been
> able to reproduce the problem and the reporters have never shown enough
> interest to help me diagnose it. Your information about the size of the
> machine and the reproducer may help, thank you!
>
> I'll try your reproducer, but since I don't happen to have a machine handy
> that is the same size as yours, would you mind trying the attached patch (also
> pasted inline for others to comment on)? I can't promise it will solve your
> problem, but it was the best idea I could come up with a few years ago when I
> first became aware of the problem. I think you are right that there is a
> race condition somewhere in the AF_UNIX socket shutdown path; I'm just not
> yet certain where it is ...
Hi Paul,
thanks for the reply, I'll try your patch and let you know.
I'm not certain about the cause either, but the patch I sent in my last email
makes it go away, so maybe that can help in some way.
I made a v2 of the reproducer (attached), which triggers the issue a lot faster
on the 2 systems I tried (32-CPU and 4-CPU): within a couple of seconds.
Regards,
Jan
[-- Attachment #2: selinux_socket_unix_may_send_v2.c --]
[-- Type: text/x-csrc, Size: 3762 bytes --]
/*
 * reproducer for Bug 887683 v2
 * unable to handle kernel NULL in selinux_socket_unix_may_send
 *
 * This reproducer is based on systemd/journald sources.
 * It may be possible to simplify it further, but since it
 * triggers the bug I kept it as it is.
 *
 * jstancek@redhat.com
 *
 * gcc selinux_socket_unix_may_send_v2.c; ./a.out
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/un.h>
#include <sys/wait.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <signal.h>
#include <net/if.h>
#include <netinet/in.h>
#include <linux/netlink.h>
#include <unistd.h>

#ifndef offsetof
#define offsetof(type, member) __builtin_offsetof (type, member)
#endif
#define SNDBUF_SIZE (8*1024*1024)

int pipefd[2];

union sockaddr_union {
	struct sockaddr sa;
	struct sockaddr_in in4;
	struct sockaddr_in6 in6;
	struct sockaddr_un un;
	struct sockaddr_nl nl;
	struct sockaddr_storage storage;
};

int fd_inc_sndbuf(int fd, size_t n) {
	int r, value;
	socklen_t l = sizeof(value);

	r = getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &value, &l);
	if (r >= 0 &&
	    l == sizeof(value) &&
	    (size_t) value >= n*2)
		return 0;

	value = (int) n;
	r = setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &value, sizeof(value));
	if (r < 0)
		return -errno;

	return 1;
}

int client(int id, int loops)
{
	int fd;
	char data[1024] = "123456789";
	struct iovec *w;
	int i;
	int j = 0;
	int n = 1;
	int ret;
	int dummy = 0;
	struct sockaddr_un sa;
	struct msghdr mh;
	union {
		struct cmsghdr cmsghdr;
		uint8_t buf[CMSG_SPACE(sizeof(int))];
	} control;

	memset(&sa, 0, sizeof(sa));
	sa.sun_family = AF_UNIX;
	snprintf(sa.sun_path, sizeof(sa.sun_path), "/tmp/socket_test%d", id);

	w = malloc(sizeof(struct iovec) * n * 5 + 3);
	w[j].iov_base = data;
	w[j].iov_len = 10;
	j++;

	memset(&control, 0, sizeof(control));
	mh.msg_control = &control;
	mh.msg_controllen = sizeof(control);
	/* note: this memset also clears the msg_control fields set just above */
	memset(&mh, 0, sizeof(mh));
	mh.msg_name = &sa;
	mh.msg_namelen = offsetof(struct sockaddr_un, sun_path) + strlen(sa.sun_path);
	mh.msg_iov = w;
	mh.msg_iovlen = 1;

	for (i = 0; i < loops; i++) {
		fd = socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0);
		if (fd < 0)
			return -errno;

		ret = fd_inc_sndbuf(fd, SNDBUF_SIZE);
		if (ret < 0)
			perror("fd_inc_sndbuf");

		/* wake the server so that its close() runs alongside our sendmsg() */
		write(pipefd[1], &dummy, 1);
		ret = sendmsg(fd, &mh, MSG_NOSIGNAL);
		/*if (ret < 0)
			perror("ret:");
		else
			printf("ret: %d\n", ret);*/
		close(fd);
	}
	return 0;
}

int server(int id, int loops)
{
	int fd, ret, status;
	union sockaddr_union sa;
	int dummy;
	int i;

	sa.un.sun_family = AF_UNIX;
	snprintf(sa.un.sun_path, sizeof(sa.un.sun_path), "/tmp/socket_test%d", id);

	for (i = 0; i < loops; i++) {
		unlink(sa.un.sun_path);
		fd = socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0);
		if (fd < 0)
			perror("socket");

		int one = 1;
		ret = setsockopt(fd, SOL_SOCKET, SO_PASSCRED, &one, sizeof(one));
		if (ret < 0)
			perror("setsockopt 1");
		ret = setsockopt(fd, SOL_SOCKET, SO_PASSSEC, &one, sizeof(one));
		if (ret < 0)
			perror("setsockopt 2");
		ret = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMP, &one, sizeof(one));
		if (ret < 0)
			perror("setsockopt 3");

		ret = bind(fd, &sa.sa, offsetof(union sockaddr_union, un.sun_path) + strlen(sa.un.sun_path));
		if (ret < 0)
			perror("bind");

		/* wait until the client is about to send, then release the socket */
		read(pipefd[0], &dummy, 1);
		close(fd);
	}
	return 0;
}

void child(int loops)
{
	int i;
	int mypid = getpid();

	if (pipe(pipefd) < 0) {
		perror("pipe");
		return;
	}

	if (fork() == 0) {
		client(mypid, loops);
		exit(0);
	} else {
		server(mypid, loops);
	}
}

int main()
{
	int i = 0;
	int cpus = sysconf(_SC_NPROCESSORS_ONLN)*4;
	int status;

	for (i = 0; i < cpus; i++) {
		if (fork() == 0) {
			child(100000);
			exit(0);
		}
	}

	for (i = 0; i < cpus; i++) {
		wait(&status);
	}

	printf("Done.\n");
	return 0;
}
* Re: NULL pointer deref, selinux_socket_unix_may_send+0x34/0x90
2013-03-22 15:48 ` Ján Stanček
@ 2013-03-22 16:24 ` Paul Moore
2013-03-22 16:52 ` Ján Stanček
0 siblings, 1 reply; 6+ messages in thread
From: Paul Moore @ 2013-03-22 16:24 UTC (permalink / raw)
To: Ján Stanček; +Cc: netdev, eparis, sds
On Friday, March 22, 2013 04:48:32 PM Ján Stanček wrote:
> Hi Paul,
>
> thanks for the reply, I'll try your patch and let you know.
Great, thanks.
> I'm not certain about the cause either, but the patch I sent in my last email
> makes it go away, so maybe that can help in some way.
At the very least you've found a way to reproduce the problem and your patch
furthers my belief that we've got a race condition somewhere - all very
helpful! It may also turn out that your patch is the "right" solution, I'd
just like to better understand why we are seeing the race in the first place.
> I made a v2 of the reproducer (attached), which triggers the issue a lot
> faster on the 2 systems I tried (32-CPU and 4-CPU): within a couple of
> seconds.
Excellent, while I don't have a 32 cpu system handy, I do have a 4 cpu system
that I can play with. Thanks again.
-Paul
--
paul moore
www.paul-moore.com
* Re: NULL pointer deref, selinux_socket_unix_may_send+0x34/0x90
2013-03-22 16:24 ` Paul Moore
@ 2013-03-22 16:52 ` Ján Stanček
2013-03-22 18:24 ` Paul Moore
0 siblings, 1 reply; 6+ messages in thread
From: Ján Stanček @ 2013-03-22 16:52 UTC (permalink / raw)
To: Paul Moore; +Cc: netdev, eparis, sds
Paul,
I applied your patch on top of 3.9-rc3 and ran v2 of the reproducer. It
hit the issue almost instantly:
[ 249.316283] BUG: unable to handle kernel NULL pointer dereference at 0000000000000250
[ 249.325044] IP: [<ffffffff812a2d04>] selinux_socket_unix_may_send+0x34/0x90
[ 249.332829] PGD 80a8e5067 PUD 803048067 PMD 0
[ 249.337821] Oops: 0000 [#1] SMP
[ 249.453113] CPU 22
[ 249.455262] Pid: 6928, comm: a.out Tainted: GF 3.9.0-rc3+ #1 Intel Corporation W2600CR/W2600CR
[ 249.466132] RIP: 0010:[<ffffffff812a2d04>] [<ffffffff812a2d04>] selinux_socket_unix_may_send+0x34/0x90
[ 249.476632] RSP: 0018:ffff880826569ab8 EFLAGS: 00010246
[ 249.482551] RAX: ffff880417ee4100 RBX: ffff880826569be8 RCX: 0000000000000007
[ 249.490511] RDX: 0000000000000000 RSI: ffff880828f77d00 RDI: ffff880826569ad8
[ 249.498472] RBP: ffff880826569b18 R08: ffff880424ada080 R09: 0000000000000000
[ 249.506434] R10: ffff880826569a38 R11: 000000000000000f R12: ffff880417ee4100
[ 249.514395] R13: 7fffffffffffffff R14: ffff880424ada080 R15: ffff880424ada370
[ 249.522355] FS: 00007f6d44abf740(0000) GS:ffff88042f7c0000(0000) knlGS:0000000000000000
[ 249.531383] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 249.537792] CR2: 0000000000000250 CR3: 0000000800ffa000 CR4: 00000000000407e0
[ 249.545755] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 249.553716] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 249.561678] Process a.out (pid: 6928, threadinfo ffff880826568000, task ffff880825d999a0)
[ 249.570802] Stack:
[ 249.573046] ffff880424ada002 ffff880427c5bc00 ffff880417ee4100 ffff880424ada080
[ 249.581339] ffff880424ada080 000000000000000a ffff880826569af8 ffffffff8129ef36
[ 249.589637] ffff880826569b28 ffffffff81529747 ffff880826569be8 0000000011f46a34
[ 249.597933] Call Trace:
[ 249.600666] [<ffffffff8129ef36>] ? security_sock_rcv_skb+0x16/0x20
[ 249.607661] [<ffffffff81529747>] ? sk_filter+0x37/0xd0
[ 249.613491] [<ffffffff8129ef16>] security_unix_may_send+0x16/0x20
[ 249.620390] [<ffffffff815b697d>] unix_dgram_sendmsg+0x48d/0x640
[ 249.627094] [<ffffffff814fd9c0>] sock_sendmsg+0xb0/0xe0
[ 249.633024] [<ffffffff812adee7>] ? ebitmap_cpy+0x47/0xd0
[ 249.639048] [<ffffffff814ff55c>] __sys_sendmsg+0x3ac/0x3c0
[ 249.645267] [<ffffffff811a3357>] ? do_sync_write+0xa7/0xe0
[ 249.651487] [<ffffffff811e31fb>] ? fsnotify+0x24b/0x340
[ 249.657416] [<ffffffff815013c9>] sys_sendmsg+0x49/0x90
[ 249.663249] [<ffffffff81630bd9>] system_call_fastpath+0x16/0x1b
[ 249.669949] Code: 00 00 45 31 c9 48 89 e5 48 83 ec 60 48 8b 56 20 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31 c0 48 8b 47 20 48 8d 7d c0 c6 45 a0 02 <48> 8b b2 50 02 00 00 4c 8b 80 50 02 00 00 31 c0 f3 48 ab 48 89
Regards,
Jan
On Fri, Mar 22, 2013 at 5:24 PM, Paul Moore <paul@paul-moore.com> wrote:
> On Friday, March 22, 2013 04:48:32 PM Ján Stanček wrote:
>> Hi Paul,
>>
>> thanks for the reply, I'll try your patch and let you know.
>
> Great, thanks.
>
>> I'm not certain about the cause either, but the patch I sent in my last email
>> makes it go away, so maybe that can help in some way.
>
> At the very least you've found a way to reproduce the problem and your patch
> furthers my belief that we've got a race condition somewhere - all very
> helpful! It may also turn out that your patch is the "right" solution, I'd
> just like to better understand why we are seeing the race in the first place.
>
>> I made a v2 of the reproducer (attached), which triggers the issue a lot
>> faster on the 2 systems I tried (32-CPU and 4-CPU): within a couple of
>> seconds.
>
> Excellent, while I don't have a 32 cpu system handy, I do have a 4 cpu system
> that I can play with. Thanks again.
>
> -Paul
>
> --
> paul moore
> www.paul-moore.com
>