Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH] ip_tunnel: Fix name string concatenate in __ip_tunnel_create()
From: David Miller @ 2018-06-07 20:32 UTC (permalink / raw)
  To: sultanxda; +Cc: netdev, kuznet, yoshfuji
In-Reply-To: <20180606225654.7077-1-sultanxda@gmail.com>

From: Sultan Alsawaf <sultanxda@gmail.com>
Date: Wed,  6 Jun 2018 15:56:54 -0700

> By passing a limit of 2 bytes to strncat, strncat is limited to writing
> fewer bytes than what it's supposed to append to the name here.
> 
> Since the bounds are checked on the line above this, just remove the string
> bounds checks entirely since they're unneeded.
> 
> Signed-off-by: Sultan Alsawaf <sultanxda@gmail.com>

Applied.

^ permalink raw reply

* [Patch net] socket: close race condition between sock_close() and sockfs_setattr()
From: Cong Wang @ 2018-06-07 20:39 UTC (permalink / raw)
  To: netdev; +Cc: shankarapailoor, Cong Wang, Tetsuo Handa, Lorenzo Colitti,
	Al Viro

fchownat() doesn't even hold refcnt of fd until it figures out
fd is really needed (otherwise is ignored) and releases it after
it resolves the path. This means sock_close() could race with
sockfs_setattr(), which leads to a NULL pointer dereference
since typically we set sock->sk to NULL in ->release().

As pointed out by Al, this is unique to sockfs. So we can fix this
in socket layer by acquiring inode_lock in sock_close() and
checking against NULL in sockfs_setattr().

sock_release() is called in many places, only the sock_close()
path matters here. And fortunately, this should not affect normal
sock_close() as it is only called when the last fd refcnt is gone.
It only affects sock_close() with a parallel sockfs_setattr() in
progress, which is not common.

Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Reported-by: shankarapailoor <shankarapailoor@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/socket.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index af57d85bcb48..8a109012608a 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -541,7 +541,10 @@ static int sockfs_setattr(struct dentry *dentry, struct iattr *iattr)
 	if (!err && (iattr->ia_valid & ATTR_UID)) {
 		struct socket *sock = SOCKET_I(d_inode(dentry));
 
-		sock->sk->sk_uid = iattr->ia_uid;
+		if (sock->sk)
+			sock->sk->sk_uid = iattr->ia_uid;
+		else
+			err = -ENOENT;
 	}
 
 	return err;
@@ -590,12 +593,16 @@ EXPORT_SYMBOL(sock_alloc);
  *	an inode not a file.
  */
 
-void sock_release(struct socket *sock)
+static void __sock_release(struct socket *sock, struct inode *inode)
 {
 	if (sock->ops) {
 		struct module *owner = sock->ops->owner;
 
+		if (inode)
+			inode_lock(inode);
 		sock->ops->release(sock);
+		if (inode)
+			inode_unlock(inode);
 		sock->ops = NULL;
 		module_put(owner);
 	}
@@ -609,6 +616,11 @@ void sock_release(struct socket *sock)
 	}
 	sock->file = NULL;
 }
+
+void sock_release(struct socket *sock)
+{
+	__sock_release(sock, NULL);
+}
 EXPORT_SYMBOL(sock_release);
 
 void __sock_tx_timestamp(__u16 tsflags, __u8 *tx_flags)
@@ -1171,7 +1183,7 @@ static int sock_mmap(struct file *file, struct vm_area_struct *vma)
 
 static int sock_close(struct inode *inode, struct file *filp)
 {
-	sock_release(SOCKET_I(inode));
+	__sock_release(SOCKET_I(inode), inode);
 	return 0;
 }
 
-- 
2.13.0

^ permalink raw reply related

* Re: [PATCH net] bonding: re-evaluate force_primary when the primary slave name changes
From: David Miller @ 2018-06-07 20:54 UTC (permalink / raw)
  To: yuxiangning; +Cc: netdev
In-Reply-To: <A977208F-4266-411B-8DA5-0C6901F201E7@gmail.com>

From: Xiangning Yu <yuxiangning@gmail.com>
Date: Thu, 7 Jun 2018 13:39:59 +0800

> From: Xiangning Yu <yuxiangning@gmail.com>
> 
> There is a timing issue under active-standy mode, when bond_enslave() is
> called, bond->params.primary might not be initialized yet.
> 
> Any time the primary slave string changes, bond->force_primary should be
> set to true to make sure the primary becomes the active slave.
> 
> Signed-off-by: Xiangning Yu <yuxiangning@gmail.com>

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH] net: mscc: ocelot: Fix uninitialized error in ocelot_netdevice_event()
From: David Miller @ 2018-06-07 20:55 UTC (permalink / raw)
  To: geert; +Cc: alexandre.belloni, andrew, arnd, netdev, linux-kernel
In-Reply-To: <1528377030-28319-1-git-send-email-geert@linux-m68k.org>

From: Geert Uytterhoeven <geert@linux-m68k.org>
Date: Thu,  7 Jun 2018 15:10:30 +0200

> With gcc-4.1.2:
> 
>     drivers/net/ethernet/mscc/ocelot.c: In function ‘ocelot_netdevice_event’:
>     drivers/net/ethernet/mscc/ocelot.c:1129: warning: ‘ret’ may be used uninitialized in this function
> 
> If the list iterated over by netdev_for_each_lower_dev() is empty, ret
> is never initialized, and converted into a notifier return value.
> 
> Fix this by preinitializing ret to zero.
> 
> Fixes: a556c76adc052c97 ("net: mscc: Add initial Ocelot switch support")
> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
> ---
> This may be unlikely to happen, but given the notifier is called for any
> (also non-ocelot) network device, better be safe than sorry.

Applied.

^ permalink raw reply

* Re: [PATCH net-next] umh: fix race condition
From: David Miller @ 2018-06-07 20:57 UTC (permalink / raw)
  To: ast; +Cc: daniel, mcgrof, netdev, dvyukov, linux-kernel, kernel-team
In-Reply-To: <20180607172310.3121039-1-ast@kernel.org>

From: Alexei Starovoitov <ast@kernel.org>
Date: Thu, 7 Jun 2018 10:23:10 -0700

> kasan reported use-after-free:
> BUG: KASAN: use-after-free in call_usermodehelper_exec_work+0x2d3/0x310 kernel/umh.c:195
> Write of size 4 at addr ffff8801d9202370 by task kworker/u4:2/50
> Workqueue: events_unbound call_usermodehelper_exec_work
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>  print_address_description+0x6c/0x20b mm/kasan/report.c:256
>  kasan_report_error mm/kasan/report.c:354 [inline]
>  kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
>  __asan_report_store4_noabort+0x17/0x20 mm/kasan/report.c:437
>  call_usermodehelper_exec_work+0x2d3/0x310 kernel/umh.c:195
>  process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
>  worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
>  kthread+0x345/0x410 kernel/kthread.c:240
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
> 
> The reason is that 'sub_info' cannot be accessed out of parent task
> context, since it will be freed by the child.
> Instead remember the pid in the child task.
> 
> Fixes: 449325b52b7a ("umh: introduce fork_usermode_blob() helper")
> Reported-by: syzbot+2c73319c406f1987d156@syzkaller.appspotmail.com
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] bpfilter: fix OUTPUT_FORMAT
From: David Miller @ 2018-06-07 20:57 UTC (permalink / raw)
  To: ast; +Cc: daniel, netdev, kernel-team
In-Reply-To: <20180607172929.3154049-1-ast@kernel.org>

From: Alexei Starovoitov <ast@kernel.org>
Date: Thu, 7 Jun 2018 10:29:29 -0700

> From: Alexei Starovoitov <ast@fb.com>
> 
> CONFIG_OUTPUT_FORMAT is x86 only macro.
> Used objdump to extract elf file format.
> 
> Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
> Reported-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Applied.

^ permalink raw reply

* Re: [PATCH] selftests: bpf: fix urandom_read build issue
From: Y Song @ 2018-06-07 21:17 UTC (permalink / raw)
  To: Anders Roxell
  Cc: Alexei Starovoitov, Daniel Borkmann, Shuah Khan, netdev, LKML,
	linux-kselftest
In-Reply-To: <CADYN=9J3wY1f6q4NvH1mHJXnu6GnErOqz75CfzfG1q5-WzcyCg@mail.gmail.com>

On Thu, Jun 7, 2018 at 12:07 PM, Anders Roxell <anders.roxell@linaro.org> wrote:
> On 7 June 2018 at 19:52, Y Song <ys114321@gmail.com> wrote:
>> On Thu, Jun 7, 2018 at 3:57 AM, Anders Roxell <anders.roxell@linaro.org> wrote:
>>> gcc complains that urandom_read gets built twice.
>>>
>>> gcc -o tools/testing/selftests/bpf/urandom_read
>>> -static urandom_read.c -Wl,--build-id
>>> gcc -Wall -O2 -I../../../include/uapi -I../../../lib -I../../../lib/bpf
>>> -I../../../../include/generated  -I../../../include    urandom_read.c
>>> urandom_read -lcap -lelf -lrt -lpthread -o
>>> tools/testing/selftests/bpf/urandom_read
>>> gcc: fatal error: input file
>>> ‘tools/testing/selftests/bpf/urandom_read’ is the
>>> same as output file
>>> compilation terminated.
>>> ../lib.mk:110: recipe for target
>>> 'tools/testing/selftests/bpf/urandom_read' failed
>>
>> What is the build/make command to reproduce the above failure?
>
> make -C tools/testing/selftests

Thanks. The patch will break
   make -C tools/testing/selftests/bpf

[yhs@localhost bpf-next]$ make -C tools/testing/selftests/bpf
make: Entering directory '/home/yhs/work/bpf-next/tools/testing/selftests/bpf'
gcc -o /urandom_read -static urandom_read.c -Wl,--build-id
/usr/bin/ld: cannot open output file /urandom_read: Permission denied
collect2: error: ld returned 1 exit status
make: *** [Makefile:20: /urandom_read] Error 1
make: Leaving directory '/home/yhs/work/bpf-next/tools/testing/selftests/bpf'
[yhs@localhost bpf-next]$

Could you still make the above command work?

>
> Cheers,
> Anders
>
>>
>>> To fix this issue remove the urandom_read target and so target
>>> TEST_CUSTOM_PROGS gets used.
>>>
>>> Fixes: 81f77fd0deeb ("bpf: add selftest for stackmap with BPF_F_STACK_BUILD_ID")
>>> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
>>> ---
>>>  tools/testing/selftests/bpf/Makefile | 6 ++----
>>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
>>> index 607ed8729c06..67285591ffd7 100644
>>> --- a/tools/testing/selftests/bpf/Makefile
>>> +++ b/tools/testing/selftests/bpf/Makefile
>>> @@ -16,10 +16,8 @@ LDLIBS += -lcap -lelf -lrt -lpthread
>>>  TEST_CUSTOM_PROGS = $(OUTPUT)/urandom_read
>>>  all: $(TEST_CUSTOM_PROGS)
>>>
>>> -$(TEST_CUSTOM_PROGS): urandom_read
>>> -
>>> -urandom_read: urandom_read.c
>>> -       $(CC) -o $(TEST_CUSTOM_PROGS) -static $< -Wl,--build-id
>>> +$(TEST_CUSTOM_PROGS): $(OUTPUT)/%: %.c
>>> +       $(CC) -o $@ -static $< -Wl,--build-id
>>>
>>>  # Order correspond to 'make run_tests' order
>>>  TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test_progs \
>>> --
>>> 2.17.1
>>>

^ permalink raw reply

* Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()
From: Al Viro @ 2018-06-07 21:26 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, shankarapailoor, Tetsuo Handa, Lorenzo Colitti
In-Reply-To: <20180607203949.16945-1-xiyou.wangcong@gmail.com>

On Thu, Jun 07, 2018 at 01:39:49PM -0700, Cong Wang wrote:
> fchownat() doesn't even hold refcnt of fd until it figures out
> fd is really needed (otherwise is ignored) and releases it after
> it resolves the path. This means sock_close() could race with
> sockfs_setattr(), which leads to a NULL pointer dereference
> since typically we set sock->sk to NULL in ->release().
> 
> As pointed out by Al, this is unique to sockfs. So we can fix this
> in socket layer by acquiring inode_lock in sock_close() and
> checking against NULL in sockfs_setattr().

That looks like a massive overkill - it's way heavier than it should be.
And it's very likely to trigger shitloads of deadlock warnings, some
possibly even true.

^ permalink raw reply

* Re: [PATCH] net-fq: Add WARN_ON check for null flow.
From: Cong Wang @ 2018-06-07 21:29 UTC (permalink / raw)
  To: Ben Greear; +Cc: Linux Kernel Network Developers
In-Reply-To: <1528387585-5806-1-git-send-email-greearb@candelatech.com>

On Thu, Jun 7, 2018 at 9:06 AM,  <greearb@candelatech.com> wrote:
> --- a/include/net/fq_impl.h
> +++ b/include/net/fq_impl.h
> @@ -80,6 +80,9 @@ static struct sk_buff *fq_tin_dequeue(struct fq *fq,
>
>         flow = list_first_entry(head, struct fq_flow, flowchain);
>
> +       if (WARN_ON_ONCE(!flow))
> +               return NULL;
> +

How could even possibly list_first_entry() returns NULL?
You need list_first_entry_or_null().

^ permalink raw reply

* Re: KMSAN: uninit-value in ebt_stp_mt_check (2)
From: syzbot @ 2018-06-07 21:40 UTC (permalink / raw)
  To: bridge, coreteam, davem, dvyukov, fw, kadlec, linux-kernel,
	netdev, netfilter-devel, pablo, stephen, syzkaller-bugs
In-Reply-To: <0000000000009a3b5b056e109e42@google.com>

syzbot has found a reproducer for the following crash on:

HEAD commit:    c6a6aed994b6 kmsan: remove dead code to trigger syzbot build
git tree:       https://github.com/google/kmsan.git/master
console output: https://syzkaller.appspot.com/x/log.txt?x=122606bf800000
kernel config:  https://syzkaller.appspot.com/x/.config?x=848e40757852af3e
dashboard link: https://syzkaller.appspot.com/bug?extid=da4494182233c23a5fcf
compiler:       clang version 7.0.0 (trunk 334104)
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=121f481f800000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=121fbbd7800000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+da4494182233c23a5fcf@syzkaller.appspotmail.com

random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
==================================================================
BUG: KMSAN: uninit-value in ebt_stp_mt_check+0x24b/0x450  
net/bridge/netfilter/ebt_stp.c:162
CPU: 1 PID: 4523 Comm: syz-executor710 Not tainted 4.17.0+ #3
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x185/0x1d0 lib/dump_stack.c:113
  kmsan_report+0x149/0x260 mm/kmsan/kmsan.c:1084
  __msan_warning_32+0x6e/0xc0 mm/kmsan/kmsan_instr.c:620
  ebt_stp_mt_check+0x24b/0x450 net/bridge/netfilter/ebt_stp.c:162
  xt_check_match+0x1438/0x1650 net/netfilter/x_tables.c:506
  ebt_check_match net/bridge/netfilter/ebtables.c:372 [inline]
  ebt_check_entry net/bridge/netfilter/ebtables.c:702 [inline]
  translate_table+0x4e88/0x6120 net/bridge/netfilter/ebtables.c:943
  do_replace_finish+0x1258/0x2ea0 net/bridge/netfilter/ebtables.c:999
  do_replace+0x719/0x780 net/bridge/netfilter/ebtables.c:1138
  do_ebt_set_ctl+0x2ab/0x3c0 net/bridge/netfilter/ebtables.c:1517
  nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
  nf_setsockopt+0x47c/0x4e0 net/netfilter/nf_sockopt.c:115
  ip_setsockopt+0x24b/0x2b0 net/ipv4/ip_sockglue.c:1251
  udp_setsockopt+0x108/0x1b0 net/ipv4/udp.c:2416
  sock_common_setsockopt+0x13b/0x170 net/core/sock.c:3039
  __sys_setsockopt+0x496/0x540 net/socket.c:1903
  __do_sys_setsockopt net/socket.c:1914 [inline]
  __se_sys_setsockopt net/socket.c:1911 [inline]
  __x64_sys_setsockopt+0x15c/0x1c0 net/socket.c:1911
  do_syscall_64+0x15b/0x230 arch/x86/entry/common.c:287
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x43fd89
RSP: 002b:00007ffc9d7edb28 EFLAGS: 00000213 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fd89
RDX: 0000000000000080 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000300 R09: 00000000004002c8
R10: 0000000020000480 R11: 0000000000000213 R12: 00000000004016b0
R13: 0000000000401740 R14: 0000000000000000 R15: 0000000000000000

Local variable description: ----mtpar.i@translate_table
Variable was created at:
  translate_table+0xbb/0x6120 net/bridge/netfilter/ebtables.c:831
  do_replace_finish+0x1258/0x2ea0 net/bridge/netfilter/ebtables.c:999
==================================================================

^ permalink raw reply

* Re: [PATCH] net-fq: Add WARN_ON check for null flow.
From: Ben Greear @ 2018-06-07 21:41 UTC (permalink / raw)
  To: Cong Wang; +Cc: Linux Kernel Network Developers
In-Reply-To: <CAM_iQpXHDVkQ0sS1sOnhnp8vcLkhsZvYzycwF1Do=mydMZhvzA@mail.gmail.com>

On 06/07/2018 02:29 PM, Cong Wang wrote:
> On Thu, Jun 7, 2018 at 9:06 AM,  <greearb@candelatech.com> wrote:
>> --- a/include/net/fq_impl.h
>> +++ b/include/net/fq_impl.h
>> @@ -80,6 +80,9 @@ static struct sk_buff *fq_tin_dequeue(struct fq *fq,
>>
>>         flow = list_first_entry(head, struct fq_flow, flowchain);
>>
>> +       if (WARN_ON_ONCE(!flow))
>> +               return NULL;
>> +
>
> How could even possibly list_first_entry() returns NULL?
> You need list_first_entry_or_null().
>

I don't know for certain flow as null, but something was NULL in this method
near that line and it looked like a likely culprit.

I guess possibly tin or fq was passed in as NULL?

Anyway, if the patch seems worthless just ignore it.  I'll leave it in my tree
since it should be harmless and will let you know if I ever hit it.

If someone else hits a similar crash, hopefully they can report it.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* Re: [PATCH] selftests: bpf: fix urandom_read build issue
From: Anders Roxell @ 2018-06-07 21:43 UTC (permalink / raw)
  To: Y Song
  Cc: Alexei Starovoitov, Daniel Borkmann, Shuah Khan, netdev, LKML,
	linux-kselftest
In-Reply-To: <CAH3MdRXMHRtK9FJ4UP8Gy-bahCS2oipOY7yLTEqu-x6uOMKLvw@mail.gmail.com>

On 7 June 2018 at 23:17, Y Song <ys114321@gmail.com> wrote:
> On Thu, Jun 7, 2018 at 12:07 PM, Anders Roxell <anders.roxell@linaro.org> wrote:
>> On 7 June 2018 at 19:52, Y Song <ys114321@gmail.com> wrote:
>>> On Thu, Jun 7, 2018 at 3:57 AM, Anders Roxell <anders.roxell@linaro.org> wrote:
>>>> gcc complains that urandom_read gets built twice.
>>>>
>>>> gcc -o tools/testing/selftests/bpf/urandom_read
>>>> -static urandom_read.c -Wl,--build-id
>>>> gcc -Wall -O2 -I../../../include/uapi -I../../../lib -I../../../lib/bpf
>>>> -I../../../../include/generated  -I../../../include    urandom_read.c
>>>> urandom_read -lcap -lelf -lrt -lpthread -o
>>>> tools/testing/selftests/bpf/urandom_read
>>>> gcc: fatal error: input file
>>>> ‘tools/testing/selftests/bpf/urandom_read’ is the
>>>> same as output file
>>>> compilation terminated.
>>>> ../lib.mk:110: recipe for target
>>>> 'tools/testing/selftests/bpf/urandom_read' failed
>>>
>>> What is the build/make command to reproduce the above failure?
>>
>> make -C tools/testing/selftests
>
> Thanks. The patch will break
>    make -C tools/testing/selftests/bpf
>
> [yhs@localhost bpf-next]$ make -C tools/testing/selftests/bpf
> make: Entering directory '/home/yhs/work/bpf-next/tools/testing/selftests/bpf'
> gcc -o /urandom_read -static urandom_read.c -Wl,--build-id
> /usr/bin/ld: cannot open output file /urandom_read: Permission denied
> collect2: error: ld returned 1 exit status
> make: *** [Makefile:20: /urandom_read] Error 1
> make: Leaving directory '/home/yhs/work/bpf-next/tools/testing/selftests/bpf'
> [yhs@localhost bpf-next]$

urgh, I'm sorry, missed that.

>
> Could you still make the above command work?

$(TEST_CUSTOM_PROGS): $(OUTPUT)/%: %.c
        $(CC) -o $(TEST_CUSTOM_PROGS) -static $< -Wl,--build-id

That worked both with:
make -C tools/testing/selftests
and
make -C tools/testing/selftests/bpf

for me.

what do you think?

>
>>
>> Cheers,
>> Anders
>>
>>>
>>>> To fix this issue remove the urandom_read target and so target
>>>> TEST_CUSTOM_PROGS gets used.
>>>>
>>>> Fixes: 81f77fd0deeb ("bpf: add selftest for stackmap with BPF_F_STACK_BUILD_ID")
>>>> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
>>>> ---
>>>>  tools/testing/selftests/bpf/Makefile | 6 ++----
>>>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
>>>> index 607ed8729c06..67285591ffd7 100644
>>>> --- a/tools/testing/selftests/bpf/Makefile
>>>> +++ b/tools/testing/selftests/bpf/Makefile
>>>> @@ -16,10 +16,8 @@ LDLIBS += -lcap -lelf -lrt -lpthread
>>>>  TEST_CUSTOM_PROGS = $(OUTPUT)/urandom_read
>>>>  all: $(TEST_CUSTOM_PROGS)
>>>>
>>>> -$(TEST_CUSTOM_PROGS): urandom_read
>>>> -
>>>> -urandom_read: urandom_read.c
>>>> -       $(CC) -o $(TEST_CUSTOM_PROGS) -static $< -Wl,--build-id
>>>> +$(TEST_CUSTOM_PROGS): $(OUTPUT)/%: %.c
>>>> +       $(CC) -o $@ -static $< -Wl,--build-id
>>>>
>>>>  # Order correspond to 'make run_tests' order
>>>>  TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test_progs \
>>>> --
>>>> 2.17.1
>>>>

^ permalink raw reply

* Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()
From: Cong Wang @ 2018-06-07 21:45 UTC (permalink / raw)
  To: Al Viro
  Cc: Linux Kernel Network Developers, shankarapailoor, Tetsuo Handa,
	Lorenzo Colitti
In-Reply-To: <20180607212631.GW30522@ZenIV.linux.org.uk>

On Thu, Jun 7, 2018 at 2:26 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Thu, Jun 07, 2018 at 01:39:49PM -0700, Cong Wang wrote:
>> fchownat() doesn't even hold refcnt of fd until it figures out
>> fd is really needed (otherwise is ignored) and releases it after
>> it resolves the path. This means sock_close() could race with
>> sockfs_setattr(), which leads to a NULL pointer dereference
>> since typically we set sock->sk to NULL in ->release().
>>
>> As pointed out by Al, this is unique to sockfs. So we can fix this
>> in socket layer by acquiring inode_lock in sock_close() and
>> checking against NULL in sockfs_setattr().
>
> That looks like a massive overkill - it's way heavier than it should be.

I don't see any other quick way to fix this. My initial thought is
to keep that refcnt until path_put(), apparently you don't like it
either.


> And it's very likely to trigger shitloads of deadlock warnings, some
> possibly even true.

I do audit the inode_lock usage in networking, I don't see any
deadlock, of course, there could be some non-networking code
uses socket API that I missed. But _generally_, socket doesn't
have a pointer to retrieve this inode.

^ permalink raw reply

* Re: linux-next: Please add mlx5-next tree
From: Stephen Rothwell @ 2018-06-07 21:47 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List,
	Saeed Mahameed, Jason Gunthorpe, Doug Ledford, David S. Miller,
	linux-netdev, RDMA mailing list
In-Reply-To: <20180606192500.GZ2958@mtr-leonro.mtl.com>

[-- Attachment #1: Type: text/plain, Size: 611 bytes --]

Hi Leon,

On Wed, 6 Jun 2018 22:25:00 +0300 Leon Romanovsky <leon@kernel.org> wrote:
>
> Can you please add the branch "mlx5-next" from the following tree to the
> linux-next integration tree?
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/
> 
> The mlx5-next branch is used to send shared pull requests to both netdev
> and RDMA subsystems and it is worth to have linux-next coverage on it
> before.

I try hard not to add new trees during the merge window, sorry.  If I
forget to add it after -rc1 has been released, please remind me.

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH] net-fq: Add WARN_ON check for null flow.
From: Cong Wang @ 2018-06-07 21:52 UTC (permalink / raw)
  To: Ben Greear; +Cc: Linux Kernel Network Developers
In-Reply-To: <6dd70f92-f026-0b2a-9aed-91f303ee4335@candelatech.com>

On Thu, Jun 7, 2018 at 2:41 PM, Ben Greear <greearb@candelatech.com> wrote:
> On 06/07/2018 02:29 PM, Cong Wang wrote:
>>
>> On Thu, Jun 7, 2018 at 9:06 AM,  <greearb@candelatech.com> wrote:
>>>
>>> --- a/include/net/fq_impl.h
>>> +++ b/include/net/fq_impl.h
>>> @@ -80,6 +80,9 @@ static struct sk_buff *fq_tin_dequeue(struct fq *fq,
>>>
>>>         flow = list_first_entry(head, struct fq_flow, flowchain);
>>>
>>> +       if (WARN_ON_ONCE(!flow))
>>> +               return NULL;
>>> +
>>
>>
>> How could even possibly list_first_entry() returns NULL?
>> You need list_first_entry_or_null().
>>
>
> I don't know for certain flow as null, but something was NULL in this method
> near that line and it looked like a likely culprit.
>
> I guess possibly tin or fq was passed in as NULL?

A NULL pointer is not always 0. You can trigger a NULL-ptr-def with 0x3c
too, but you are checking against 0 in your patch, that is the problem and
that is why list_first_entry_or_null() exists.

^ permalink raw reply

* [PATCH net] net: aquantia: fix unsigned numvecs comparison with less than zero
From: Igor Russkikh @ 2018-06-07 21:54 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, David Arcari, Pavel Belous, Colin Ian King, Igor Russkikh

From: Colin Ian King <colin.king@canonical.com>

From: Colin Ian King <colin.king@canonical.com>

This was originally mistakenly submitted to net-next. Resubmitting to net.

The comparison of numvecs < 0 is always false because numvecs is a u32
and hence the error return from a failed call to pci_alloc_irq_vectores
is never detected.  Fix this by using the signed int ret to handle the
error return and assign numvecs to err.

Detected by CoverityScan, CID#1468650 ("Unsigned compared against 0")

Fixes: a09bd81b5413 ("net: aquantia: Limit number of vectors to actually allocated irqs")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
---
 drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c b/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
index a50e08bb4748..750007513f9d 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
@@ -267,14 +267,13 @@ static int aq_pci_probe(struct pci_dev *pdev,
 	numvecs = min(numvecs, num_online_cpus());
 	/*enable interrupts */
 #if !AQ_CFG_FORCE_LEGACY_INT
-	numvecs = pci_alloc_irq_vectors(self->pdev, 1, numvecs,
-					PCI_IRQ_MSIX | PCI_IRQ_MSI |
-					PCI_IRQ_LEGACY);
+	err = pci_alloc_irq_vectors(self->pdev, 1, numvecs,
+				    PCI_IRQ_MSIX | PCI_IRQ_MSI |
+				    PCI_IRQ_LEGACY);
 
-	if (numvecs < 0) {
-		err = numvecs;
+	if (err < 0)
 		goto err_hwinit;
-	}
+	numvecs = err;
 #endif
 	self->irqvecs = numvecs;
 
-- 
2.17.1

^ permalink raw reply related

* Re: general protection fault in __vfs_write
From: Alexei Starovoitov @ 2018-06-07 21:58 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: syzbot, linux-fsdevel, linux-kernel, syzkaller-bugs, netdev,
	Alexei Starovoitov, Daniel Borkmann
In-Reply-To: <20180607152846.GB27079@bombadil.infradead.org>

On Thu, Jun 07, 2018 at 08:28:46AM -0700, Matthew Wilcox wrote:
> 
> I would suggest this is a BPF problem, not a VFS problem.
> 
> On Thu, Jun 07, 2018 at 08:19:01AM -0700, syzbot wrote:
> > Hello,
> > 
> > syzbot found the following crash on:
> > 
> > HEAD commit:    7170e6045a6a strparser: Add __strp_unpause and use it in k..
> > git tree:       net-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=14bde74f800000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=a601a80fec461d44
> > dashboard link: https://syzkaller.appspot.com/bug?extid=7ade6c94abb2774c0fee
> > compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> > 
> > Unfortunately, I don't have any reproducer for this crash yet.
> > 
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+7ade6c94abb2774c0fee@syzkaller.appspotmail.com
> > 
> > bpfilter: read fail -512
> > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > general protection fault: 0000 [#1] SMP KASAN
> > Dumping ftrace buffer:
> >    (ftrace buffer empty)
> > Modules linked in:
> > CPU: 1 PID: 4590 Comm: syz-executor7 Not tainted 4.17.0-rc7+ #82
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > RIP: 0010:file_write_hint include/linux/fs.h:1925 [inline]
> > RIP: 0010:init_sync_kiocb include/linux/fs.h:1935 [inline]
> > RIP: 0010:new_sync_write fs/read_write.c:470 [inline]
> > RIP: 0010:__vfs_write+0x4a6/0x960 fs/read_write.c:487
> > RSP: 0018:ffff88019b5c7850 EFLAGS: 00010202
> > RAX: dffffc0000000000 RBX: ffff8801d2117c80 RCX: ffffffff81bfc6bb
> > RDX: 0000000000000019 RSI: ffffffff81bfc6ca RDI: 00000000000000c8
> > RBP: ffff88019b5c79c8 R08: ffff88019b5ba540 R09: fffffbfff12cae69
> > R10: ffff88019b5c7a10 R11: ffffffff8965734b R12: 0000000000000000
> > R13: ffff88019b5c79a0 R14: 0000000000000000 R15: ffff88019b5c7a88
> > FS:  0000000002a11940(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000000700138 CR3: 000000019b410000 CR4: 00000000001406e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> >  __kernel_write+0x10c/0x380 fs/read_write.c:506
> >  __bpfilter_process_sockopt+0x1d8/0x35b net/bpfilter/bpfilter_kern.c:66

I think I see the problem. Will send the fix soon.

^ permalink raw reply

* Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()
From: Al Viro @ 2018-06-07 22:04 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, shankarapailoor, Tetsuo Handa,
	Lorenzo Colitti
In-Reply-To: <CAM_iQpUpBYC0MpjOsYLn5ZnsjS3pSHbZo2AE=trfrsJNxQYyxA@mail.gmail.com>

On Thu, Jun 07, 2018 at 02:45:58PM -0700, Cong Wang wrote:
> On Thu, Jun 7, 2018 at 2:26 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > On Thu, Jun 07, 2018 at 01:39:49PM -0700, Cong Wang wrote:
> >> fchownat() doesn't even hold refcnt of fd until it figures out
> >> fd is really needed (otherwise is ignored) and releases it after
> >> it resolves the path. This means sock_close() could race with
> >> sockfs_setattr(), which leads to a NULL pointer dereference
> >> since typically we set sock->sk to NULL in ->release().
> >>
> >> As pointed out by Al, this is unique to sockfs. So we can fix this
> >> in socket layer by acquiring inode_lock in sock_close() and
> >> checking against NULL in sockfs_setattr().
> >
> > That looks like a massive overkill - it's way heavier than it should be.
> 
> I don't see any other quick way to fix this. My initial thought is
> to keep that refcnt until path_put(), apparently you don't like it
> either.

You do realize that the same ->setattr() can be called by way of
chown() on /proc/self/fd/<n>, right?  What would you do there -
bump refcount on that struct file when traversing that symlink and
hold it past the end of pathname resolution, until... what exactly?

^ permalink raw reply

* [PATCH bpf] bpf, xdp: fix crash in xdp_umem_unaccount_pages
From: Daniel Borkmann @ 2018-06-07 22:06 UTC (permalink / raw)
  To: ast; +Cc: netdev, bjorn.topel, Daniel Borkmann

syzkaller was able to trigger the following panic for AF_XDP:

  BUG: KASAN: null-ptr-deref in atomic64_sub include/asm-generic/atomic-instrumented.h:144 [inline]
  BUG: KASAN: null-ptr-deref in atomic_long_sub include/asm-generic/atomic-long.h:199 [inline]
  BUG: KASAN: null-ptr-deref in xdp_umem_unaccount_pages.isra.4+0x3d/0x80 net/xdp/xdp_umem.c:135
  Write of size 8 at addr 0000000000000060 by task syz-executor246/4527

  CPU: 1 PID: 4527 Comm: syz-executor246 Not tainted 4.17.0+ #89
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Call Trace:
   __dump_stack lib/dump_stack.c:77 [inline]
   dump_stack+0x1b9/0x294 lib/dump_stack.c:113
   kasan_report_error mm/kasan/report.c:352 [inline]
   kasan_report.cold.7+0x6d/0x2fe mm/kasan/report.c:412
   check_memory_region_inline mm/kasan/kasan.c:260 [inline]
   check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
   kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278
   atomic64_sub include/asm-generic/atomic-instrumented.h:144 [inline]
   atomic_long_sub include/asm-generic/atomic-long.h:199 [inline]
   xdp_umem_unaccount_pages.isra.4+0x3d/0x80 net/xdp/xdp_umem.c:135
   xdp_umem_reg net/xdp/xdp_umem.c:334 [inline]
   xdp_umem_create+0xd6c/0x10f0 net/xdp/xdp_umem.c:349
   xsk_setsockopt+0x443/0x550 net/xdp/xsk.c:531
   __sys_setsockopt+0x1bd/0x390 net/socket.c:1935
   __do_sys_setsockopt net/socket.c:1946 [inline]
   __se_sys_setsockopt net/socket.c:1943 [inline]
   __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1943
   do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

In xdp_umem_reg() the call to xdp_umem_account_pages() passed
with CAP_IPC_LOCK where we didn't need to end up charging rlimit
on memlock for the current user and therefore umem->user continues
to be NULL. Later on through fault injection syzkaller triggered
a failure in either umem->pgs or umem->pages allocation such that
we bail out and undo accounting in xdp_umem_unaccount_pages()
where we eventually hit the panic since it tries to deref the
umem->user.

The code is pretty close to mm_account_pinned_pages() and
mm_unaccount_pinned_pages() pair and potentially could reuse
it even in a later cleanup, and it appears that the initial
commit c0c77d8fb787 ("xsk: add user memory registration support
sockopt") got this right while later follow-up introduced the
bug via a49049ea2576 ("xsk: simplified umem setup").

Fixes: a49049ea2576 ("xsk: simplified umem setup")
Reported-by: syzbot+979217770b09ebf5c407@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 net/xdp/xdp_umem.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c
index 7eb4948..b9ef487 100644
--- a/net/xdp/xdp_umem.c
+++ b/net/xdp/xdp_umem.c
@@ -132,8 +132,10 @@ static void xdp_umem_unpin_pages(struct xdp_umem *umem)
 
 static void xdp_umem_unaccount_pages(struct xdp_umem *umem)
 {
-	atomic_long_sub(umem->npgs, &umem->user->locked_vm);
-	free_uid(umem->user);
+	if (umem->user) {
+		atomic_long_sub(umem->npgs, &umem->user->locked_vm);
+		free_uid(umem->user);
+	}
 }
 
 static void xdp_umem_release(struct xdp_umem *umem)
-- 
2.9.5

^ permalink raw reply related

* Re: [PATCH] net-fq: Add WARN_ON check for null flow.
From: Ben Greear @ 2018-06-07 22:08 UTC (permalink / raw)
  To: Cong Wang; +Cc: Linux Kernel Network Developers
In-Reply-To: <CAM_iQpUdBst8xyzceq2OdVB2onF1qMR063zrkBVDBonZZNuQdA@mail.gmail.com>

On 06/07/2018 02:52 PM, Cong Wang wrote:
> On Thu, Jun 7, 2018 at 2:41 PM, Ben Greear <greearb@candelatech.com> wrote:
>> On 06/07/2018 02:29 PM, Cong Wang wrote:
>>>
>>> On Thu, Jun 7, 2018 at 9:06 AM,  <greearb@candelatech.com> wrote:
>>>>
>>>> --- a/include/net/fq_impl.h
>>>> +++ b/include/net/fq_impl.h
>>>> @@ -80,6 +80,9 @@ static struct sk_buff *fq_tin_dequeue(struct fq *fq,
>>>>
>>>>         flow = list_first_entry(head, struct fq_flow, flowchain);
>>>>
>>>> +       if (WARN_ON_ONCE(!flow))
>>>> +               return NULL;
>>>> +
>>>
>>>
>>> How could even possibly list_first_entry() returns NULL?
>>> You need list_first_entry_or_null().
>>>
>>
>> I don't know for certain flow as null, but something was NULL in this method
>> near that line and it looked like a likely culprit.
>>
>> I guess possibly tin or fq was passed in as NULL?
>
> A NULL pointer is not always 0. You can trigger a NULL-ptr-def with 0x3c
> too, but you are checking against 0 in your patch, that is the problem and
> that is why list_first_entry_or_null() exists.
>

Ahh, I see what you mean, and that is my mistake.  In my case, it did seem to
be a mostly-null deref, not a 0x0 deref.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* Hello Dear
From: Charles Koch @ 2018-06-07 22:10 UTC (permalink / raw)
  To: Recipients

Hello Fellow

Charles Koch here, ceo charles koch foundation worldwide (the largest charity foundation in the world).
This year under our motto of "Give while living", i and the foundation have decided to award $500,000.00
to a few selected lucky individuals and on receipt of this email kindly get back to me.

Regards,
Charles Koch.
chkoch2329@dbzmail.com

^ permalink raw reply

* Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()
From: Cong Wang @ 2018-06-07 22:15 UTC (permalink / raw)
  To: Al Viro
  Cc: Linux Kernel Network Developers, shankarapailoor, Tetsuo Handa,
	Lorenzo Colitti
In-Reply-To: <20180607220446.GX30522@ZenIV.linux.org.uk>

On Thu, Jun 7, 2018 at 3:04 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Thu, Jun 07, 2018 at 02:45:58PM -0700, Cong Wang wrote:
>> On Thu, Jun 7, 2018 at 2:26 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> > On Thu, Jun 07, 2018 at 01:39:49PM -0700, Cong Wang wrote:
>> >> fchownat() doesn't even hold refcnt of fd until it figures out
>> >> fd is really needed (otherwise is ignored) and releases it after
>> >> it resolves the path. This means sock_close() could race with
>> >> sockfs_setattr(), which leads to a NULL pointer dereference
>> >> since typically we set sock->sk to NULL in ->release().
>> >>
>> >> As pointed out by Al, this is unique to sockfs. So we can fix this
>> >> in socket layer by acquiring inode_lock in sock_close() and
>> >> checking against NULL in sockfs_setattr().
>> >
>> > That looks like a massive overkill - it's way heavier than it should be.
>>
>> I don't see any other quick way to fix this. My initial thought is
>> to keep that refcnt until path_put(), apparently you don't like it
>> either.
>
> You do realize that the same ->setattr() can be called by way of
> chown() on /proc/self/fd/<n>, right?  What would you do there -
> bump refcount on that struct file when traversing that symlink and
> hold it past the end of pathname resolution, until... what exactly?

I was thinking about this:

        error = user_path_at(dfd,....); // hold dfd when needed

        if (!error) {
                error = chown_common(&path, mode);
                path_put(&path);      // release dfd if held

With this, we can guarantee ->release() is only possibly called
after chown_common() which is after ->setattr() too.

^ permalink raw reply

* Re: [PATCH bpf] tools/bpf: fix selftest get_cgroup_id_user
From: Daniel Borkmann @ 2018-06-07 22:15 UTC (permalink / raw)
  To: Yonghong Song, ast, netdev; +Cc: kernel-team
In-Reply-To: <20180606161244.2649812-1-yhs@fb.com>

On 06/06/2018 06:12 PM, Yonghong Song wrote:
> Commit f269099a7e7a ("tools/bpf: add a selftest for
> bpf_get_current_cgroup_id() helper") added a test
> for bpf_get_current_cgroup_id() helper. The bpf program
> is attached to tracepoint syscalls/sys_enter_nanosleep
> and will record the cgroup id if the tracepoint is hit.
> The test program creates a cgroup and attachs itself to
> this cgroup and expects that the test program process
> cgroup id is the same as the cgroup_id retrieved
> by the bpf program.
> 
> In a light system where no other processes called
> nanosleep syscall, the test case can pass.
> In a busy system where many different processes can hit
> syscalls/sys_enter_nanosleep tracepoint, the cgroup id
> recorded by bpf program may not match the test program
> process cgroup_id.
> 
> This patch fixed an issue by communicating the test program
> pid to bpf program. The bpf program only records
> cgroup id if the current task pid is the same as
> passed-in pid. This ensures that the recorded cgroup_id
> is for the cgroup within which the test program resides.
> 
> Fixes: f269099a7e7a ("tools/bpf: add a selftest for bpf_get_current_cgroup_id() helper")
> Signed-off-by: Yonghong Song <yhs@fb.com>

Applied to bpf, thanks Yonghong!

^ permalink raw reply

* Re: [PATCH] xsk: Fix umem fill/completion queue mmap on 32-bit
From: Daniel Borkmann @ 2018-06-07 22:21 UTC (permalink / raw)
  To: Björn Töpel, geert
  Cc: David Miller, Björn Töpel, Karlsson, Magnus, ast,
	Arnd Bergmann, akpm, Netdev, linux-mm, LKML
In-Reply-To: <CAJ+HfNiciU0+4zd3ppapH12Gg_SFf9oUWTy+yafJSxCX8Mv-Dg@mail.gmail.com>

On 06/07/2018 06:34 PM, Björn Töpel wrote:
> Den tors 7 juni 2018 kl 15:37 skrev Geert Uytterhoeven <geert@linux-m68k.org>:
>>
>> With gcc-4.1.2 on 32-bit:
>>
>>     net/xdp/xsk.c:663: warning: integer constant is too large for ‘long’ type
>>     net/xdp/xsk.c:665: warning: integer constant is too large for ‘long’ type
>>
>> Add the missing "ULL" suffixes to the large XDP_UMEM_PGOFF_*_RING values
>> to fix this.
>>
>>     net/xdp/xsk.c:663: warning: comparison is always false due to limited range of data type
>>     net/xdp/xsk.c:665: warning: comparison is always false due to limited range of data type
>>
>> "unsigned long" is 32-bit on 32-bit systems, hence the offset is
>> truncated, and can never be equal to any of the XDP_UMEM_PGOFF_*_RING
>> values.  Use loff_t (and the required cast) to fix this.
>>
>> Fixes: 423f38329d267969 ("xsk: add umem fill queue support and mmap")
>> Fixes: fe2308328cd2f26e ("xsk: add umem completion queue support and mmap")
>> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
[...]
> 
> Thanks Geert!
> 
> Acked-by: Björn Töpel <bjorn.topel@intel.com>

Applied to bpf, thanks everyone!

^ permalink raw reply

* [PATCH net] bpfilter: fix race in pipe access
From: Alexei Starovoitov @ 2018-06-07 22:31 UTC (permalink / raw)
  To: David S . Miller; +Cc: daniel, netdev, dvyukov, willy, kernel-team

syzbot reported the following crash
[  338.293946] bpfilter: read fail -512
[  338.304515] kasan: GPF could be caused by NULL-ptr deref or user memory access
[  338.311863] general protection fault: 0000 [#1] SMP KASAN
[  338.344360] RIP: 0010:__vfs_write+0x4a6/0x960
[  338.426363] Call Trace:
[  338.456967]  __kernel_write+0x10c/0x380
[  338.460928]  __bpfilter_process_sockopt+0x1d8/0x35b
[  338.487103]  bpfilter_mbox_request+0x4d/0xb0
[  338.491492]  bpfilter_ip_get_sockopt+0x6b/0x90

This can happen when multiple cpus trying to talk to user mode process
via bpfilter_mbox_request(). One cpu grabs the mutex while another goes to
sleep on the same mutex. Then former cpu sees that umh pipe is down and
shuts down the pipes. Later cpu finally acquires the mutex and crashes
on freed pipe.
Fix the race by using info.pid as an indicator that umh and pipes are healthy
and check it after acquiring the mutex.

Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
Reported-by: syzbot+7ade6c94abb2774c0fee@syzkaller.appspotmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 net/bpfilter/bpfilter_kern.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/net/bpfilter/bpfilter_kern.c b/net/bpfilter/bpfilter_kern.c
index b13d058f8c34..09522573f611 100644
--- a/net/bpfilter/bpfilter_kern.c
+++ b/net/bpfilter/bpfilter_kern.c
@@ -24,17 +24,19 @@ static void shutdown_umh(struct umh_info *info)
 {
 	struct task_struct *tsk;
 
+	if (!info->pid)
+		return;
 	tsk = pid_task(find_vpid(info->pid), PIDTYPE_PID);
 	if (tsk)
 		force_sig(SIGKILL, tsk);
 	fput(info->pipe_to_umh);
 	fput(info->pipe_from_umh);
+	info->pid = 0;
 }
 
 static void __stop_umh(void)
 {
-	if (IS_ENABLED(CONFIG_INET) &&
-	    bpfilter_process_sockopt) {
+	if (IS_ENABLED(CONFIG_INET)) {
 		bpfilter_process_sockopt = NULL;
 		shutdown_umh(&info);
 	}
@@ -55,7 +57,7 @@ static int __bpfilter_process_sockopt(struct sock *sk, int optname,
 	struct mbox_reply reply;
 	loff_t pos;
 	ssize_t n;
-	int ret;
+	int ret = -EFAULT;
 
 	req.is_set = is_set;
 	req.pid = current->pid;
@@ -63,6 +65,8 @@ static int __bpfilter_process_sockopt(struct sock *sk, int optname,
 	req.addr = (long)optval;
 	req.len = optlen;
 	mutex_lock(&bpfilter_lock);
+	if (!info.pid)
+		goto out;
 	n = __kernel_write(info.pipe_to_umh, &req, sizeof(req), &pos);
 	if (n != sizeof(req)) {
 		pr_err("write fail %zd\n", n);
-- 
2.9.5

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox