Netdev List
 help / color / mirror / Atom feed
* Re: INFO: task hung in ip6gre_exit_batch_net
From: Dmitry Vyukov @ 2018-06-07 19:03 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: syzbot, Christian Brauner, David Miller, David Ahern,
	Florian Westphal, Jiri Benc, LKML, Xin Long, mschiffer, netdev,
	syzkaller-bugs, Vladislav Yasevich
In-Reply-To: <04b6ee08-7919-bf2d-5b77-bd346a0bff48@virtuozzo.com>

On Thu, Jun 7, 2018 at 8:54 PM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>>>>> Hi, Dmirty!
>>>>>
>>>>> On 04.06.2018 18:22, Dmitry Vyukov wrote:
>>>>>> On Mon, Jun 4, 2018 at 5:03 PM, syzbot
>>>>>> <syzbot+bf78a74f82c1cf19069e@syzkaller.appspotmail.com> wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> syzbot found the following crash on:
>>>>>>>
>>>>>>> HEAD commit:    bc2dbc5420e8 Merge branch 'akpm' (patches from Andrew)
>>>>>>> git tree:       upstream
>>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=164e42b7800000
>>>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=982e2df1b9e60b02
>>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bf78a74f82c1cf19069e
>>>>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>>>>>
>>>>>>> Unfortunately, I don't have any reproducer for this crash yet.
>>>>>>>
>>>>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>>>>>> Reported-by: syzbot+bf78a74f82c1cf19069e@syzkaller.appspotmail.com
>>>>>>
>>>>>> Another hang on rtnl lock:
>>>>>>
>>>>>> #syz dup: INFO: task hung in netdev_run_todo
>>>>>>
>>>>>> May be related to "unregister_netdevice: waiting for DEV to become free":
>>>>>> https://syzkaller.appspot.com/bug?id=1a97a5bd119fd97995f752819fd87840ab9479a9
>>>>
>>>> netdev_wait_allrefs does not hold rtnl lock during waiting, so it must
>>>> be something different.
>>>>
>>>>
>>>>>> Any other explanations for massive hangs on rtnl lock for minutes?
>>>>>
>>>>> To exclude the situation, when a task exists with rtnl_mutex held:
>>>>>
>>>>> would the pr_warn() from print_held_locks_bug() be included in the console output
>>>>> if they appear?
>>>>
>>>> Yes, everything containing "WARNING:" is detected as bug.
>>>
>>> OK, then dead task not releasing the lock is excluded.
>>>
>>> One more assumption: someone corrupted memory around rtnl_mutex and it looks like locked.
>>> (I track lockdep "(rtnl_mutex){+.+.}" prints in initial message as "nobody owns rtnl_mutex").
>>> There may help a crash dump of the VM.
>>
>> I can't find any legend for these +'s and .'s, but {+.+.} is present
>> in large amounts in just any task hung report for different mutexes,
>> so I would not expect that it means corruption.
>>
>> Are dozens of known corruptions that syzkaller can trigger. But
>> usually they are reliably caught by KASAN. If any of them would lead
>> to silent memory corruption, we would got dozens of assorted crashes
>> throughout the kernel. We've seen that at some points, but not
>> recently. So I would assume that memory is not corrupted in all these
>> cases:
>> https://syzkaller.appspot.com/bug?id=2503c576cabb08d41812e732b390141f01a59545
>
> This BUG clarifies the {+.+.}:
>
> 4 locks held by kworker/0:145/381:
>  #0:  ((wq_completion)"hwsim_wq"){+.+.}, at: [<000000003f9487f0>] work_static include/linux/workqueue.h:198 [inline]
>  #0:  ((wq_completion)"hwsim_wq"){+.+.}, at: [<000000003f9487f0>] set_work_data kernel/workqueue.c:619 [inline]
>  #0:  ((wq_completion)"hwsim_wq"){+.+.}, at: [<000000003f9487f0>] set_work_pool_and_clear_pending kernel/workqueue.c:646 [inline]
>  #0:  ((wq_completion)"hwsim_wq"){+.+.}, at: [<000000003f9487f0>] process_one_work+0xb12/0x1bb0 kernel/workqueue.c:2084
>  #1:  ((work_completion)(&data->destroy_work)){+.+.}, at: [<00000000bbdd2115>] process_one_work+0xb89/0x1bb0 kernel/workqueue.c:2088
>  #2:  (rtnl_mutex){+.+.}, at: [<000000009c9d14f8>] rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
>  #3:  (rcu_sched_state.exp_mutex){+.+.}, at: [<000000001ba1a807>] exp_funnel_lock kernel/rcu/tree_exp.h:272 [inline]
>  #3:  (rcu_sched_state.exp_mutex){+.+.}, at: [<000000001ba1a807>] _synchronize_rcu_expedited.constprop.72+0x9fa/0xac0 kernel/rcu/tree_exp.h:596
>
> There we have rtnl_mutex locked and the {..} is like above. It's definitely locked
> since there is one more lock after it.
>
> This BUG happen because of there are many rtnl_mutex waiters while owner
> is synchronizing RCU. Rather clear for me in comparison to the topic's hung.


You mean that it's not hanged, but rather needs more than 2 minutes to
complete, right?


>> I wonder if it can be just that slow, but not actually hanged... net
>> namespace destruction is super slow, so perhaps under heavy load it
>> all stalls for minutes...

^ permalink raw reply

* Re: INFO: task hung in ip6gre_exit_batch_net
From: Kirill Tkhai @ 2018-06-07 18:54 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, Christian Brauner, David Miller, David Ahern,
	Florian Westphal, Jiri Benc, LKML, Xin Long, mschiffer, netdev,
	syzkaller-bugs, Vladislav Yasevich
In-Reply-To: <CACT4Y+YBY2yGJBHgqGhAcOguag9JUkGkeOz1acgKd=KR3q+noQ@mail.gmail.com>

On 07.06.2018 21:23, Dmitry Vyukov wrote:
> On Tue, Jun 5, 2018 at 3:55 PM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>> On 05.06.2018 12:36, Dmitry Vyukov wrote:
>>> On Tue, Jun 5, 2018 at 11:03 AM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>>>> Hi, Dmirty!
>>>>
>>>> On 04.06.2018 18:22, Dmitry Vyukov wrote:
>>>>> On Mon, Jun 4, 2018 at 5:03 PM, syzbot
>>>>> <syzbot+bf78a74f82c1cf19069e@syzkaller.appspotmail.com> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> syzbot found the following crash on:
>>>>>>
>>>>>> HEAD commit:    bc2dbc5420e8 Merge branch 'akpm' (patches from Andrew)
>>>>>> git tree:       upstream
>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=164e42b7800000
>>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=982e2df1b9e60b02
>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bf78a74f82c1cf19069e
>>>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>>>>
>>>>>> Unfortunately, I don't have any reproducer for this crash yet.
>>>>>>
>>>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>>>>> Reported-by: syzbot+bf78a74f82c1cf19069e@syzkaller.appspotmail.com
>>>>>
>>>>> Another hang on rtnl lock:
>>>>>
>>>>> #syz dup: INFO: task hung in netdev_run_todo
>>>>>
>>>>> May be related to "unregister_netdevice: waiting for DEV to become free":
>>>>> https://syzkaller.appspot.com/bug?id=1a97a5bd119fd97995f752819fd87840ab9479a9
>>>
>>> netdev_wait_allrefs does not hold rtnl lock during waiting, so it must
>>> be something different.
>>>
>>>
>>>>> Any other explanations for massive hangs on rtnl lock for minutes?
>>>>
>>>> To exclude the situation, when a task exists with rtnl_mutex held:
>>>>
>>>> would the pr_warn() from print_held_locks_bug() be included in the console output
>>>> if they appear?
>>>
>>> Yes, everything containing "WARNING:" is detected as bug.
>>
>> OK, then dead task not releasing the lock is excluded.
>>
>> One more assumption: someone corrupted memory around rtnl_mutex and it looks like locked.
>> (I track lockdep "(rtnl_mutex){+.+.}" prints in initial message as "nobody owns rtnl_mutex").
>> There may help a crash dump of the VM.
> 
> I can't find any legend for these +'s and .'s, but {+.+.} is present
> in large amounts in just any task hung report for different mutexes,
> so I would not expect that it means corruption.
> 
> Are dozens of known corruptions that syzkaller can trigger. But
> usually they are reliably caught by KASAN. If any of them would lead
> to silent memory corruption, we would got dozens of assorted crashes
> throughout the kernel. We've seen that at some points, but not
> recently. So I would assume that memory is not corrupted in all these
> cases:
> https://syzkaller.appspot.com/bug?id=2503c576cabb08d41812e732b390141f01a59545

This BUG clarifies the {+.+.}:

4 locks held by kworker/0:145/381:
 #0:  ((wq_completion)"hwsim_wq"){+.+.}, at: [<000000003f9487f0>] work_static include/linux/workqueue.h:198 [inline]
 #0:  ((wq_completion)"hwsim_wq"){+.+.}, at: [<000000003f9487f0>] set_work_data kernel/workqueue.c:619 [inline]
 #0:  ((wq_completion)"hwsim_wq"){+.+.}, at: [<000000003f9487f0>] set_work_pool_and_clear_pending kernel/workqueue.c:646 [inline]
 #0:  ((wq_completion)"hwsim_wq"){+.+.}, at: [<000000003f9487f0>] process_one_work+0xb12/0x1bb0 kernel/workqueue.c:2084
 #1:  ((work_completion)(&data->destroy_work)){+.+.}, at: [<00000000bbdd2115>] process_one_work+0xb89/0x1bb0 kernel/workqueue.c:2088
 #2:  (rtnl_mutex){+.+.}, at: [<000000009c9d14f8>] rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
 #3:  (rcu_sched_state.exp_mutex){+.+.}, at: [<000000001ba1a807>] exp_funnel_lock kernel/rcu/tree_exp.h:272 [inline]
 #3:  (rcu_sched_state.exp_mutex){+.+.}, at: [<000000001ba1a807>] _synchronize_rcu_expedited.constprop.72+0x9fa/0xac0 kernel/rcu/tree_exp.h:596

There we have rtnl_mutex locked and the {..} is like above. It's definitely locked
since there is one more lock after it.

This BUG happen because of there are many rtnl_mutex waiters while owner
is synchronizing RCU. Rather clear for me in comparison to the topic's hung.
 
> I wonder if it can be just that slow, but not actually hanged... net
> namespace destruction is super slow, so perhaps under heavy load it
> all stalls for minutes...

Thanks,
Kirill

^ permalink raw reply

* Re: INFO: task hung in ip6gre_exit_batch_net
From: Dmitry Vyukov @ 2018-06-07 18:23 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: syzbot, Christian Brauner, David Miller, David Ahern,
	Florian Westphal, Jiri Benc, LKML, Xin Long, mschiffer, netdev,
	syzkaller-bugs, Vladislav Yasevich
In-Reply-To: <e71c0df1-d83c-030c-7c97-13a923aca1b3@virtuozzo.com>

On Tue, Jun 5, 2018 at 3:55 PM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> On 05.06.2018 12:36, Dmitry Vyukov wrote:
>> On Tue, Jun 5, 2018 at 11:03 AM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>>> Hi, Dmirty!
>>>
>>> On 04.06.2018 18:22, Dmitry Vyukov wrote:
>>>> On Mon, Jun 4, 2018 at 5:03 PM, syzbot
>>>> <syzbot+bf78a74f82c1cf19069e@syzkaller.appspotmail.com> wrote:
>>>>> Hello,
>>>>>
>>>>> syzbot found the following crash on:
>>>>>
>>>>> HEAD commit:    bc2dbc5420e8 Merge branch 'akpm' (patches from Andrew)
>>>>> git tree:       upstream
>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=164e42b7800000
>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=982e2df1b9e60b02
>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bf78a74f82c1cf19069e
>>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>>>
>>>>> Unfortunately, I don't have any reproducer for this crash yet.
>>>>>
>>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>>>> Reported-by: syzbot+bf78a74f82c1cf19069e@syzkaller.appspotmail.com
>>>>
>>>> Another hang on rtnl lock:
>>>>
>>>> #syz dup: INFO: task hung in netdev_run_todo
>>>>
>>>> May be related to "unregister_netdevice: waiting for DEV to become free":
>>>> https://syzkaller.appspot.com/bug?id=1a97a5bd119fd97995f752819fd87840ab9479a9
>>
>> netdev_wait_allrefs does not hold rtnl lock during waiting, so it must
>> be something different.
>>
>>
>>>> Any other explanations for massive hangs on rtnl lock for minutes?
>>>
>>> To exclude the situation, when a task exists with rtnl_mutex held:
>>>
>>> would the pr_warn() from print_held_locks_bug() be included in the console output
>>> if they appear?
>>
>> Yes, everything containing "WARNING:" is detected as bug.
>
> OK, then dead task not releasing the lock is excluded.
>
> One more assumption: someone corrupted memory around rtnl_mutex and it looks like locked.
> (I track lockdep "(rtnl_mutex){+.+.}" prints in initial message as "nobody owns rtnl_mutex").
> There may help a crash dump of the VM.

I can't find any legend for these +'s and .'s, but {+.+.} is present
in large amounts in just any task hung report for different mutexes,
so I would not expect that it means corruption.

Are dozens of known corruptions that syzkaller can trigger. But
usually they are reliably caught by KASAN. If any of them would lead
to silent memory corruption, we would got dozens of assorted crashes
throughout the kernel. We've seen that at some points, but not
recently. So I would assume that memory is not corrupted in all these
cases:
https://syzkaller.appspot.com/bug?id=2503c576cabb08d41812e732b390141f01a59545

I wonder if it can be just that slow, but not actually hanged... net
namespace destruction is super slow, so perhaps under heavy load it
all stalls for minutes...


> Also, there may be a locking code BUG, but this seems the least probable for me.
>
> Kirill

^ permalink raw reply

* KASAN: null-ptr-deref Write in xdp_umem_unaccount_pages
From: syzbot @ 2018-06-07 18:17 UTC (permalink / raw)
  To: bjorn.topel, davem, linux-kernel, magnus.karlsson, netdev,
	syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    1c8c5a9d38f6 Merge git://git.kernel.org/pub/scm/linux/kern..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13a72bdf800000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4f1acdf888c9d4e9
dashboard link: https://syzkaller.appspot.com/bug?extid=979217770b09ebf5c407
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=12aca2af800000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=161d4ddf800000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+979217770b09ebf5c407@syzkaller.appspotmail.com

RDX: 0000000000000004 RSI: 000000000000011b RDI: 0000000000000004
RBP: 00000000006cb018 R08: 0000000000000018 R09: 00007fffc4750032
R10: 0000000020000040 R11: 0000000000000246 R12: 0000000000000005
R13: ffffffffffffffff R14: 0000000000000000 R15: 0000000000000000
==================================================================
BUG: KASAN: null-ptr-deref in atomic64_sub  
include/asm-generic/atomic-instrumented.h:144 [inline]
BUG: KASAN: null-ptr-deref in atomic_long_sub  
include/asm-generic/atomic-long.h:199 [inline]
BUG: KASAN: null-ptr-deref in xdp_umem_unaccount_pages.isra.4+0x3d/0x80  
net/xdp/xdp_umem.c:135
Write of size 8 at addr 0000000000000060 by task syz-executor246/4527

CPU: 1 PID: 4527 Comm: syz-executor246 Not tainted 4.17.0+ #89
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
  kasan_report_error mm/kasan/report.c:352 [inline]
  kasan_report.cold.7+0x6d/0x2fe mm/kasan/report.c:412
  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
  check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
  kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278
  atomic64_sub include/asm-generic/atomic-instrumented.h:144 [inline]
  atomic_long_sub include/asm-generic/atomic-long.h:199 [inline]
  xdp_umem_unaccount_pages.isra.4+0x3d/0x80 net/xdp/xdp_umem.c:135
  xdp_umem_reg net/xdp/xdp_umem.c:334 [inline]
  xdp_umem_create+0xd6c/0x10f0 net/xdp/xdp_umem.c:349
  xsk_setsockopt+0x443/0x550 net/xdp/xsk.c:531
  __sys_setsockopt+0x1bd/0x390 net/socket.c:1935
  __do_sys_setsockopt net/socket.c:1946 [inline]
  __se_sys_setsockopt net/socket.c:1943 [inline]
  __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1943
  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440549
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 5b 14 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fffc475d008 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000440549
RDX: 0000000000000004 RSI: 000000000000011b RDI: 0000000000000004
RBP: 00000000006cb018 R08: 0000000000000018 R09: 00007fffc4750032
R10: 0000000020000040 R11: 0000000000000246 R12: 0000000000000005
R13: ffffffffffffffff R14: 0000000000000000 R15: 0000000000000000
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply

* Re: [PATCH] bpf: attach type BPF_LIRC_MODE2 should not depend on CONFIG_BPF_CGROUP
From: Y Song @ 2018-06-07 18:14 UTC (permalink / raw)
  To: Sean Young
  Cc: Daniel Borkmann, Matthias Reichl, linux-media, LKML,
	Alexei Starovoitov, Mauro Carvalho Chehab, netdev,
	Devin Heitmueller, Quentin Monnet
In-Reply-To: <20180606210939.q3vviyc4b2h6gu3c@gofer.mess.org>

On Wed, Jun 6, 2018 at 2:09 PM, Sean Young <sean@mess.org> wrote:
> Compile bpf_prog_{attach,detach,query} even if CONFIG_BPF_CGROUP is not
> set.

It should be CONFIG_CGROUP_BPF here. The same for subject line.
Today, if CONFIG_CGROUP_BPF is not defined. Users will get an -EINVAL
if they try to attach/detach/query.

I am not sure what is the motivation behind this change. Could you explain more?

>
> Signed-off-by: Sean Young <sean@mess.org>
> ---
>  include/linux/bpf-cgroup.h |  31 +++++++++++
>  kernel/bpf/cgroup.c        | 110 +++++++++++++++++++++++++++++++++++++
>  kernel/bpf/syscall.c       | 105 ++---------------------------------
>  3 files changed, 145 insertions(+), 101 deletions(-)
>
> diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
> index 975fb4cf1bb7..ee67cd35f426 100644
> --- a/include/linux/bpf-cgroup.h
> +++ b/include/linux/bpf-cgroup.h
> @@ -188,12 +188,43 @@ int __cgroup_bpf_check_dev_permission(short dev_type, u32 major, u32 minor,
>                                                                               \
>         __ret;                                                                \
>  })
> +int sockmap_get_from_fd(const union bpf_attr *attr, int type, bool attach);
> +int cgroup_bpf_prog_attach(const union bpf_attr *attr,
> +                          enum bpf_prog_type ptype);
> +int cgroup_bpf_prog_detach(const union bpf_attr *attr,
> +                          enum bpf_prog_type ptype);
> +int cgroup_bpf_prog_query(const union bpf_attr *attr,
> +                         union bpf_attr __user *uattr);
>  #else
>
>  struct cgroup_bpf {};
>  static inline void cgroup_bpf_put(struct cgroup *cgrp) {}
>  static inline int cgroup_bpf_inherit(struct cgroup *cgrp) { return 0; }
>
> +static inline int sockmap_get_from_fd(const union bpf_attr *attr,
> +                                     int type, bool attach)
> +{
> +       return -EINVAL;
> +}
> +
> +static inline int cgroup_bpf_prog_attach(const union bpf_attr *attr,
> +                                        enum bpf_prog_type ptype)
> +{
> +       return -EINVAL;
> +}
> +
> +static inline int cgroup_bpf_prog_detach(const union bpf_attr *attr,
> +                                        enum bpf_prog_type ptype)
> +{
> +       return -EINVAL;
> +}
> +
> +static inline int cgroup_bpf_prog_query(const union bpf_attr *attr,
> +                                       union bpf_attr __user *uattr)
> +{
> +       return -EINVAL;
> +}
> +
>  #define cgroup_bpf_enabled (0)
>  #define BPF_CGROUP_PRE_CONNECT_ENABLED(sk) (0)
>  #define BPF_CGROUP_RUN_PROG_INET_INGRESS(sk,skb) ({ 0; })
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index f7c00bd6f8e4..d6e18f9dc0c4 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -428,6 +428,116 @@ int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
>         return ret;
>  }
>
> +int bpf_prog_attach_check_attach_type(const struct bpf_prog *prog,
> +                                     enum bpf_attach_type attach_type)
> +{
> +       switch (prog->type) {
> +       case BPF_PROG_TYPE_CGROUP_SOCK:
> +       case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
> +               return attach_type == prog->expected_attach_type ? 0 : -EINVAL;
> +       default:
> +               return 0;
> +       }
> +}
> +
> +int sockmap_get_from_fd(const union bpf_attr *attr, int type, bool attach)
> +{
> +       struct bpf_prog *prog = NULL;
> +       int ufd = attr->target_fd;
> +       struct bpf_map *map;
> +       struct fd f;
> +       int err;
> +
> +       f = fdget(ufd);
> +       map = __bpf_map_get(f);
> +       if (IS_ERR(map))
> +               return PTR_ERR(map);
> +
> +       if (attach) {
> +               prog = bpf_prog_get_type(attr->attach_bpf_fd, type);
> +               if (IS_ERR(prog)) {
> +                       fdput(f);
> +                       return PTR_ERR(prog);
> +               }
> +       }
> +
> +       err = sock_map_prog(map, prog, attr->attach_type);
> +       if (err) {
> +               fdput(f);
> +               if (prog)
> +                       bpf_prog_put(prog);
> +               return err;
> +       }
> +
> +       fdput(f);
> +       return 0;
> +}
> +
> +int cgroup_bpf_prog_attach(const union bpf_attr *attr, enum bpf_prog_type ptype)
> +{
> +       struct bpf_prog *prog;
> +       struct cgroup *cgrp;
> +       int ret;
> +
> +       prog = bpf_prog_get_type(attr->attach_bpf_fd, ptype);
> +       if (IS_ERR(prog))
> +               return PTR_ERR(prog);
> +
> +       if (bpf_prog_attach_check_attach_type(prog, attr->attach_type)) {
> +               bpf_prog_put(prog);
> +               return -EINVAL;
> +       }
> +
> +       cgrp = cgroup_get_from_fd(attr->target_fd);
> +       if (IS_ERR(cgrp)) {
> +               bpf_prog_put(prog);
> +               return PTR_ERR(cgrp);
> +       }
> +
> +       ret = cgroup_bpf_attach(cgrp, prog, attr->attach_type,
> +                               attr->attach_flags);
> +       if (ret)
> +               bpf_prog_put(prog);
> +       cgroup_put(cgrp);
> +
> +       return ret;
> +}
> +
> +int cgroup_bpf_prog_detach(const union bpf_attr *attr, enum bpf_prog_type ptype)
> +{
> +       struct bpf_prog *prog;
> +       struct cgroup *cgrp;
> +       int ret;
> +
> +       cgrp = cgroup_get_from_fd(attr->target_fd);
> +       if (IS_ERR(cgrp))
> +               return PTR_ERR(cgrp);
> +
> +       prog = bpf_prog_get_type(attr->attach_bpf_fd, ptype);
> +       if (IS_ERR(prog))
> +               prog = NULL;
> +
> +       ret = cgroup_bpf_detach(cgrp, prog, attr->attach_type, 0);
> +       if (prog)
> +               bpf_prog_put(prog);
> +       cgroup_put(cgrp);
> +       return ret;
> +}
> +
> +int cgroup_bpf_prog_query(const union bpf_attr *attr,
> +                         union bpf_attr __user *uattr)
> +{
> +       struct cgroup *cgrp;
> +       int ret;
> +
> +       cgrp = cgroup_get_from_fd(attr->query.target_fd);
> +       if (IS_ERR(cgrp))
> +               return PTR_ERR(cgrp);
> +       ret = cgroup_bpf_query(cgrp, attr, uattr);
> +       cgroup_put(cgrp);
> +       return ret;
> +}
> +
>  /**
>   * __cgroup_bpf_run_filter_skb() - Run a program for packet filtering
>   * @sk: The socket sending or receiving traffic
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 0fa20624707f..52fa44856623 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1489,65 +1489,14 @@ static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
>         return err;
>  }
>
> -#ifdef CONFIG_CGROUP_BPF
> -
> -static int bpf_prog_attach_check_attach_type(const struct bpf_prog *prog,
> -                                            enum bpf_attach_type attach_type)
> -{
> -       switch (prog->type) {
> -       case BPF_PROG_TYPE_CGROUP_SOCK:
> -       case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
> -               return attach_type == prog->expected_attach_type ? 0 : -EINVAL;
> -       default:
> -               return 0;
> -       }
> -}
> -
>  #define BPF_PROG_ATTACH_LAST_FIELD attach_flags
>
> -static int sockmap_get_from_fd(const union bpf_attr *attr,
> -                              int type, bool attach)
> -{
> -       struct bpf_prog *prog = NULL;
> -       int ufd = attr->target_fd;
> -       struct bpf_map *map;
> -       struct fd f;
> -       int err;
> -
> -       f = fdget(ufd);
> -       map = __bpf_map_get(f);
> -       if (IS_ERR(map))
> -               return PTR_ERR(map);
> -
> -       if (attach) {
> -               prog = bpf_prog_get_type(attr->attach_bpf_fd, type);
> -               if (IS_ERR(prog)) {
> -                       fdput(f);
> -                       return PTR_ERR(prog);
> -               }
> -       }
> -
> -       err = sock_map_prog(map, prog, attr->attach_type);
> -       if (err) {
> -               fdput(f);
> -               if (prog)
> -                       bpf_prog_put(prog);
> -               return err;
> -       }
> -
> -       fdput(f);
> -       return 0;
> -}
> -
>  #define BPF_F_ATTACH_MASK \
>         (BPF_F_ALLOW_OVERRIDE | BPF_F_ALLOW_MULTI)
>
>  static int bpf_prog_attach(const union bpf_attr *attr)
>  {
>         enum bpf_prog_type ptype;
> -       struct bpf_prog *prog;
> -       struct cgroup *cgrp;
> -       int ret;
>
>         if (!capable(CAP_NET_ADMIN))
>                 return -EPERM;
> @@ -1593,28 +1542,7 @@ static int bpf_prog_attach(const union bpf_attr *attr)
>                 return -EINVAL;
>         }
>
> -       prog = bpf_prog_get_type(attr->attach_bpf_fd, ptype);
> -       if (IS_ERR(prog))
> -               return PTR_ERR(prog);
> -
> -       if (bpf_prog_attach_check_attach_type(prog, attr->attach_type)) {
> -               bpf_prog_put(prog);
> -               return -EINVAL;
> -       }
> -
> -       cgrp = cgroup_get_from_fd(attr->target_fd);
> -       if (IS_ERR(cgrp)) {
> -               bpf_prog_put(prog);
> -               return PTR_ERR(cgrp);
> -       }
> -
> -       ret = cgroup_bpf_attach(cgrp, prog, attr->attach_type,
> -                               attr->attach_flags);
> -       if (ret)
> -               bpf_prog_put(prog);
> -       cgroup_put(cgrp);
> -
> -       return ret;
> +       return cgroup_bpf_prog_attach(attr, ptype);
>  }
>
>  #define BPF_PROG_DETACH_LAST_FIELD attach_type
> @@ -1622,9 +1550,6 @@ static int bpf_prog_attach(const union bpf_attr *attr)
>  static int bpf_prog_detach(const union bpf_attr *attr)
>  {
>         enum bpf_prog_type ptype;
> -       struct bpf_prog *prog;
> -       struct cgroup *cgrp;
> -       int ret;
>
>         if (!capable(CAP_NET_ADMIN))
>                 return -EPERM;
> @@ -1667,19 +1592,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
>                 return -EINVAL;
>         }
>
> -       cgrp = cgroup_get_from_fd(attr->target_fd);
> -       if (IS_ERR(cgrp))
> -               return PTR_ERR(cgrp);
> -
> -       prog = bpf_prog_get_type(attr->attach_bpf_fd, ptype);
> -       if (IS_ERR(prog))
> -               prog = NULL;
> -
> -       ret = cgroup_bpf_detach(cgrp, prog, attr->attach_type, 0);
> -       if (prog)
> -               bpf_prog_put(prog);
> -       cgroup_put(cgrp);
> -       return ret;
> +       return cgroup_bpf_prog_detach(attr, ptype);
>  }
>
>  #define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> @@ -1687,9 +1600,6 @@ static int bpf_prog_detach(const union bpf_attr *attr)
>  static int bpf_prog_query(const union bpf_attr *attr,
>                           union bpf_attr __user *uattr)
>  {
> -       struct cgroup *cgrp;
> -       int ret;
> -
>         if (!capable(CAP_NET_ADMIN))
>                 return -EPERM;
>         if (CHECK_ATTR(BPF_PROG_QUERY))
> @@ -1717,14 +1627,9 @@ static int bpf_prog_query(const union bpf_attr *attr,
>         default:
>                 return -EINVAL;
>         }
> -       cgrp = cgroup_get_from_fd(attr->query.target_fd);
> -       if (IS_ERR(cgrp))
> -               return PTR_ERR(cgrp);
> -       ret = cgroup_bpf_query(cgrp, attr, uattr);
> -       cgroup_put(cgrp);
> -       return ret;
> +
> +       return cgroup_bpf_prog_query(attr, uattr);
>  }
> -#endif /* CONFIG_CGROUP_BPF */
>
>  #define BPF_PROG_TEST_RUN_LAST_FIELD test.duration
>
> @@ -2371,7 +2276,6 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
>         case BPF_OBJ_GET:
>                 err = bpf_obj_get(&attr);
>                 break;
> -#ifdef CONFIG_CGROUP_BPF
>         case BPF_PROG_ATTACH:
>                 err = bpf_prog_attach(&attr);
>                 break;
> @@ -2381,7 +2285,6 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
>         case BPF_PROG_QUERY:
>                 err = bpf_prog_query(&attr, uattr);
>                 break;
> -#endif
>         case BPF_PROG_TEST_RUN:
>                 err = bpf_prog_test_run(&attr, uattr);
>                 break;
> --
> 2.17.1
>

^ permalink raw reply

* Re: [PATCH] bpf: attach type BPF_LIRC_MODE2 should not depend on CONFIG_BPF_CGROUP
From: Simon Horman @ 2018-06-07 18:12 UTC (permalink / raw)
  To: Sean Young
  Cc: Daniel Borkmann, Matthias Reichl, linux-media, linux-kernel,
	Alexei Starovoitov, Mauro Carvalho Chehab, netdev,
	Devin Heitmueller, Y Song, Quentin Monnet
In-Reply-To: <20180606210939.q3vviyc4b2h6gu3c@gofer.mess.org>

On Wed, Jun 06, 2018 at 10:09:40PM +0100, Sean Young wrote:
> Compile bpf_prog_{attach,detach,query} even if CONFIG_BPF_CGROUP is not
> set.
> 
> Signed-off-by: Sean Young <sean@mess.org>

Hi Sean,

This patch seems to involve rather extensive refactoring, the details of
which are not explained, in order to make a change whose motivation is also
not explained. I for one would value a more extensive changelog if
not a series of patches which implement discrete changes.

Thanks!

^ permalink raw reply

* Re: [PATCH 1/3] net: phy: Check phy_driver ready before accessing
From: Andrew Lunn @ 2018-06-07 18:12 UTC (permalink / raw)
  To: Brandon Maier
  Cc: netdev, f.fainelli, davem, michal.simek, Clayton Shotwell,
	Kristopher Cory, linux-kernel
In-Reply-To: <CA+fik51YC14k8yMLRkybMXQNn6uWB0u6aQUd1GMZrQtOF=cd3A@mail.gmail.com>

> Agreed. Another thing that looks suspicious to me is the driver
> overrides the private data of the device it's attaching too, in the
> `priv->phy_dev->priv = priv` bit. Seems like that could cause all
> sorts of driver corruption problems.

Ah, yes. That is very broken. Many PHYs will just explode sometime
later, since they use phdev->priv.

> But fixing that is going to require more drastic changes to how this
> driver works. So it'd be worth applying this patch in the mean time.

Patches welcome.

	Andrew

^ permalink raw reply

* Re: [PATCH bpf] bpf: reject passing modified ctx to helper functions
From: Edward Cree @ 2018-06-07 18:09 UTC (permalink / raw)
  To: Daniel Borkmann, ast; +Cc: netdev
In-Reply-To: <20180607154003.5368-1-daniel@iogearbox.net>

On 07/06/18 16:40, Daniel Borkmann wrote:
> As commit 28e33f9d78ee ("bpf: disallow arithmetic operations on
> context pointer") already describes, f1174f77b50c ("bpf/verifier:
> rework value tracking") removed the specific white-listed cases
> we had previously where we would allow for pointer arithmetic in
> order to further generalize it, and allow e.g. context access via
> modified registers. While the dereferencing of modified context
> pointers had been forbidden through 28e33f9d78ee, syzkaller did
> recently manage to trigger several KASAN splats for slab out of
> bounds access and use after frees by simply passing a modified
> context pointer to a helper function which would then do the bad
> access since verifier allowed it in adjust_ptr_min_max_vals().
>
> Rejecting arithmetic on ctx pointer in adjust_ptr_min_max_vals()
> generally could break existing programs as there's a valid use
> case in tracing in combination with passing the ctx to helpers as
> bpf_probe_read(), where the register then becomes unknown at
> verification time due to adding a non-constant offset to it. An
> access sequence may look like the following:
>
>   offset = args->filename;  /* field __data_loc filename */
>   bpf_probe_read(&dst, len, (char *)args + offset); // args is ctx
>
> There are two options: i) we could special case the ctx and as
> soon as we add a constant or bounded offset to it (hence ctx type
> wouldn't change) we could turn the ctx into an unknown scalar, or
> ii) we generalize the sanity test for ctx member access into a
> small helper and assert it on the ctx register that was passed
> as a function argument. Fwiw, latter is more obvious and less
> complex at the same time, and one case that may potentially be
> legitimate in future for ctx member access at least would be for
> ctx to carry a const offset. Therefore, fix follows approach
> from ii) and adds test cases to BPF kselftests.
>
> Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
> Reported-by: syzbot+3d0b2441dbb71751615e@syzkaller.appspotmail.com
> Reported-by: syzbot+c8504affd4fdd0c1b626@syzkaller.appspotmail.com
> Reported-by: syzbot+e5190cb881d8660fb1a3@syzkaller.appspotmail.com
> Reported-by: syzbot+efae31b384d5badbd620@syzkaller.appspotmail.com
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Edward Cree <ecree@solarflare.com>

^ permalink raw reply

* Re: KMSAN: uninit-value in _copy_to_iter (2)
From: Al Viro @ 2018-06-07 18:04 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: syzbot, avagin, davem, dingtianhong, edumazet, elena.reshetova,
	jasowang, kvm, linux-kernel, matthew, mingo, netdev, pabeni,
	syzkaller-bugs, virtualization, willemb
In-Reply-To: <20180607205611-mutt-send-email-mst@kernel.org>

On Thu, Jun 07, 2018 at 08:59:06PM +0300, Michael S. Tsirkin wrote:
> On Thu, Jun 07, 2018 at 06:43:55PM +0100, Al Viro wrote:
> > On Thu, Jun 07, 2018 at 06:38:48PM +0300, Michael S. Tsirkin wrote:
> > > #syz test: https://github.com/google/kmsan.git/master d2d741e5d1898dfde1a75ea3d29a9a3e2edf0617
> > > 
> > > Subject: vhost: fix info leak
> > > 
> > > Fixes: CVE-2018-1118
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > ---
> > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > > index f0be5f35ab28..9beefa6ed1ce 100644
> > > --- a/drivers/vhost/vhost.c
> > > +++ b/drivers/vhost/vhost.c
> > > @@ -2345,6 +2345,9 @@ struct vhost_msg_node *vhost_new_msg(struct vhost_virtqueue *vq, int type)
> > >  	struct vhost_msg_node *node = kmalloc(sizeof *node, GFP_KERNEL);
> > >  	if (!node)
> > >  		return NULL;
> > > +
> > > +	/* Make sure all padding within the structure is initialized. */
> > > +	memset(&node->msg, 0, sizeof node->msg);
> > 
> > Umm...  Maybe kzalloc(), then?  You have
> > 
> > struct vhost_msg_node {
> >   struct vhost_msg msg;
> >   struct vhost_virtqueue *vq;
> >   struct list_head node;
> > };
> > 
> > and that's what, 68 bytes in msg, then either 4 bytes pointer or
> > 4 bytes padding + 8 bytes pointer, then two pointers?  How much
> > does explicit partial memset() save you here?
> 
> Yes but 0 isn't a nop here so if this struct is used without
> a sensible initialization, it will crash elsewhere.
> I prefer KASAN to catch such uses.
> 
> 
> > >  	node->vq = vq;
> > >  	node->msg.type = type;

IDGI - what would your variant catch that kzalloc + 2 assignments won't?
Accesses to uninitialized ->node?  Because that's the only difference in
what is and is not initialized between those variants...

^ permalink raw reply

* Re: KMSAN: uninit-value in _copy_to_iter (2)
From: Michael S. Tsirkin @ 2018-06-07 17:59 UTC (permalink / raw)
  To: Al Viro
  Cc: syzbot, avagin, davem, dingtianhong, edumazet, elena.reshetova,
	jasowang, kvm, linux-kernel, matthew, mingo, netdev, pabeni,
	syzkaller-bugs, virtualization, willemb
In-Reply-To: <20180607174355.GR30522@ZenIV.linux.org.uk>

On Thu, Jun 07, 2018 at 06:43:55PM +0100, Al Viro wrote:
> On Thu, Jun 07, 2018 at 06:38:48PM +0300, Michael S. Tsirkin wrote:
> > #syz test: https://github.com/google/kmsan.git/master d2d741e5d1898dfde1a75ea3d29a9a3e2edf0617
> > 
> > Subject: vhost: fix info leak
> > 
> > Fixes: CVE-2018-1118
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index f0be5f35ab28..9beefa6ed1ce 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -2345,6 +2345,9 @@ struct vhost_msg_node *vhost_new_msg(struct vhost_virtqueue *vq, int type)
> >  	struct vhost_msg_node *node = kmalloc(sizeof *node, GFP_KERNEL);
> >  	if (!node)
> >  		return NULL;
> > +
> > +	/* Make sure all padding within the structure is initialized. */
> > +	memset(&node->msg, 0, sizeof node->msg);
> 
> Umm...  Maybe kzalloc(), then?  You have
> 
> struct vhost_msg_node {
>   struct vhost_msg msg;
>   struct vhost_virtqueue *vq;
>   struct list_head node;
> };
> 
> and that's what, 68 bytes in msg, then either 4 bytes pointer or
> 4 bytes padding + 8 bytes pointer, then two pointers?  How much
> does explicit partial memset() save you here?

Yes but 0 isn't a nop here so if this struct is used without
a sensible initialization, it will crash elsewhere.
I prefer KASAN to catch such uses.


> >  	node->vq = vq;
> >  	node->msg.type = type;
> >  	return node;

^ permalink raw reply

* Re: [PATCH] selftests: bpf: fix urandom_read build issue
From: Y Song @ 2018-06-07 17:52 UTC (permalink / raw)
  To: Anders Roxell
  Cc: Alexei Starovoitov, Daniel Borkmann, Shuah Khan, netdev, LKML,
	linux-kselftest
In-Reply-To: <20180607105712.553-1-anders.roxell@linaro.org>

On Thu, Jun 7, 2018 at 3:57 AM, Anders Roxell <anders.roxell@linaro.org> wrote:
> gcc complains that urandom_read gets built twice.
>
> gcc -o tools/testing/selftests/bpf/urandom_read
> -static urandom_read.c -Wl,--build-id
> gcc -Wall -O2 -I../../../include/uapi -I../../../lib -I../../../lib/bpf
> -I../../../../include/generated  -I../../../include    urandom_read.c
> urandom_read -lcap -lelf -lrt -lpthread -o
> tools/testing/selftests/bpf/urandom_read
> gcc: fatal error: input file
> ‘tools/testing/selftests/bpf/urandom_read’ is the
> same as output file
> compilation terminated.
> ../lib.mk:110: recipe for target
> 'tools/testing/selftests/bpf/urandom_read' failed

What is the build/make command to reproduce the above failure?

> To fix this issue remove the urandom_read target and so target
> TEST_CUSTOM_PROGS gets used.
>
> Fixes: 81f77fd0deeb ("bpf: add selftest for stackmap with BPF_F_STACK_BUILD_ID")
> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
> ---
>  tools/testing/selftests/bpf/Makefile | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index 607ed8729c06..67285591ffd7 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -16,10 +16,8 @@ LDLIBS += -lcap -lelf -lrt -lpthread
>  TEST_CUSTOM_PROGS = $(OUTPUT)/urandom_read
>  all: $(TEST_CUSTOM_PROGS)
>
> -$(TEST_CUSTOM_PROGS): urandom_read
> -
> -urandom_read: urandom_read.c
> -       $(CC) -o $(TEST_CUSTOM_PROGS) -static $< -Wl,--build-id
> +$(TEST_CUSTOM_PROGS): $(OUTPUT)/%: %.c
> +       $(CC) -o $@ -static $< -Wl,--build-id
>
>  # Order correspond to 'make run_tests' order
>  TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test_progs \
> --
> 2.17.1
>

^ permalink raw reply

* Re: [PATCH bpf] bpf: reject passing modified ctx to helper functions
From: Y Song @ 2018-06-07 17:45 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, netdev, Edward Cree
In-Reply-To: <20180607154003.5368-1-daniel@iogearbox.net>

On Thu, Jun 7, 2018 at 8:40 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> As commit 28e33f9d78ee ("bpf: disallow arithmetic operations on
> context pointer") already describes, f1174f77b50c ("bpf/verifier:
> rework value tracking") removed the specific white-listed cases
> we had previously where we would allow for pointer arithmetic in
> order to further generalize it, and allow e.g. context access via
> modified registers. While the dereferencing of modified context
> pointers had been forbidden through 28e33f9d78ee, syzkaller did
> recently manage to trigger several KASAN splats for slab out of
> bounds access and use after frees by simply passing a modified
> context pointer to a helper function which would then do the bad
> access since verifier allowed it in adjust_ptr_min_max_vals().
>
> Rejecting arithmetic on ctx pointer in adjust_ptr_min_max_vals()
> generally could break existing programs as there's a valid use
> case in tracing in combination with passing the ctx to helpers as
> bpf_probe_read(), where the register then becomes unknown at
> verification time due to adding a non-constant offset to it. An
> access sequence may look like the following:
>
>   offset = args->filename;  /* field __data_loc filename */
>   bpf_probe_read(&dst, len, (char *)args + offset); // args is ctx
>
> There are two options: i) we could special case the ctx and as
> soon as we add a constant or bounded offset to it (hence ctx type
> wouldn't change) we could turn the ctx into an unknown scalar, or
> ii) we generalize the sanity test for ctx member access into a
> small helper and assert it on the ctx register that was passed
> as a function argument. Fwiw, latter is more obvious and less
> complex at the same time, and one case that may potentially be
> legitimate in future for ctx member access at least would be for
> ctx to carry a const offset. Therefore, fix follows approach
> from ii) and adds test cases to BPF kselftests.
>
> Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
> Reported-by: syzbot+3d0b2441dbb71751615e@syzkaller.appspotmail.com
> Reported-by: syzbot+c8504affd4fdd0c1b626@syzkaller.appspotmail.com
> Reported-by: syzbot+e5190cb881d8660fb1a3@syzkaller.appspotmail.com
> Reported-by: syzbot+efae31b384d5badbd620@syzkaller.appspotmail.com
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: Alexei Starovoitov <ast@kernel.org>

Tested with bcc for the above ctx + unknown_offset usage for bpf_probe_read.
No breakage.

Acked-by: Yonghong Song <yhs@fb.com>

> ---
>  kernel/bpf/verifier.c                       | 48 +++++++++++++++---------
>  tools/testing/selftests/bpf/test_verifier.c | 58 ++++++++++++++++++++++++++++-
>  2 files changed, 88 insertions(+), 18 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index d6403b5..cced0c1 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -1617,6 +1617,30 @@ static int get_callee_stack_depth(struct bpf_verifier_env *env,
>  }
>  #endif
>
> +static int check_ctx_reg(struct bpf_verifier_env *env,
> +                        const struct bpf_reg_state *reg, int regno)
> +{
> +       /* Access to ctx or passing it to a helper is only allowed in
> +        * its original, unmodified form.
> +        */
> +
> +       if (reg->off) {
> +               verbose(env, "dereference of modified ctx ptr R%d off=%d disallowed\n",
> +                       regno, reg->off);
> +               return -EACCES;
> +       }
> +
> +       if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
> +               char tn_buf[48];
> +
> +               tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off);
> +               verbose(env, "variable ctx access var_off=%s disallowed\n", tn_buf);
> +               return -EACCES;
> +       }
> +
> +       return 0;
> +}
> +
>  /* truncate register to smaller size (in bytes)
>   * must be called with size < BPF_REG_SIZE
>   */
> @@ -1686,24 +1710,11 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
>                         verbose(env, "R%d leaks addr into ctx\n", value_regno);
>                         return -EACCES;
>                 }
> -               /* ctx accesses must be at a fixed offset, so that we can
> -                * determine what type of data were returned.
> -                */
> -               if (reg->off) {
> -                       verbose(env,
> -                               "dereference of modified ctx ptr R%d off=%d+%d, ctx+const is allowed, ctx+const+const is not\n",
> -                               regno, reg->off, off - reg->off);
> -                       return -EACCES;
> -               }
> -               if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
> -                       char tn_buf[48];
>
> -                       tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off);
> -                       verbose(env,
> -                               "variable ctx access var_off=%s off=%d size=%d",
> -                               tn_buf, off, size);
> -                       return -EACCES;
> -               }
> +               err = check_ctx_reg(env, reg, regno);
> +               if (err < 0)
> +                       return err;
> +
>                 err = check_ctx_access(env, insn_idx, off, size, t, &reg_type);
>                 if (!err && t == BPF_READ && value_regno >= 0) {
>                         /* ctx access returns either a scalar, or a
> @@ -1984,6 +1995,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
>                 expected_type = PTR_TO_CTX;
>                 if (type != expected_type)
>                         goto err_type;
> +               err = check_ctx_reg(env, reg, regno);
> +               if (err < 0)
> +                       return err;
>         } else if (arg_type_is_mem_ptr(arg_type)) {
>                 expected_type = PTR_TO_STACK;
>                 /* One exception here. In case function allows for NULL to be
> diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> index 7cb1d74..2ecd27b 100644
> --- a/tools/testing/selftests/bpf/test_verifier.c
> +++ b/tools/testing/selftests/bpf/test_verifier.c
> @@ -8647,7 +8647,7 @@ static struct bpf_test tests[] = {
>                                     offsetof(struct __sk_buff, mark)),
>                         BPF_EXIT_INSN(),
>                 },
> -               .errstr = "dereference of modified ctx ptr R1 off=68+8, ctx+const is allowed, ctx+const+const is not",
> +               .errstr = "dereference of modified ctx ptr",
>                 .result = REJECT,
>                 .prog_type = BPF_PROG_TYPE_SCHED_CLS,
>         },
> @@ -12258,6 +12258,62 @@ static struct bpf_test tests[] = {
>                 .result = ACCEPT,
>                 .retval = 5,
>         },
> +       {
> +               "pass unmodified ctx pointer to helper",
> +               .insns = {
> +                       BPF_MOV64_IMM(BPF_REG_2, 0),
> +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> +                                    BPF_FUNC_csum_update),
> +                       BPF_MOV64_IMM(BPF_REG_0, 0),
> +                       BPF_EXIT_INSN(),
> +               },
> +               .prog_type = BPF_PROG_TYPE_SCHED_CLS,
> +               .result = ACCEPT,
> +       },
> +       {
> +               "pass modified ctx pointer to helper, 1",
> +               .insns = {
> +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -612),
> +                       BPF_MOV64_IMM(BPF_REG_2, 0),
> +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> +                                    BPF_FUNC_csum_update),
> +                       BPF_MOV64_IMM(BPF_REG_0, 0),
> +                       BPF_EXIT_INSN(),
> +               },
> +               .prog_type = BPF_PROG_TYPE_SCHED_CLS,
> +               .result = REJECT,
> +               .errstr = "dereference of modified ctx ptr",
> +       },
> +       {
> +               "pass modified ctx pointer to helper, 2",
> +               .insns = {
> +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -612),
> +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> +                                    BPF_FUNC_get_socket_cookie),
> +                       BPF_MOV64_IMM(BPF_REG_0, 0),
> +                       BPF_EXIT_INSN(),
> +               },
> +               .result_unpriv = REJECT,
> +               .result = REJECT,
> +               .errstr_unpriv = "dereference of modified ctx ptr",
> +               .errstr = "dereference of modified ctx ptr",
> +       },
> +       {
> +               "pass modified ctx pointer to helper, 3",
> +               .insns = {
> +                       BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1, 0),
> +                       BPF_ALU64_IMM(BPF_AND, BPF_REG_3, 4),
> +                       BPF_ALU64_REG(BPF_ADD, BPF_REG_1, BPF_REG_3),
> +                       BPF_MOV64_IMM(BPF_REG_2, 0),
> +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> +                                    BPF_FUNC_csum_update),
> +                       BPF_MOV64_IMM(BPF_REG_0, 0),
> +                       BPF_EXIT_INSN(),
> +               },
> +               .prog_type = BPF_PROG_TYPE_SCHED_CLS,
> +               .result = REJECT,
> +               .errstr = "variable ctx access var_off=(0x0; 0x4)",
> +       },
>  };
>
>  static int probe_filter_length(const struct bpf_insn *fp)
> --
> 2.9.5
>

^ permalink raw reply

* Re: KMSAN: uninit-value in _copy_to_iter (2)
From: Al Viro @ 2018-06-07 17:43 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: syzbot, avagin, davem, dingtianhong, edumazet, elena.reshetova,
	jasowang, kvm, linux-kernel, matthew, mingo, netdev, pabeni,
	syzkaller-bugs, virtualization, willemb
In-Reply-To: <20180607183627-mutt-send-email-mst@kernel.org>

On Thu, Jun 07, 2018 at 06:38:48PM +0300, Michael S. Tsirkin wrote:
> #syz test: https://github.com/google/kmsan.git/master d2d741e5d1898dfde1a75ea3d29a9a3e2edf0617
> 
> Subject: vhost: fix info leak
> 
> Fixes: CVE-2018-1118
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index f0be5f35ab28..9beefa6ed1ce 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2345,6 +2345,9 @@ struct vhost_msg_node *vhost_new_msg(struct vhost_virtqueue *vq, int type)
>  	struct vhost_msg_node *node = kmalloc(sizeof *node, GFP_KERNEL);
>  	if (!node)
>  		return NULL;
> +
> +	/* Make sure all padding within the structure is initialized. */
> +	memset(&node->msg, 0, sizeof node->msg);

Umm...  Maybe kzalloc(), then?  You have

struct vhost_msg_node {
  struct vhost_msg msg;
  struct vhost_virtqueue *vq;
  struct list_head node;
};

and that's what, 68 bytes in msg, then either 4 bytes pointer or
4 bytes padding + 8 bytes pointer, then two pointers?  How much
does explicit partial memset() save you here?

>  	node->vq = vq;
>  	node->msg.type = type;
>  	return node;

^ permalink raw reply

* Re: [PATCH 2/3] net: phy: xgmiitorgmii: Use correct mdio bus
From: Brandon Maier @ 2018-06-07 17:37 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, f.fainelli, davem, michal.simek, Clayton Shotwell,
	Kristopher Cory, linux-kernel
In-Reply-To: <20180607164920.GC25513@lunn.ch>

On Thu, Jun 7, 2018 at 11:49 AM, Andrew Lunn <andrew@lunn.ch> wrote:
> I think the assumption was they are both on the same bus.
> However, they don't need to be.

That was my assumption as well. We ran into a scenario where they were
on separate buses.

> Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Thanks

^ permalink raw reply

* Re: [PATCH 1/3] net: phy: Check phy_driver ready before accessing
From: Brandon Maier @ 2018-06-07 17:34 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, f.fainelli, davem, michal.simek, Clayton Shotwell,
	Kristopher Cory, linux-kernel
In-Reply-To: <20180607165227.GD25513@lunn.ch>

On Thu, Jun 7, 2018 at 11:52 AM, Andrew Lunn <andrew@lunn.ch> wrote:
> FYI: net-next is closed at the moment. Please resubmit these in two
> weeks time.

Ah, I didn't see networking/netdev-FAQ.txt. I'll resubmit these then.

> I'm sure there are more issues like this in the code.  e.g. there is
> no attempt made to hold a reference to the child phy. So it could be
> unbound. priv->phy_drv->read_status(phydev) is then going to do bad
> things.
>

Agreed. Another thing that looks suspicious to me is the driver
overrides the private data of the device it's attaching too, in the
`priv->phy_dev->priv = priv` bit. Seems like that could cause all
sorts of driver corruption problems.

But fixing that is going to require more drastic changes to how this
driver works. So it'd be worth applying this patch in the mean time.

^ permalink raw reply

* Re: KMSAN: uninit-value in ebt_stp_mt_check (2)
From: Dmitry Vyukov @ 2018-06-07 17:34 UTC (permalink / raw)
  To: syzbot
  Cc: bridge, coreteam, David Miller, Florian Westphal,
	Jozsef Kadlecsik, LKML, netdev, netfilter-devel,
	Pablo Neira Ayuso, stephen hemminger, syzkaller-bugs
In-Reply-To: <0000000000009a3b5b056e109e42@google.com>

On Thu, Jun 7, 2018 at 7:29 PM, syzbot
<syzbot+da4494182233c23a5fcf@syzkaller.appspotmail.com> wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    c6a6aed994b6 kmsan: remove dead code to trigger syzbot build
> git tree:       https://github.com/google/kmsan.git/master
> console output: https://syzkaller.appspot.com/x/log.txt?x=17bde74f800000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=848e40757852af3e
> dashboard link: https://syzkaller.appspot.com/bug?extid=da4494182233c23a5fcf
> compiler:       clang version 7.0.0 (trunk 334104)
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+da4494182233c23a5fcf@syzkaller.appspotmail.com

Hi Stephen,

This strikes again.

The first part of the check is par->nft_compat. I don't see where
ebt_check_entry initializes it. Should it? Or matches should not look at it?



> ==================================================================
> BUG: KMSAN: uninit-value in ebt_stp_mt_check+0x24b/0x450
> net/bridge/netfilter/ebt_stp.c:162
> CPU: 0 PID: 12006 Comm: syz-executor7 Not tainted 4.17.0+ #3
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x185/0x1d0 lib/dump_stack.c:113
>  kmsan_report+0x149/0x260 mm/kmsan/kmsan.c:1084
>  __msan_warning_32+0x6e/0xc0 mm/kmsan/kmsan_instr.c:620
>  ebt_stp_mt_check+0x24b/0x450 net/bridge/netfilter/ebt_stp.c:162
>  xt_check_match+0x1438/0x1650 net/netfilter/x_tables.c:506
>  ebt_check_match net/bridge/netfilter/ebtables.c:372 [inline]
>  ebt_check_entry net/bridge/netfilter/ebtables.c:702 [inline]
>  translate_table+0x4e88/0x6120 net/bridge/netfilter/ebtables.c:943
>  do_replace_finish+0x1258/0x2ea0 net/bridge/netfilter/ebtables.c:999
>  do_replace+0x719/0x780 net/bridge/netfilter/ebtables.c:1138
>  do_ebt_set_ctl+0x2ab/0x3c0 net/bridge/netfilter/ebtables.c:1517
>  nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
>  nf_setsockopt+0x47c/0x4e0 net/netfilter/nf_sockopt.c:115
>  ip_setsockopt+0x24b/0x2b0 net/ipv4/ip_sockglue.c:1251
>  udp_setsockopt+0x108/0x1b0 net/ipv4/udp.c:2416
>  sock_common_setsockopt+0x13b/0x170 net/core/sock.c:3039
>  __sys_setsockopt+0x496/0x540 net/socket.c:1903
>  __do_sys_setsockopt net/socket.c:1914 [inline]
>  __se_sys_setsockopt net/socket.c:1911 [inline]
>  __x64_sys_setsockopt+0x15c/0x1c0 net/socket.c:1911
>  do_syscall_64+0x15b/0x230 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x4559f9
> RSP: 002b:00007f45b9246c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
> RAX: ffffffffffffffda RBX: 00007f45b92476d4 RCX: 00000000004559f9
> RDX: 0000000000000080 RSI: 0000000000000000 RDI: 0000000000000014
> RBP: 000000000072bea0 R08: 0000000000000300 R09: 0000000000000000
> R10: 0000000020000480 R11: 0000000000000246 R12: 00000000ffffffff
> R13: 00000000004c0d6d R14: 00000000004d07c8 R15: 0000000000000000
>
> Local variable description: ----mtpar.i@translate_table
> Variable was created at:
>  translate_table+0xbb/0x6120 net/bridge/netfilter/ebtables.c:831
>  do_replace_finish+0x1258/0x2ea0 net/bridge/netfilter/ebtables.c:999
> ==================================================================
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
> syzbot.
>
> --
> You received this message because you are subscribed to the Google Groups
> "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/syzkaller-bugs/0000000000009a3b5b056e109e42%40google.com.
> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply

* Re: KMSAN: uninit-value in _copy_to_iter (2)
From: syzbot @ 2018-06-07 17:31 UTC (permalink / raw)
  To: avagin, davem, dingtianhong, edumazet, elena.reshetova, jasowang,
	kvm, linux-kernel, matthew, mingo, mst, netdev, pabeni,
	syzkaller-bugs, viro, virtualization, willemb
In-Reply-To: <20180607183629-mutt-send-email-mst@kernel.org>

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger  
crash:

Reported-and-tested-by:  
syzbot+87cfa083e727a224754b@syzkaller.appspotmail.com

Tested on:

commit:         c6a6aed994b6 kmsan: remove dead code to trigger syzbot build
git tree:       https://github.com/google/kmsan.git/master
kernel config:  https://syzkaller.appspot.com/x/.config?x=848e40757852af3e
compiler:       clang version 7.0.0 (trunk 334104)
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1119eddf800000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply

* [PATCH net-next] bpfilter: fix OUTPUT_FORMAT
From: Alexei Starovoitov @ 2018-06-07 17:29 UTC (permalink / raw)
  To: David S . Miller; +Cc: daniel, netdev, kernel-team

From: Alexei Starovoitov <ast@fb.com>

CONFIG_OUTPUT_FORMAT is x86 only macro.
Used objdump to extract elf file format.

Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 net/bpfilter/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index aafa72001fcd..e0bbe7583e58 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -21,7 +21,7 @@ endif
 # which bpfilter_kern.c passes further into umh blob loader at run-time
 quiet_cmd_copy_umh = GEN $@
       cmd_copy_umh = echo ':' > $(obj)/.bpfilter_umh.o.cmd; \
-      $(OBJCOPY) -I binary -O $(CONFIG_OUTPUT_FORMAT) \
+      $(OBJCOPY) -I binary -O `$(OBJDUMP) -f $<|grep format|cut -d' ' -f8` \
       -B `$(OBJDUMP) -f $<|grep architecture|cut -d, -f1|cut -d' ' -f2` \
       --rename-section .data=.init.rodata $< $@
 
-- 
2.9.5

^ permalink raw reply related

* KMSAN: uninit-value in ebt_stp_mt_check (2)
From: syzbot @ 2018-06-07 17:29 UTC (permalink / raw)
  To: bridge, coreteam, davem, fw, kadlec, linux-kernel, netdev,
	netfilter-devel, pablo, stephen, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    c6a6aed994b6 kmsan: remove dead code to trigger syzbot build
git tree:       https://github.com/google/kmsan.git/master
console output: https://syzkaller.appspot.com/x/log.txt?x=17bde74f800000
kernel config:  https://syzkaller.appspot.com/x/.config?x=848e40757852af3e
dashboard link: https://syzkaller.appspot.com/bug?extid=da4494182233c23a5fcf
compiler:       clang version 7.0.0 (trunk 334104)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+da4494182233c23a5fcf@syzkaller.appspotmail.com

==================================================================
BUG: KMSAN: uninit-value in ebt_stp_mt_check+0x24b/0x450  
net/bridge/netfilter/ebt_stp.c:162
CPU: 0 PID: 12006 Comm: syz-executor7 Not tainted 4.17.0+ #3
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x185/0x1d0 lib/dump_stack.c:113
  kmsan_report+0x149/0x260 mm/kmsan/kmsan.c:1084
  __msan_warning_32+0x6e/0xc0 mm/kmsan/kmsan_instr.c:620
  ebt_stp_mt_check+0x24b/0x450 net/bridge/netfilter/ebt_stp.c:162
  xt_check_match+0x1438/0x1650 net/netfilter/x_tables.c:506
  ebt_check_match net/bridge/netfilter/ebtables.c:372 [inline]
  ebt_check_entry net/bridge/netfilter/ebtables.c:702 [inline]
  translate_table+0x4e88/0x6120 net/bridge/netfilter/ebtables.c:943
  do_replace_finish+0x1258/0x2ea0 net/bridge/netfilter/ebtables.c:999
  do_replace+0x719/0x780 net/bridge/netfilter/ebtables.c:1138
  do_ebt_set_ctl+0x2ab/0x3c0 net/bridge/netfilter/ebtables.c:1517
  nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
  nf_setsockopt+0x47c/0x4e0 net/netfilter/nf_sockopt.c:115
  ip_setsockopt+0x24b/0x2b0 net/ipv4/ip_sockglue.c:1251
  udp_setsockopt+0x108/0x1b0 net/ipv4/udp.c:2416
  sock_common_setsockopt+0x13b/0x170 net/core/sock.c:3039
  __sys_setsockopt+0x496/0x540 net/socket.c:1903
  __do_sys_setsockopt net/socket.c:1914 [inline]
  __se_sys_setsockopt net/socket.c:1911 [inline]
  __x64_sys_setsockopt+0x15c/0x1c0 net/socket.c:1911
  do_syscall_64+0x15b/0x230 arch/x86/entry/common.c:287
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x4559f9
RSP: 002b:00007f45b9246c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 00007f45b92476d4 RCX: 00000000004559f9
RDX: 0000000000000080 RSI: 0000000000000000 RDI: 0000000000000014
RBP: 000000000072bea0 R08: 0000000000000300 R09: 0000000000000000
R10: 0000000020000480 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000004c0d6d R14: 00000000004d07c8 R15: 0000000000000000

Local variable description: ----mtpar.i@translate_table
Variable was created at:
  translate_table+0xbb/0x6120 net/bridge/netfilter/ebtables.c:831
  do_replace_finish+0x1258/0x2ea0 net/bridge/netfilter/ebtables.c:999
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.

^ permalink raw reply

* [PATCH net-next] umh: fix race condition
From: Alexei Starovoitov @ 2018-06-07 17:23 UTC (permalink / raw)
  To: David S . Miller
  Cc: daniel, mcgrof, netdev, dvyukov, linux-kernel, kernel-team

kasan reported use-after-free:
BUG: KASAN: use-after-free in call_usermodehelper_exec_work+0x2d3/0x310 kernel/umh.c:195
Write of size 4 at addr ffff8801d9202370 by task kworker/u4:2/50
Workqueue: events_unbound call_usermodehelper_exec_work
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 print_address_description+0x6c/0x20b mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
 __asan_report_store4_noabort+0x17/0x20 mm/kasan/report.c:437
 call_usermodehelper_exec_work+0x2d3/0x310 kernel/umh.c:195
 process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
 worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
 kthread+0x345/0x410 kernel/kthread.c:240
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412

The reason is that 'sub_info' cannot be accessed out of parent task
context, since it will be freed by the child.
Instead remember the pid in the child task.

Fixes: 449325b52b7a ("umh: introduce fork_usermode_blob() helper")
Reported-by: syzbot+2c73319c406f1987d156@syzkaller.appspotmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 kernel/umh.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/umh.c b/kernel/umh.c
index 30db93fd7e39..c449858946af 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -99,6 +99,7 @@ static int call_usermodehelper_exec_async(void *data)
 
 	commit_creds(new);
 
+	sub_info->pid = task_pid_nr(current);
 	if (sub_info->file)
 		retval = do_execve_file(sub_info->file,
 					sub_info->argv, sub_info->envp);
@@ -191,8 +192,6 @@ static void call_usermodehelper_exec_work(struct work_struct *work)
 		if (pid < 0) {
 			sub_info->retval = pid;
 			umh_complete(sub_info);
-		} else {
-			sub_info->pid = pid;
 		}
 	}
 }
-- 
2.9.5

^ permalink raw reply related

* Re: [PATCH net] failover: eliminate callback hell
From: Michael S. Tsirkin @ 2018-06-07 17:22 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Alexander Duyck, Samudrala, Sridhar, Jiri Pirko, KY Srinivasan,
	Haiyang Zhang, David Miller, Netdev, Stephen Hemminger
In-Reply-To: <20180607091742.461dff83@xeon-e3>

On Thu, Jun 07, 2018 at 09:17:42AM -0700, Stephen Hemminger wrote:
> On Thu, 7 Jun 2018 18:41:31 +0300
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > > > Why would DPDK care what we do in the kernel? Isn't it just slapping
> > > > vfio-pci on the netdevs it sees?  
> > > 
> > > Alex, you are correct for Intel devices; but DPDK on Azure is not Intel based.,.
> > > The DPDK support uses:
> > >  * Mellanox MLX5 which uses the Infinband hooks to do DMA directly to
> > >    userspace. This means VF netdev device must exist and be visible.
> > >  * Slow path using kernel netvsc device, TAP and BPF to get exception
> > >    path packets to userspace.
> > >  * A autodiscovery mechanism that to set all this up that relies on
> > >    2 device model and sysfs.  
> > 
> > Could you describe what does it look for exactly? What will break if
> > instead of MLX5 being a child of the PV, it's a child of the failover
> > device?
> 
> So in DPDK there is an internal four device model:
> 	1. failsafe is like failover in your model
> 	2. TAP is used like netvsc in kernel
> 	3. MLX5 is the VF
> 	4. vdev_netvsc is a pseudo device whose only reason to exist
> 	   is to glue everything together.
> 
> Digging deeper inside...
> 
> Vdev_netvsc does:
>    * driver is started in a convuluted way off device arguments
>    * probe routine for driver runs
>       - scans list of kernel interfaces in sysfs
>       - matches those using VMBUS 

Could you tell a bit more what does this step entail?

>       - skip netvsc devices that have an IPV4 route
>    * scan for PCI devices that have same MAC address as kernel netvsc
>      devices discovered in previous step
>    * add these interfaces to arguments to failsafe
> 
> Then failsafe configures based on arguments on device
> 
> The code works but is specific to the Azure hardware model, and exposes lots
> of things to application that it should not have to care about.
> 
> If you  try and walk through this code in DPDK, you will see why I have developed
> a dislike for high levels of indirection.
> 
> 
> 	   

Thanks that was helpful!  I'll try to poke at it next week.  Just from
the description it seems the kernel is merely used to locate the MAC
address through sysfs and that for this DPDK code to keep working the
hidden device must be hidden from it in sysfs - is that a fair summary?

-- 
MST

^ permalink raw reply

* Re: [RFC PATCH 1/2] net: macb: Add CAP to disable hardware TX checksum offloading
From: Jennifer Dahm @ 2018-06-07 16:43 UTC (permalink / raw)
  To: Nicolas Ferre, netdev, David S . Miller; +Cc: Nathan Sullivan
In-Reply-To: <ad25fd2b-a3a0-28d5-4dd1-a7ccc935fff2@microchip.com>

Hi Nicolas,

On 06/04/2018 10:13 AM, Nicolas Ferre wrote:
> On 25/05/2018 at 23:44, Jennifer Dahm wrote:
>> diff --git a/drivers/net/ethernet/cadence/macb_main.c 
>> b/drivers/net/ethernet/cadence/macb_main.c
>> index 3e93df5..a5d564b 100644
>> --- a/drivers/net/ethernet/cadence/macb_main.c
>> +++ b/drivers/net/ethernet/cadence/macb_main.c
>> @@ -3360,8 +3360,12 @@ static int macb_init(struct platform_device 
>> *pdev)
>>           dev->hw_features |= MACB_NETIF_LSO;
>>         /* Checksum offload is only available on gem with packet 
>> buffer */
>> -    if (macb_is_gem(bp) && !(bp->caps & MACB_CAPS_FIFO_MODE))
>> -        dev->hw_features |= NETIF_F_HW_CSUM | NETIF_F_RXCSUM;
>> +    if (macb_is_gem(bp) && !(bp->caps & MACB_CAPS_FIFO_MODE)) {
>> +        if (!(bp->caps & MACB_CAPS_DISABLE_TX_HW_CSUM))
>
> Why not the other way around? negating a "disabled" feature is always 
> challenge ;-)
>
>> +            dev->hw_features |= NETIF_F_HW_CSUM | NETIF_F_RXCSUM;
>> +        else
>> +            dev->hw_features |= NETIF_F_RXCSUM;
>> +    }
>>       if (bp->caps & MACB_CAPS_SG_DISABLED)
>>           dev->hw_features &= ~NETIF_F_SG;
>>       dev->features = dev->hw_features;

I can switch the ordering of the if-else clauses if that's what you're 
nitpicking. ;)

Alternatively, if you're asking why the flag is used to disable rather 
than enable checksum offloading: I was working under the assumption that 
this was an isolated bug, and so an opt-out would require less 
maintainance than an opt-in. If we discover that this is a problem 
across a wide variety of Cadence IP, it would definitely make sense to 
replace it with an opt-in (i.e. MACB_CAPS_TX_HW_CSUM_ENABLED).

Best,
Jennifer Dahm

^ permalink raw reply

* Re: KMSAN: uninit-value in _copy_to_iter (2)
From: Michael S. Tsirkin @ 2018-06-07 17:10 UTC (permalink / raw)
  To: syzbot
  Cc: avagin, davem, dingtianhong, edumazet, elena.reshetova, jasowang,
	kvm, linux-kernel, matthew, mingo, netdev, pabeni, syzkaller-bugs,
	viro, virtualization, willemb
In-Reply-To: <000000000000cf4578056ab12452@google.com>

#syz test: https://github.com/google/kmsan.git master

Subject: vhost: fix info leak

Fixes: CVE-2018-1118
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index f0be5f35ab28..9beefa6ed1ce 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2345,6 +2345,9 @@ struct vhost_msg_node *vhost_new_msg(struct vhost_virtqueue *vq, int type)
 	struct vhost_msg_node *node = kmalloc(sizeof *node, GFP_KERNEL);
 	if (!node)
 		return NULL;
+
+	/* Make sure all padding within the structure is initialized. */
+	memset(&node->msg, 0, sizeof node->msg);
 	node->vq = vq;
 	node->msg.type = type;
 	return node;

^ permalink raw reply related

* Re: KMSAN: uninit-value in _copy_to_iter (2)
From: syzbot @ 2018-06-07 17:04 UTC (permalink / raw)
  To: avagin, davem, dingtianhong, dvyukov, edumazet, elena.reshetova,
	jasowang, kvm, linux-kernel, matthew, mingo, mst, netdev, pabeni,
	syzkaller-bugs, viro, virtualization, willemb
In-Reply-To: <CACT4Y+btyLi+z09EO0JU-7AS_OfRbWiZvGn-KP9fndevh+xFNw@mail.gmail.com>

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger  
crash:

Reported-and-tested-by:  
syzbot+87cfa083e727a224754b@syzkaller.appspotmail.com

Tested on:

commit:         c6a6aed994b6 kmsan: remove dead code to trigger syzbot build
git tree:       https://github.com/google/kmsan.git/master
kernel config:  https://syzkaller.appspot.com/x/.config?x=848e40757852af3e
compiler:       clang version 7.0.0 (trunk 334104)
patch:          https://syzkaller.appspot.com/x/patch.diff?x=17dc3a8f800000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply

* Re: KASAN: use-after-free Write in bpf_tcp_close
From: Dmitry Vyukov @ 2018-06-07 16:58 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: syzbot, Alexei Starovoitov, LKML, netdev, syzkaller-bugs,
	John Fastabend
In-Reply-To: <ca3eff64-adb0-6110-38ca-2e07b6ecc3d6@iogearbox.net>

On Mon, May 28, 2018 at 12:15 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> [ +John ]
>
> On 05/27/2018 10:06 PM, syzbot wrote:
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit:    ff4fb475cea8 Merge branch 'btf-uapi-cleanups'
>> git tree:       bpf-next
>> console output: https://syzkaller.appspot.com/x/log.txt?x=12b3d577800000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b632d8e2c2ab2c1
>> dashboard link: https://syzkaller.appspot.com/bug?extid=31025a5f3f7650081204
>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=109a2f37800000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=171a727b800000
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+31025a5f3f7650081204@syzkaller.appspotmail.com
>
> Should be fixed by: https://patchwork.ozlabs.org/patch/920695/

#syz fix: bpf: sockhash fix race with bpf_tcp_close and map delete

>> ==================================================================
>> BUG: KASAN: use-after-free in cmpxchg_size include/asm-generic/atomic-instrumented.h:355 [inline]
>> BUG: KASAN: use-after-free in bpf_tcp_close+0x6f5/0xf80 kernel/bpf/sockmap.c:265
>> Write of size 8 at addr ffff8801ca277680 by task syz-executor749/9723
>>
>> CPU: 0 PID: 9723 Comm: syz-executor749 Not tainted 4.17.0-rc4+ #19
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:77 [inline]
>>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>>  print_address_description+0x6c/0x20b mm/kasan/report.c:256
>>  kasan_report_error mm/kasan/report.c:354 [inline]
>>  kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
>>  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>>  check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
>>  kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278
>>  cmpxchg_size include/asm-generic/atomic-instrumented.h:355 [inline]
>>  bpf_tcp_close+0x6f5/0xf80 kernel/bpf/sockmap.c:265
>>  inet_release+0x104/0x1f0 net/ipv4/af_inet.c:427
>>  inet6_release+0x50/0x70 net/ipv6/af_inet6.c:459
>>  sock_release+0x96/0x1b0 net/socket.c:594
>>  sock_close+0x16/0x20 net/socket.c:1149
>>  __fput+0x34d/0x890 fs/file_table.c:209
>>  ____fput+0x15/0x20 fs/file_table.c:243
>>  task_work_run+0x1e4/0x290 kernel/task_work.c:113
>>  exit_task_work include/linux/task_work.h:22 [inline]
>>  do_exit+0x1aee/0x2730 kernel/exit.c:865
>>  do_group_exit+0x16f/0x430 kernel/exit.c:968
>>  __do_sys_exit_group kernel/exit.c:979 [inline]
>>  __se_sys_exit_group kernel/exit.c:977 [inline]
>>  __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:977
>>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> RIP: 0033:0x440a59
>> RSP: 002b:00007ffdadf92488 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000440a59
>> RDX: 0000000000440a59 RSI: 0000000000000020 RDI: 0000000000000000
>> RBP: 0000000000000000 R08: 00000000004002c8 R09: 0000000000401ea0
>> R10: 00000000004002c8 R11: 0000000000000206 R12: 000000000001b5ac
>> R13: 0000000000401ea0 R14: 0000000000000000 R15: 0000000000000000
>>
>> Allocated by task 9723:
>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>>  set_track mm/kasan/kasan.c:460 [inline]
>>  kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
>>  __do_kmalloc_node mm/slab.c:3682 [inline]
>>  __kmalloc_node+0x47/0x70 mm/slab.c:3689
>>  kmalloc_node include/linux/slab.h:554 [inline]
>>  bpf_map_area_alloc+0x3f/0x90 kernel/bpf/syscall.c:144
>>  sock_map_alloc+0x376/0x410 kernel/bpf/sockmap.c:1555
>>  find_and_alloc_map kernel/bpf/syscall.c:126 [inline]
>>  map_create+0x393/0x1010 kernel/bpf/syscall.c:448
>>  __do_sys_bpf kernel/bpf/syscall.c:2128 [inline]
>>  __se_sys_bpf kernel/bpf/syscall.c:2105 [inline]
>>  __x64_sys_bpf+0x300/0x4f0 kernel/bpf/syscall.c:2105
>>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>
>> Freed by task 4521:
>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>>  set_track mm/kasan/kasan.c:460 [inline]
>>  __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
>>  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
>>  __cache_free mm/slab.c:3498 [inline]
>>  kfree+0xd9/0x260 mm/slab.c:3813
>>  kvfree+0x61/0x70 mm/util.c:440
>>  bpf_map_area_free+0x15/0x20 kernel/bpf/syscall.c:155
>>  sock_map_remove_complete kernel/bpf/sockmap.c:1443 [inline]
>>  sock_map_free+0x408/0x540 kernel/bpf/sockmap.c:1619
>>  bpf_map_free_deferred+0xba/0xf0 kernel/bpf/syscall.c:259
>>  process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
>>  worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
>>  kthread+0x345/0x410 kernel/kthread.c:238
>>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
>>
>> The buggy address belongs to the object at ffff8801ca277680
>>  which belongs to the cache kmalloc-1024 of size 1024
>> The buggy address is located 0 bytes inside of
>>  1024-byte region [ffff8801ca277680, ffff8801ca277a80)
>> The buggy address belongs to the page:
>> page:ffffea0007289d80 count:1 mapcount:0 mapping:ffff8801ca276000 index:0x0 compound_mapcount: 0
>> flags: 0x2fffc0000008100(slab|head)
>> raw: 02fffc0000008100 ffff8801ca276000 0000000000000000 0000000100000007
>> raw: ffffea0006d12b20 ffffea000763bba0 ffff8801da800ac0 0000000000000000
>> page dumped because: kasan: bad access detected
>>
>> Memory state around the buggy address:
>>  ffff8801ca277580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>  ffff8801ca277600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>> ffff8801ca277680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>                    ^
>>  ffff8801ca277700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>  ffff8801ca277780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ==================================================================
>>
>>
>> ---
>> This bug is generated by a bot. It may contain errors.
>> See https://goo.gl/tpsmEJ for more information about syzbot.
>> syzbot engineers can be reached at syzkaller@googlegroups.com.
>>
>> syzbot will keep track of this bug report. See:
>> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.
>> syzbot can test patches for this bug, for details see:
>> https://goo.gl/tpsmEJ#testing-patches
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/ca3eff64-adb0-6110-38ca-2e07b6ecc3d6%40iogearbox.net.
> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox