Netdev List


* INFO: task hung in ip6gre_exit_batch_net
From: syzbot @ 2018-06-04 15:03 UTC
  To: christian.brauner, davem, dsahern, fw, jbenc, ktkhai,
	linux-kernel, lucien.xin, mschiffer, netdev, syzkaller-bugs,
	vyasevich

Hello,

syzbot found the following crash on:

HEAD commit:    bc2dbc5420e8 Merge branch 'akpm' (patches from Andrew)
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=164e42b7800000
kernel config:  https://syzkaller.appspot.com/x/.config?x=982e2df1b9e60b02
dashboard link: https://syzkaller.appspot.com/bug?extid=bf78a74f82c1cf19069e
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+bf78a74f82c1cf19069e@syzkaller.appspotmail.com

INFO: task kworker/u4:1:22 blocked for more than 120 seconds.
       Not tainted 4.17.0-rc6+ #68
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u4:1    D13944    22      2 0x80000000
Workqueue: netns cleanup_net
Call Trace:
  context_switch kernel/sched/core.c:2859 [inline]
  __schedule+0x801/0x1e30 kernel/sched/core.c:3501
  schedule+0xef/0x430 kernel/sched/core.c:3545
  schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3603
  __mutex_lock_common kernel/locking/mutex.c:833 [inline]
  __mutex_lock+0xe38/0x17f0 kernel/locking/mutex.c:893
  mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
  rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
  ip6gre_exit_batch_net+0xd5/0x7d0 net/ipv6/ip6_gre.c:1585
  ops_exit_list.isra.7+0x105/0x160 net/core/net_namespace.c:155
  cleanup_net+0x51d/0xb20 net/core/net_namespace.c:523
  process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
  worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
  kthread+0x345/0x410 kernel/kthread.c:240
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412

Showing all locks held in the system:
4 locks held by kworker/u4:1/22:
  #0: 0000000049a7b590 ((wq_completion)"%s""netns"){+.+.}, at:  
__write_once_size include/linux/compiler.h:215 [inline]
  #0: 0000000049a7b590 ((wq_completion)"%s""netns"){+.+.}, at:  
arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
  #0: 0000000049a7b590 ((wq_completion)"%s""netns"){+.+.}, at: atomic64_set  
include/asm-generic/atomic-instrumented.h:40 [inline]
  #0: 0000000049a7b590 ((wq_completion)"%s""netns"){+.+.}, at:  
atomic_long_set include/asm-generic/atomic-long.h:57 [inline]
  #0: 0000000049a7b590 ((wq_completion)"%s""netns"){+.+.}, at: set_work_data  
kernel/workqueue.c:617 [inline]
  #0: 0000000049a7b590 ((wq_completion)"%s""netns"){+.+.}, at:  
set_work_pool_and_clear_pending kernel/workqueue.c:644 [inline]
  #0: 0000000049a7b590 ((wq_completion)"%s""netns"){+.+.}, at:  
process_one_work+0xaef/0x1b50 kernel/workqueue.c:2116
  #1: 0000000030a00b6b (net_cleanup_work){+.+.}, at:  
process_one_work+0xb46/0x1b50 kernel/workqueue.c:2120
  #2: 000000007eb35e65 (pernet_ops_rwsem){++++}, at: cleanup_net+0x11a/0xb20  
net/core/net_namespace.c:490
  #3: 000000007eb32c75 (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20  
net/core/rtnetlink.c:74
3 locks held by kworker/1:1/24:
  #0: 000000001c9e6580 ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:  
__write_once_size include/linux/compiler.h:215 [inline]
  #0: 000000001c9e6580 ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:  
arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
  #0: 000000001c9e6580 ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:  
atomic64_set include/asm-generic/atomic-instrumented.h:40 [inline]
  #0: 000000001c9e6580 ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:  
atomic_long_set include/asm-generic/atomic-long.h:57 [inline]
  #0: 000000001c9e6580 ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:  
set_work_data kernel/workqueue.c:617 [inline]
  #0: 000000001c9e6580 ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:  
set_work_pool_and_clear_pending kernel/workqueue.c:644 [inline]
  #0: 000000001c9e6580 ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:  
process_one_work+0xaef/0x1b50 kernel/workqueue.c:2116
  #1: 000000009edcfbe7 ((addr_chk_work).work){+.+.}, at:  
process_one_work+0xb46/0x1b50 kernel/workqueue.c:2120
  #2: 000000007eb32c75 (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20  
net/core/rtnetlink.c:74
2 locks held by khungtaskd/893:
  #0: 000000007eeb621a (rcu_read_lock){....}, at:  
check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
  #0: 000000007eeb621a (rcu_read_lock){....}, at: watchdog+0x1ff/0xf60  
kernel/hung_task.c:249
  #1: 00000000239f1b5e (tasklist_lock){.+.+}, at:  
debug_show_all_locks+0xde/0x34a kernel/locking/lockdep.c:4470
2 locks held by getty/4481:
  #0: 00000000cc114025 (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: 000000006ad1f3fc (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
2 locks held by getty/4482:
  #0: 00000000226a16cc (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: 000000008cee8cdc (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
2 locks held by getty/4483:
  #0: 0000000067bd3c39 (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: 000000005d8bc81d (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
2 locks held by getty/4484:
  #0: 00000000f0f8d839 (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: 00000000a9d5f091 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
2 locks held by getty/4485:
  #0: 000000002c96ee9a (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: 0000000033338ac7 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
2 locks held by getty/4486:
  #0: 00000000f6db39b5 (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: 00000000bb7c6099 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
2 locks held by getty/4487:
  #0: 000000006be9659d (&tty->ldisc_sem){++++}, at:  
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: 00000000e2edd3d0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
3 locks held by kworker/1:3/27147:
  #0: 00000000c208aa7f ((wq_completion)"events"){+.+.}, at:  
__write_once_size include/linux/compiler.h:215 [inline]
  #0: 00000000c208aa7f ((wq_completion)"events"){+.+.}, at:  
arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
  #0: 00000000c208aa7f ((wq_completion)"events"){+.+.}, at: atomic64_set  
include/asm-generic/atomic-instrumented.h:40 [inline]
  #0: 00000000c208aa7f ((wq_completion)"events"){+.+.}, at: atomic_long_set  
include/asm-generic/atomic-long.h:57 [inline]
  #0: 00000000c208aa7f ((wq_completion)"events"){+.+.}, at: set_work_data  
kernel/workqueue.c:617 [inline]
  #0: 00000000c208aa7f ((wq_completion)"events"){+.+.}, at:  
set_work_pool_and_clear_pending kernel/workqueue.c:644 [inline]
  #0: 00000000c208aa7f ((wq_completion)"events"){+.+.}, at:  
process_one_work+0xaef/0x1b50 kernel/workqueue.c:2116
  #1: 00000000fa87e61f (deferred_process_work){+.+.}, at:  
process_one_work+0xb46/0x1b50 kernel/workqueue.c:2120
  #2: 000000007eb32c75 (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20  
net/core/rtnetlink.c:74
1 lock held by syz-executor7/29665:
  #0: 000000006e20d618 (sk_lock-AF_INET6){+.+.}, at: lock_sock  
include/net/sock.h:1469 [inline]
  #0: 000000006e20d618 (sk_lock-AF_INET6){+.+.}, at:  
tls_sw_sendmsg+0x1b9/0x12e0 net/tls/tls_sw.c:384
1 lock held by syz-executor7/29689:
  #0: 000000007eb32c75 (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20  
net/core/rtnetlink.c:74

=============================================

NMI backtrace for cpu 0
CPU: 0 PID: 893 Comm: khungtaskd Not tainted 4.17.0-rc6+ #68
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
  nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
  nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
  trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline]
  check_hung_task kernel/hung_task.c:132 [inline]
  check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline]
  watchdog+0xc10/0xf60 kernel/hung_task.c:249
  kthread+0x345/0x410 kernel/kthread.c:240
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1 skipped: idling at native_safe_halt+0x6/0x10  
arch/x86/include/asm/irqflags.h:54


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.


* Re: [PATCH net] ipv4: igmp: hold wakelock to prevent delayed reports
From: Tejaswi Tanikella @ 2018-06-04 15:03 UTC
  To: Florian Fainelli; +Cc: netdev, Andrew Lunn
In-Reply-To: <8661f50b-876f-50d8-f1a5-61b1dabc5398@gmail.com>

On Fri, Jun 01, 2018 at 07:45:16AM -0700, Florian Fainelli wrote:

Thank you Florian for reviewing my patch.
I had a few questions.
> 
> 
> On 06/01/2018 07:05 AM, Tejaswi Tanikella wrote:
> > On receiving an IGMPv2/v3 query, based on max_delay set in the header a
> > timer is started to send out a response after a random time within
> > max_delay. If the system then moves into suspend state, Report is
> > delayed until system wakes up.
> > 
> > In one reported scenario, on arm64 devices, max_delay was set to 10s,
> > Reports were consistently delayed if the timer is scheduled after 5 plus
> > seconds.
> > 
> > Hold wakelock while starting the timer to prevent moving into suspend
> > state.
> 
> I suppose this looks fine, but are not you going to be playing
> whack-a-mole through the network stack wherever there are similar
> patterns? Is not a better solution to move this to
> in_dev_get()/in_dev_put() where you could create a proper wakelock per
> network device?
I see that in_dev_get()/in_dev_put() are being used by some drivers.
Won't the addition of a wakelock be unnecessary for them?
Will adding two independent functions per in_dev to hold and release the
wakelock be sufficient?


> For instance, I could imagine ARP suffering from the same shortcomings...
I am not sure if ARP will be affected.
From my understanding, only systems with timers that trigger sending out
responses should be affected. For example, IGMP and MLD must send reports
for the router to forward multicast packets.
In ARP, for instance, only moving between states will be delayed. Is there
something that I am missing?


> 
> There is one thing that needs fixing though, see below.
Thanks, I'll fix them in the next patch.
> 
> > 
> > Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org>
> > ---
> >  include/linux/igmp.h |  1 +
> >  net/ipv4/igmp.c      | 20 ++++++++++++++++++--
> >  2 files changed, 19 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/linux/igmp.h b/include/linux/igmp.h
> > index f823185..9be1c58 100644
> > --- a/include/linux/igmp.h
> > +++ b/include/linux/igmp.h
> > @@ -84,6 +84,7 @@ struct ip_mc_list {
> >  	};
> >  	struct ip_mc_list __rcu *next_hash;
> >  	struct timer_list	timer;
> > +	struct wakeup_source	*wakeup_src;
> 
> Since you are using this only when CONFIG_IP_MULTICAST is defined, you
> might as well save a few bytes by enclosing this within an ifdef
> CONFIG_IP_MULTICAST here as well?
> 
> [snip]
> 
> > @@ -1415,6 +1429,8 @@ void ip_mc_inc_group(struct in_device *in_dev, __be32 addr)
> >  #ifdef CONFIG_IP_MULTICAST
> >  	timer_setup(&im->timer, igmp_timer_expire, 0);
> >  	im->unsolicit_count = net->ipv4.sysctl_igmp_qrv;
> > +	im->wakeup_src = wakeup_source_create("igmp_wakeup_source");
> 
> Missing error checking, wakeup_source_create() can return NULL here.
> -- 
> Florian
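
The missing error check pointed out above can be illustrated with a minimal
userspace sketch. Everything prefixed with model_ is a hypothetical stand-in
(the real wakeup_source_create() and struct ip_mc_list live in the kernel),
and returning -ENOMEM is an assumption of this model only; how the real
ip_mc_inc_group() should react to the failure is left to the patch author.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct wakeup_source. */
struct model_wakeup_source {
	const char *name;
};

/* Set to non-zero to simulate an allocation failure. */
static int model_fail_alloc;

/* Models wakeup_source_create(): returns NULL when allocation fails. */
static struct model_wakeup_source *model_wakeup_source_create(const char *name)
{
	struct model_wakeup_source *ws;

	if (model_fail_alloc)
		return NULL;

	ws = malloc(sizeof(*ws));
	if (ws)
		ws->name = name;
	return ws;
}

/* Hypothetical stand-in for the part of struct ip_mc_list touched here. */
struct model_mc_list {
	struct model_wakeup_source *wakeup_src;
};

/* Checks the result of the create call instead of using it blindly. */
static int model_mc_list_init(struct model_mc_list *im)
{
	im->wakeup_src = model_wakeup_source_create("igmp_wakeup_source");
	if (!im->wakeup_src)
		return -ENOMEM;
	return 0;
}

/* Releases the wakeup source created by model_mc_list_init(). */
static void model_mc_list_destroy(struct model_mc_list *im)
{
	free(im->wakeup_src);
	im->wakeup_src = NULL;
}
```

With model_fail_alloc set, model_mc_list_init() reports the failure and
leaves wakeup_src as NULL, so later code can test the pointer before use.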


* Re: [net-next][PATCH] tcp: probe timer MUST not less than 5 minuter for tcp PMTU
From: David Miller @ 2018-06-04 14:59 UTC
  To: lirongqing; +Cc: netdev
In-Reply-To: <1527851039-6626-1-git-send-email-lirongqing@baidu.com>

From: Li RongQing <lirongqing@baidu.com>
Date: Fri,  1 Jun 2018 19:03:59 +0800

> RFC 4821 says: The value for this timer MUST NOT be less than
> 5 minutes and is recommended to be 10 minutes, per RFC 1981.
> 
> Signed-off-by: Li RongQing <lirongqing@baidu.com>

As Eric stated, admins should be allowed to set this however they
wish (e.g. for specialized tests).


* Re: [PATCH net-next v2 1/3] bpf: implement bpf_get_current_cgroup_id() helper
From: Alexei Starovoitov @ 2018-06-04 14:58 UTC
  To: Daniel Borkmann; +Cc: Yonghong Song, ast, netdev, kernel-team
In-Reply-To: <0032678a-67ee-78ab-47d7-4ea3ecb1edd7@iogearbox.net>

On Mon, Jun 04, 2018 at 11:08:35AM +0200, Daniel Borkmann wrote:
> On 06/04/2018 12:59 AM, Yonghong Song wrote:
> > bpf has been used extensively for tracing. For example, bcc
> > contains an almost full set of bpf-based tools to trace kernel
> > and user functions/events. Most tracing tools are currently
> > either filtered based on pid or system-wide.
> > 
> > Containers have been used quite extensively in industry and
> > cgroup is often used together to provide resource isolation
> > and protection. Several processes may run inside the same
> > container. It is often desirable to get container-level tracing
> > results as well, e.g. syscall count, function count, I/O
> > activity, etc.
> > 
> > This patch implements a new helper, bpf_get_current_cgroup_id(),
> > which will return cgroup id based on the cgroup within which
> > the current task is running.
> > 
> > The later patch will provide an example to show that
> > userspace can get the same cgroup id so it could
> > configure a filter or policy in the bpf program based on
> > task cgroup id.
> > 
> > The helper is currently implemented for tracing. It can
> > be added to other program types as well when needed.
> > 
> > Acked-by: Alexei Starovoitov <ast@kernel.org>
> > Signed-off-by: Yonghong Song <yhs@fb.com>
> > ---
> >  include/linux/bpf.h      |  1 +
> >  include/uapi/linux/bpf.h |  8 +++++++-
> >  kernel/bpf/core.c        |  1 +
> >  kernel/bpf/helpers.c     | 15 +++++++++++++++
> >  kernel/trace/bpf_trace.c |  2 ++
> >  5 files changed, 26 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index bbe2974..995c3b1 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -746,6 +746,7 @@ extern const struct bpf_func_proto bpf_get_stackid_proto;
> >  extern const struct bpf_func_proto bpf_get_stack_proto;
> >  extern const struct bpf_func_proto bpf_sock_map_update_proto;
> >  extern const struct bpf_func_proto bpf_sock_hash_update_proto;
> > +extern const struct bpf_func_proto bpf_get_current_cgroup_id_proto;
> >  
> >  /* Shared helpers among cBPF and eBPF. */
> >  void bpf_user_rnd_init_once(void);
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index f0b6608..18712b0 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -2070,6 +2070,11 @@ union bpf_attr {
> >   * 		**CONFIG_SOCK_CGROUP_DATA** configuration option.
> >   * 	Return
> >   * 		The id is returned or 0 in case the id could not be retrieved.
> > + *
> > + * u64 bpf_get_current_cgroup_id(void)
> > + * 	Return
> > + * 		A 64-bit integer containing the current cgroup id based
> > + * 		on the cgroup within which the current task is running.
> >   */
> >  #define __BPF_FUNC_MAPPER(FN)		\
> >  	FN(unspec),			\
> > @@ -2151,7 +2156,8 @@ union bpf_attr {
> >  	FN(lwt_seg6_action),		\
> >  	FN(rc_repeat),			\
> >  	FN(rc_keydown),			\
> > -	FN(skb_cgroup_id),
> > +	FN(skb_cgroup_id),		\
> > +	FN(get_current_cgroup_id),
> >  
> >  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> >   * function eBPF program intends to call
> > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > index 527587d..9f14937 100644
> > --- a/kernel/bpf/core.c
> > +++ b/kernel/bpf/core.c
> > @@ -1765,6 +1765,7 @@ const struct bpf_func_proto bpf_get_current_uid_gid_proto __weak;
> >  const struct bpf_func_proto bpf_get_current_comm_proto __weak;
> >  const struct bpf_func_proto bpf_sock_map_update_proto __weak;
> >  const struct bpf_func_proto bpf_sock_hash_update_proto __weak;
> > +const struct bpf_func_proto bpf_get_current_cgroup_id_proto __weak;
> >  
> >  const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void)
> >  {
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index 3d24e23..73065e2 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -179,3 +179,18 @@ const struct bpf_func_proto bpf_get_current_comm_proto = {
> >  	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
> >  	.arg2_type	= ARG_CONST_SIZE,
> >  };
> > +
> > +#ifdef CONFIG_CGROUPS
> > +BPF_CALL_0(bpf_get_current_cgroup_id)
> > +{
> > +	struct cgroup *cgrp = task_dfl_cgroup(current);
> > +
> > +	return cgrp->kn->id.id;
> > +}
> > +
> > +const struct bpf_func_proto bpf_get_current_cgroup_id_proto = {
> > +	.func		= bpf_get_current_cgroup_id,
> > +	.gpl_only	= false,
> > +	.ret_type	= RET_INTEGER,
> > +};
> > +#endif
> 
> Nit: why not move this function directly to bpf_trace.c?

my preference would be to keep it in helpers.c as-is.
imo bpf_trace.c is only for things that depend on kernel/trace/

> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 752992c..e2ab5b7 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -564,6 +564,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >  		return &bpf_get_prandom_u32_proto;
> >  	case BPF_FUNC_probe_read_str:
> >  		return &bpf_probe_read_str_proto;
> > +	case BPF_FUNC_get_current_cgroup_id:
> > +		return &bpf_get_current_cgroup_id_proto;
> 
> When you have !CONFIG_CGROUPS, then it relies on the weak definition of the
> bpf_get_current_cgroup_id_proto, which I would think at latest in fixup_bpf_calls()
> bails out with 'kernel subsystem misconfigured func' due to func being NULL.
> 
> Can't we just do the #ifdef CONFIG_CGROUPS around BPF_FUNC_get_current_cgroup_id
> case instead? Then we bail out normally with 'unknown func' when cgroups are
> not configured?

good idea.
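
The suggestion above can be sketched in plain C as follows. This is a
simplified userspace model, not kernel code: every model_/MODEL_ name is
hypothetical, and MODEL_CONFIG_CGROUPS stands in for CONFIG_CGROUPS. With the
#ifdef placed around the case label, a build without cgroups falls through to
the default branch, which models the normal 'unknown func' path, rather than
handing back a proto whose func pointer is NULL.

```c
#include <stddef.h>

/* Hypothetical model of the helper IDs; not the kernel's enum bpf_func_id. */
enum model_func_id {
	MODEL_FUNC_probe_read_str,
	MODEL_FUNC_get_current_cgroup_id,
};

/* Hypothetical model of struct bpf_func_proto, reduced to one field. */
struct model_func_proto {
	unsigned long (*func)(void);
};

static unsigned long model_probe_read_str(void)
{
	return 0;
}

static const struct model_func_proto model_probe_read_str_proto = {
	.func = model_probe_read_str,
};

#ifdef MODEL_CONFIG_CGROUPS
static unsigned long model_get_current_cgroup_id(void)
{
	return 1;
}

static const struct model_func_proto model_get_current_cgroup_id_proto = {
	.func = model_get_current_cgroup_id,
};
#endif

/* Models tracing_func_proto(): NULL means "unknown func" to the caller. */
static const struct model_func_proto *
model_tracing_func_proto(enum model_func_id func_id)
{
	switch (func_id) {
	case MODEL_FUNC_probe_read_str:
		return &model_probe_read_str_proto;
#ifdef MODEL_CONFIG_CGROUPS
	case MODEL_FUNC_get_current_cgroup_id:
		return &model_get_current_cgroup_id_proto;
#endif
	default:
		return NULL;
	}
}
```

Built without MODEL_CONFIG_CGROUPS, a lookup of
MODEL_FUNC_get_current_cgroup_id returns NULL, so no weak, zero-initialised
definition is ever reachable from the lookup.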


* Re: 答复: ANNOUNCE: Enhanced IP v1.4
From: Tom Herbert @ 2018-06-04 14:56 UTC
  To: Eric Dumazet
  Cc: PKU.孙斌, Willy Tarreau,
	Linux Kernel Network Developers
In-Reply-To: <4d9e164d-58e3-caa0-a378-b9681eefa9d7@gmail.com>

On Mon, Jun 4, 2018 at 6:02 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
> On 06/03/2018 10:58 PM, PKU.孙斌 wrote:
>> On Sun, Jun 03, 2018 at 03:41:08PM -0700, Eric Dumazet wrote:
>>>
>>>
>>> On 06/03/2018 01:37 PM, Tom Herbert wrote:
>>>
>>>> This is not an inconsequential mechanism that is being proposed. It's
>>>> a modification to IP protocol that is intended to work on the
>>>> Internet, but it looks like the draft hasn't been updated for two
>>>> years and it is not adopted by any IETF working group. I don't see how
>>>> this can go anywhere without IETF support. Also, I suggest that you
>>>> look at the IPv10 proposal since that was very similar in intent. One
>>>> of the reasons that IPv10 was shot down was that protocol transition
>>>> mechanisms were more interesting ten years ago than today. IPv6 has
>>>> good traction now. In fact, it's probably the case that it's now
>>>> easier to bring up IPv6 than to try to make IPv4 options work over the
>>>> Internet.
>>>
>>> +1
>>>
>>> Many hosts do not use IPv4 anymore.
>>>
>>> We even have a project to make IPv4 support in Linux optional.
>>
>> I guess then Linux kernel wouldn't be able to boot itself without IPv4 built in, e.g., when we only have old L2 links (without the IPv6 frame type)...
>
>
>
> *Optional* means that a CONFIG_IPV4 would be there, and some people could build a kernel with CONFIG_IPV4=n,
>
> Like IPv6 is optional today.
>
> Of course, most distros will select CONFIG_IPV4=y  (as they probably select CONFIG_IPV6=y today)
>
> Do not worry, IPv4 is not dead, but I doubt Enhanced IP v1.4 has any chance,
> it is at least 10 years too late.

There's also https://www.theregister.co.uk/2018/05/30/internet_engineers_united_nations_ipv6/.
We're reaching the point where it's the transition mechanisms that
are hampering IPv6 adoption.

Tom


* Re: [bpf PATCH v2] bpf: sockmap, fix crash when ipv6 sock is added
From: Daniel Borkmann @ 2018-06-04 14:55 UTC
  To: John Fastabend, Eric Dumazet, edumazet, ast, Dave Watson; +Cc: netdev
In-Reply-To: <aeff8901-2ca4-99b6-6238-93703946b233@gmail.com>

On 06/04/2018 03:57 PM, John Fastabend wrote:
> On 06/04/2018 06:39 AM, Daniel Borkmann wrote:
>> On 06/02/2018 11:39 PM, John Fastabend wrote:
>>> On 06/01/2018 12:58 PM, Eric Dumazet wrote:
>>>> On 06/01/2018 03:46 PM, John Fastabend wrote:
>>>>> This fixes a crash where we assign tcp_prot to IPv6 sockets instead
>>>>> of tcpv6_prot.
>>>>
>>>> ...
>>>>
>>>>> +	/* ULPs are currently supported only for TCP sockets in ESTABLISHED
>>>>> +	 * state. Supporting sockets in LISTEN state will require us to
>>>>> +	 * modify the accept implementation to clone rather than share the
>>>>> +	 * ulp context.
>>>>> +	 */
>>>>> +	if (sock->sk_state != TCP_ESTABLISHED)
>>>>> +		return -ENOTSUPP;
>>>>> +
>>>>>  	/* 1. If sock map has BPF programs those will be inherited by the
>>>>>  	 * sock being added. If the sock is already attached to BPF programs
>>>>>  	 * this results in an error.
>>>>
>>>> Next question will be then : What happens if syzbot uses tcp_disconnect() and then listen() ?
>>>
>>> Yep we need to fix that as well :( Looks like we can plumb the
>>> unhash callback and remove it from the sockmap when the socket
>>> goes through tcp_disconnect().
>>>
>>> This patch should go in as-is though and we can fix the disconnect
>>> issue with a new patch.
>>>
>>> Adding Dave Watson to the thread as well because I'm guessing
>>> the disconnect() case is also applicable to TLS. At least I see
>>> a hw handler for unhash but there does not appear to be a handler
>>> in the SW case, at least from a quick glance.
>>>
>>> Thanks again!
>>
>> Given the discussion wasn't resolved and the fixes weren't ready in time for 4.17,
>> and last bpf pr for it went out last week, we need to route this via -stable
>> once all is hashed out.
> 
> OK.
> 
>> This fix here therefore needs to be rebased against bpf-next tree, and as far
>> as I can see another fix for hash map is also needed to address the same issue.
>>
> 
> This fix works for both sockmap and sockhash because they use the same
> ulp register and init paths. But, will rebase for net-next and send out
> this morning.

Ok, right, because in bpf-next this eventually goes into __sock_map_ctx_update_elem()
instead of sock_map_ctx_update_elem() call site.

>> After that, likely also fixes for the disconnect + listen case are needed.
> 
> Yep will have a fix today for this.
> 
>> (I can use the one here later on for -stable backport, but given merge window
>> is open this needs a rebase and a resolution for hash map.)
> 
> hash map is also resolved with the same patch but please do queue this
> up for -stable.

Will do, thanks!


* Re: [PATCH rdma-next v3 00/14] Verbs flow counters support
From: Jason Gunthorpe @ 2018-06-04 14:53 UTC
  To: Leon Romanovsky
  Cc: Doug Ledford, RDMA mailing list, Boris Pismenny, Matan Barak,
	Michael J . Ruhl, Or Gerlitz, Raed Salem, Yishai Hadas,
	Saeed Mahameed, linux-netdev
In-Reply-To: <20180602050419.GH2843@mtr-leonro.mtl.com>

On Sat, Jun 02, 2018 at 08:04:19AM +0300, Leon Romanovsky wrote:
> On Fri, Jun 01, 2018 at 03:11:49PM -0600, Jason Gunthorpe wrote:
> > On Thu, May 31, 2018 at 04:43:27PM +0300, Leon Romanovsky wrote:
> > > From: Leon Romanovsky <leonro@mellanox.com>
> > >
> > > Changelog:
> > > v2->v3:
> > >  * Change function mlx5_fc_query signature to hide the details of
> > >    internal core driver struct mlx5_fc
> > >  * Add comment to data[] field at struct mlx5_ib_flow_counters_data (mlx5-abi.h)
> > >  * Use array of struct mlx5_ib_flow_counters_desc to clarify the output
> > > v1->v2:
> > >  * Removed conversion from struct mlx5_fc* to void*
> > >  * Fixed one place with double space in it
> > >  * Balanced release of hardware handler in case of counters allocation failure
> > >  * Added Tested-by
> > >  * Minimize time spent holding mutex lock
> > >  * Fixed deadlock caused by nested lock in error path
> > >  * Protect from handler pointer dereference in the error paths
> >
> > Okay,
> >
> > Acked-by: Jason Gunthorpe <jgg@mellanox.com>
> >
> > I've revised some of the commit messages, fixed the two bad
> > check-patch warnings, and fixed the patch ordering..
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=wip/jgg-counters
> >
> > Please send a PR with the mlx-core bits and above commits.
> 
> Hi,
> 
> I applied two mlx5-next commits to the relevant tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/commit/?h=mlx5-next&id=930821e39d0a5f91ed58fea1692afe04f0fe0e1f
> https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/commit/?h=mlx5-next&id=5f9bf63ae80c4d0e5e986b6c1280bf8174978545
> 
> In the first commit, I dropped the words "as used to be", per Saeed's request.
> 
> The proper signed tag for the whole series is: verbs_flow_counters
> git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git tags/verbs_flow_counters

Okay pulled, thanks

Jason


* RE: [PATCH net-next] qed: use dma_zalloc_coherent instead of allocator/memset
From: Tayar, Tomer @ 2018-06-04 14:53 UTC
  To: YueHaibing, davem@davemloft.net, Elior, Ariel
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Dept-Eng Everest Linux L2
In-Reply-To: <20180604131031.24476-1-yuehaibing@huawei.com>

From: YueHaibing [mailto:yuehaibing@huawei.com]
Sent: Monday, June 04, 2018 4:11 PM

> Use dma_zalloc_coherent instead of dma_alloc_coherent
> followed by memset 0.
> 
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Thanks
Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com>


* Re: [PATCH 2/2 net] team: use netdev_features_t instead of u32
From: Jiri Pirko @ 2018-06-04 14:49 UTC
  To: Dan Carpenter; +Cc: David S. Miller, netdev, kernel-janitors
In-Reply-To: <20180604144601.wo5iqb5irfes4frv@kili.mountain>

Mon, Jun 04, 2018 at 04:46:01PM CEST, dan.carpenter@oracle.com wrote:
>This code was introduced in 2011 around the same time that we made
>netdev_features_t a u64 type.  These days a u32 is not big enough to
>hold all the potential features.
>
>Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>


* [PATCH 2/2 net] team: use netdev_features_t instead of u32
From: Dan Carpenter @ 2018-06-04 14:46 UTC
  To: Jiri Pirko; +Cc: David S. Miller, netdev, kernel-janitors
In-Reply-To: <20180604.093147.1707102168081704551.davem@davemloft.net>

This code was introduced in 2011 around the same time that we made
netdev_features_t a u64 type.  These days a u32 is not big enough to
hold all the potential features.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 267dcc929f6c..8863fa023500 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1004,7 +1004,8 @@ static void team_port_disable(struct team *team,
 static void __team_compute_features(struct team *team)
 {
 	struct team_port *port;
-	u32 vlan_features = TEAM_VLAN_FEATURES & NETIF_F_ALL_FOR_ALL;
+	netdev_features_t vlan_features = TEAM_VLAN_FEATURES &
+					  NETIF_F_ALL_FOR_ALL;
 	netdev_features_t enc_features  = TEAM_ENC_FEATURES;
 	unsigned short max_hard_header_len = ETH_HLEN;
 	unsigned int dst_release_flag = IFF_XMIT_DST_RELEASE |


* [PATCH 1/2 v2 net-next] net_failover: Use netdev_features_t instead of u32
From: Dan Carpenter @ 2018-06-04 14:43 UTC
  To: David S. Miller, Sridhar Samudrala; +Cc: netdev, kernel-janitors
In-Reply-To: <20180604.093147.1707102168081704551.davem@davemloft.net>

The features mask needs to be a netdev_features_t (u64) because a u32
is not big enough.

Fixes: cfc80d9a1163 ("net: Introduce net_failover driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
---
v2: In the original patch, I thought that the & should be | and I
    introduced a bug.

diff --git a/drivers/net/net_failover.c b/drivers/net/net_failover.c
index 8b508e2cf29b..83f7420ddea5 100644
--- a/drivers/net/net_failover.c
+++ b/drivers/net/net_failover.c
@@ -380,7 +380,8 @@ static rx_handler_result_t net_failover_handle_frame(struct sk_buff **pskb)
 
 static void net_failover_compute_features(struct net_device *dev)
 {
-	u32 vlan_features = FAILOVER_VLAN_FEATURES & NETIF_F_ALL_FOR_ALL;
+	netdev_features_t vlan_features = FAILOVER_VLAN_FEATURES &
+					  NETIF_F_ALL_FOR_ALL;
 	netdev_features_t enc_features  = FAILOVER_ENC_FEATURES;
 	unsigned short max_hard_header_len = ETH_HLEN;
 	unsigned int dst_release_flag = IFF_XMIT_DST_RELEASE |


* Re: [PATCH net-next] MAINTAINERS: TCP gets its first maintainer
From: David Miller @ 2018-06-04 14:32 UTC
  To: edumazet; +Cc: netdev, eric.dumazet
In-Reply-To: <20180604135029.241753-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Mon,  4 Jun 2018 06:50:29 -0700

> Signed-off-by: Eric Dumazet <edumazet@google.com>

Thanks a lot Eric, applied to net-next. :-)


* Re: [PATCH net-next] MAINTAINERS: TCP gets its first maintainer
From: Jiri Pirko @ 2018-06-04 14:31 UTC
  To: Eric Dumazet; +Cc: David S . Miller, netdev, Eric Dumazet
In-Reply-To: <20180604135029.241753-1-edumazet@google.com>

Mon, Jun 04, 2018 at 03:50:29PM CEST, edumazet@google.com wrote:
>Signed-off-by: Eric Dumazet <edumazet@google.com>
>---
> MAINTAINERS | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
>diff --git a/MAINTAINERS b/MAINTAINERS
>index 0ae0dbf0e15e74febca1b3469098a08704b59594..70d61c2b1be46c0927ae6648c644b8c7828cce48 100644
>--- a/MAINTAINERS
>+++ b/MAINTAINERS
>@@ -9862,6 +9862,19 @@ F:	net/ipv6/calipso.c
> F:	net/netfilter/xt_CONNSECMARK.c
> F:	net/netfilter/xt_SECMARK.c
> 
>+NETWORKING [TCP]
>+M:	Eric Dumazet <edumazet@google.com>

May the Force be with you...


>+L:	netdev@vger.kernel.org
>+S:	Maintained
>+F:	net/ipv4/tcp*.c
>+F:	net/ipv4/syncookies.c
>+F:	net/ipv6/tcp*.c
>+F:	net/ipv6/syncookies.c
>+F:	include/uapi/linux/tcp.h
>+F:	include/net/tcp.h
>+F:	include/linux/tcp.h
>+F:	include/trace/events/tcp.h
>+
> NETWORKING [TLS]
> M:	Boris Pismenny <borisp@mellanox.com>
> M:	Aviad Yehezkel <aviadye@mellanox.com>
>-- 
>2.17.1.1185.g55be947832-goog
>


* Re: [PATCH] net: ethernet: mlx4: Remove unnecessary parentheses
From: David Miller @ 2018-06-04 14:15 UTC
  To: rvarsha016
  Cc: tariqt, der.herr, lukas.bulwahn, netdev, linux-rdma, linux-kernel
In-Reply-To: <20180601020049.3704-1-rvarsha016@gmail.com>

From: Varsha Rao <rvarsha016@gmail.com>
Date: Fri,  1 Jun 2018 07:30:49 +0530

> This patch fixes the clang warning of extraneous parentheses, with the
> following coccinelle script.
 ...
> Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
> Signed-off-by: Varsha Rao <rvarsha016@gmail.com>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [PATCH v3 net-next] net: stmmac: Add Flexible PPS support
From: David Miller @ 2018-06-04 14:13 UTC (permalink / raw)
  To: Jose.Abreu
  Cc: netdev, Joao.Pinto, Vitor.Soares, peppe.cavallaro,
	alexandre.torgue, richardcochran
In-Reply-To: <6f0f69081c8352845da413f2737f313d7904d3ee.1527785912.git.joabreu@synopsys.com>

From: Jose Abreu <Jose.Abreu@synopsys.com>
Date: Thu, 31 May 2018 18:01:27 +0100

> This adds support for Flexible PPS output (which is equivalent
> to per_out output of PTP subsystem).
> 
> Tested using an oscilloscope and the following commands:
> 
> 1) Start PTP4L:
> 	# ptp4l -A -4 -H -m -i eth0 &
> 2) Set Flexible PPS frequency:
> 	# echo <idx> <ts> <tns> <ps> <pns> > /sys/class/ptp/ptpX/period
> 
> Where, ts/tns is start time and ps/pns is period time, and ptpX is ptp
> of eth0.
> 
> Signed-off-by: Jose Abreu <joabreu@synopsys.com>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Joao Pinto <jpinto@synopsys.com>
> Cc: Vitor Soares <soares@synopsys.com>
> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
> Cc: Alexandre Torgue <alexandre.torgue@st.com>
> Cc: Richard Cochran <richardcochran@gmail.com>
> ---
> Changes from v2:
> 	- Remove PPS support as we can't input the event to PTP
> 	subsystem
> Changes from v1:
> 	- Correct kbuild errors in some archs

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next v2 0/2] qed: Fix issues in UFP feature commit 'cac6f691'.
From: David Miller @ 2018-06-04 14:11 UTC (permalink / raw)
  To: sudarsana.kalluru; +Cc: netdev, Ariel.Elior, Michal.Kalderon
In-Reply-To: <20180601014737.6164-1-sudarsana.kalluru@cavium.com>

From: Sudarsana Reddy Kalluru <sudarsana.kalluru@cavium.com>
Date: Thu, 31 May 2018 18:47:35 -0700

> From: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
> 
> This patch series fixes couple of issues in the UFP feature commit,
>    cac6f691: Add support for Unified Fabric Port.
> 
> Changes from previous version:
> ------------------------------
> v2: Added "Fixes:" tag.
> 
> Please consider applying it to "net-next".

Series applied, thank you.

^ permalink raw reply

* Re: [PATCH net-next 0/2] selftests: forwarding: mirror_vlan: Fixlets
From: David Miller @ 2018-06-04 14:09 UTC (permalink / raw)
  To: petrm; +Cc: netdev, linux-kselftest, shuah, idosch
In-Reply-To: <cover.1527805500.git.petrm@mellanox.com>

From: Petr Machata <petrm@mellanox.com>
Date: Fri, 01 Jun 2018 00:37:29 +0200

> This patchset includes two small fixes for the tests that were
> introduced in commit 1bb58d2d3cbe ("Merge branch
> 'Mirroring-tests-involving-VLAN'").
> 
> In patch #1, a "tc action trap" is uninstalled after the suite runs,
> instead of being installed again.
> 
> In patch #2, a test in suite is renamed to differentiate it from another
> test of the same name.

Series applied, thank you.

^ permalink raw reply

* Re: [bpf PATCH v2] bpf: sockmap, fix crash when ipv6 sock is added
From: John Fastabend @ 2018-06-04 13:57 UTC (permalink / raw)
  To: Daniel Borkmann, Eric Dumazet, edumazet, ast, Dave Watson; +Cc: netdev
In-Reply-To: <c9eab906-6793-1e98-b9d8-01d665ac1c3c@iogearbox.net>

On 06/04/2018 06:39 AM, Daniel Borkmann wrote:
> Hey guys,
> 
> On 06/02/2018 11:39 PM, John Fastabend wrote:
>> On 06/01/2018 12:58 PM, Eric Dumazet wrote:
>>> On 06/01/2018 03:46 PM, John Fastabend wrote:
>>>> This fixes a crash where we assign tcp_prot to IPv6 sockets instead
>>>> of tcpv6_prot.
>>>
>>> ...
>>>
>>>> +	/* ULPs are currently supported only for TCP sockets in ESTABLISHED
>>>> +	 * state. Supporting sockets in LISTEN state will require us to
>>>> +	 * modify the accept implementation to clone rather then share the
>>>> +	 * ulp context.
>>>> +	 */
>>>> +	if (sock->sk_state != TCP_ESTABLISHED)
>>>> +		return -ENOTSUPP;
>>>> +
>>>>  	/* 1. If sock map has BPF programs those will be inherited by the
>>>>  	 * sock being added. If the sock is already attached to BPF programs
>>>>  	 * this results in an error.
>>>
>>> Next question will be then : What happens if syzbot uses tcp_disconnect() and then listen() ?
>>
>> Yep we need to fix that as well :( Looks like we can plumb the
>> unhash callback and remove it from the sockmap when the socket
>> goes through tcp_disconnect().
>>
>> This patch should go in as-is though and we can fix the disconnect
>> issue with a new patch.
>>
>> Adding Dave Watson to the thread as well because I'm guessing
>> the disconnect() case is also applicable to TLS. At least I see
>> a hw handler for unhash but there does not appear to be a handler
>> in the SW case, at least from a quick glance.
>>
>> Thanks again!
> 
> Given the discussion and fixes weren't resolved resp. ready in time for 4.17,
> and last bpf pr for it went out last week, we need to route this via -stable
> once all is hashed out.
> 

OK.

> This fix here therefore needs to be rebased against bpf-next tree, and as far
> as I can see another fix for hash map is also needed to address the same issue.
> 

This fix works for both sockmap and sockhash because they use the same
ulp register and init paths. But I will rebase for net-next and send it
out this morning.

> After that, likely also fixes for the disconnect + listen case are needed.
> 

Yep will have a fix today for this.

> (I can use the one here later on for -stable backport, but given merge window
> is open this needs a rebase and a resolution for hash map.)
> 

hash map is also resolved with the same patch but please do queue this
up for -stable.


> Thanks,
> Daniel
> 

^ permalink raw reply

* Re: [bpf-next V2 PATCH 3/8] ixgbe: implement flush flag for ndo_xdp_xmit
From: Jesper Dangaard Brouer @ 2018-06-04 13:53 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: netdev, Daniel Borkmann, Alexei Starovoitov, liu.song.a23,
	songliubraving, John Fastabend, brouer
In-Reply-To: <156d6d45-8557-0303-edeb-10d04c2be474@iogearbox.net>

On Mon, 4 Jun 2018 15:19:05 +0200
Daniel Borkmann <daniel@iogearbox.net> wrote:

> > +static void ixgbe_xdp_ring_update_tail(struct ixgbe_ring *ring)
> > +{
> > +	/* Force memory writes to complete before letting h/w know there
> > +	 * are new descriptors to fetch.
> > +	 */
> > +	wmb();
> > +	writel(ring->next_to_use, ring->tail);
> > +}  
> 
> Did you double check that this doesn't become a function call? Should this
> get an __always_inline attribute?

I did check this doesn't become a function call.  The same kind of code
appears in other places in the driver, but I chose not to generalize
this, exactly to avoid this becoming a function call ;-)
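
For readers unfamiliar with the pattern in the quoted helper, here is a
rough userspace analogue, offered only as a sketch.  It assumes C11
atomics stand in for the kernel's wmb() and writel(); struct demo_ring
and demo_ring_update_tail() are hypothetical names, and a real driver
writes to an MMIO register, which this does not model.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a TX ring; not the ixgbe structure. */
struct demo_ring {
	uint32_t desc[4];        /* descriptors filled in by the producer */
	uint16_t next_to_use;    /* index of the next free descriptor */
	_Atomic uint16_t tail;   /* stands in for the hardware tail register */
};

/*
 * Publish the new tail only after earlier descriptor writes are ordered
 * before it.  "static inline" is a hint to the compiler; whether the
 * call is actually inlined still has to be checked in the generated code.
 */
static inline void demo_ring_update_tail(struct demo_ring *ring)
{
	/* Userspace analogue of wmb(): order prior stores before the publish. */
	atomic_thread_fence(memory_order_release);
	/* Userspace analogue of writel(): make the new tail visible. */
	atomic_store_explicit(&ring->tail, ring->next_to_use,
			      memory_order_relaxed);
}
```

A caller would fill ring->desc[], advance ring->next_to_use, and then
call demo_ring_update_tail(ring) once per batch.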

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH net-next 0/3] selftests/net: various
From: David Miller @ 2018-06-04 13:50 UTC (permalink / raw)
  To: willemdebruijn.kernel; +Cc: netdev, willemb
In-Reply-To: <20180531161440.89709-1-willemdebruijn.kernel@gmail.com>

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Thu, 31 May 2018 12:14:37 -0400

> From: Willem de Bruijn <willemb@google.com>
> 
> A few odds and ends to network tests:
> 
> - msg_zerocopy: run as part of kselftest
> - udp gso:      add missing bounds test for minimal sizes
> - psocket_snd:  initial basic conformance test

Always great to see new tests.

Series applied, thanks Willem.

^ permalink raw reply

* [PATCH net-next] MAINTAINERS: TCP gets its first maintainer
From: Eric Dumazet @ 2018-06-04 13:50 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 MAINTAINERS | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 0ae0dbf0e15e74febca1b3469098a08704b59594..70d61c2b1be46c0927ae6648c644b8c7828cce48 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9862,6 +9862,19 @@ F:	net/ipv6/calipso.c
 F:	net/netfilter/xt_CONNSECMARK.c
 F:	net/netfilter/xt_SECMARK.c
 
+NETWORKING [TCP]
+M:	Eric Dumazet <edumazet@google.com>
+L:	netdev@vger.kernel.org
+S:	Maintained
+F:	net/ipv4/tcp*.c
+F:	net/ipv4/syncookies.c
+F:	net/ipv6/tcp*.c
+F:	net/ipv6/syncookies.c
+F:	include/uapi/linux/tcp.h
+F:	include/net/tcp.h
+F:	include/linux/tcp.h
+F:	include/trace/events/tcp.h
+
 NETWORKING [TLS]
 M:	Boris Pismenny <borisp@mellanox.com>
 M:	Aviad Yehezkel <aviadye@mellanox.com>
-- 
2.17.1.1185.g55be947832-goog

^ permalink raw reply related

* Re: [PATCH 09/10] dpaa_eth: add support for hardware timestamping
From: Richard Cochran @ 2018-06-04 13:49 UTC (permalink / raw)
  To: Yangbo Lu
  Cc: netdev, madalin.bucur, Rob Herring, Shawn Guo, David S . Miller,
	devicetree, linuxppc-dev, linux-arm-kernel, linux-kernel
In-Reply-To: <20180604070837.19265-10-yangbo.lu@nxp.com>

On Mon, Jun 04, 2018 at 03:08:36PM +0800, Yangbo Lu wrote:

> +if FSL_DPAA_ETH
> +config FSL_DPAA_ETH_TS
> +	bool "DPAA hardware timestamping support"
> +	select PTP_1588_CLOCK_QORIQ
> +	default n
> +	help
> +	  Enable DPAA hardware timestamping support.
> +	  This option is useful for applications to get
> +	  hardware time stamps on the Ethernet packets
> +	  using the SO_TIMESTAMPING API.
> +endif

You should drop this #ifdef.  In general, if a MAC supports time
stamping and PHC, then the driver support should simply be compiled
in.

[ When time stamping incurs a large run time performance penalty to
  non-PTP users, then it might make sense to have a Kconfig option to
  disable it, but that doesn't appear to be the case here. ]

> @@ -1615,6 +1635,24 @@ static int dpaa_eth_refill_bpools(struct dpaa_priv *priv)
>  	skbh = (struct sk_buff **)phys_to_virt(addr);
>  	skb = *skbh;
>  
> +#ifdef CONFIG_FSL_DPAA_ETH_TS
> +	if (priv->tx_tstamp &&
> +	    skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) {

This condition fits on one line easily.

> +		struct skb_shared_hwtstamps shhwtstamps;
> +		u64 ns;

Local variables belong at the top of the function.

> +		memset(&shhwtstamps, 0, sizeof(shhwtstamps));
> +
> +		if (!dpaa_get_tstamp_ns(priv->net_dev, &ns,
> +					priv->mac_dev->port[TX],
> +					(void *)skbh)) {
> +			shhwtstamps.hwtstamp = ns_to_ktime(ns);
> +			skb_tstamp_tx(skb, &shhwtstamps);
> +		} else {
> +			dev_warn(dev, "dpaa_get_tstamp_ns failed!\n");
> +		}
> +	}
> +#endif
>  	if (unlikely(qm_fd_get_format(fd) == qm_fd_sg)) {
>  		nr_frags = skb_shinfo(skb)->nr_frags;
>  		dma_unmap_single(dev, addr, qm_fd_get_offset(fd) +
> @@ -2086,6 +2124,14 @@ static int dpaa_start_xmit(struct sk_buff *skb, struct net_device *net_dev)
>  	if (unlikely(err < 0))
>  		goto skb_to_fd_failed;
>  
> +#ifdef CONFIG_FSL_DPAA_ETH_TS
> +	if (priv->tx_tstamp &&
> +	    skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) {

One line please.

> +		fd.cmd |= FM_FD_CMD_UPD;
> +		skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
> +	}
> +#endif
> +
>  	if (likely(dpaa_xmit(priv, percpu_stats, queue_mapping, &fd) == 0))
>  		return NETDEV_TX_OK;
>  

Thanks,
Richard

^ permalink raw reply

* Re: [bpf PATCH v2] bpf: sockmap, fix crash when ipv6 sock is added
From: Daniel Borkmann @ 2018-06-04 13:39 UTC (permalink / raw)
  To: John Fastabend, Eric Dumazet, edumazet, ast, Dave Watson; +Cc: netdev
In-Reply-To: <81abd5f7-5343-a27a-6715-8b413f6c5a27@gmail.com>

Hey guys,

On 06/02/2018 11:39 PM, John Fastabend wrote:
> On 06/01/2018 12:58 PM, Eric Dumazet wrote:
>> On 06/01/2018 03:46 PM, John Fastabend wrote:
>>> This fixes a crash where we assign tcp_prot to IPv6 sockets instead
>>> of tcpv6_prot.
>>
>> ...
>>
>>> +	/* ULPs are currently supported only for TCP sockets in ESTABLISHED
>>> +	 * state. Supporting sockets in LISTEN state will require us to
>>> +	 * modify the accept implementation to clone rather then share the
>>> +	 * ulp context.
>>> +	 */
>>> +	if (sock->sk_state != TCP_ESTABLISHED)
>>> +		return -ENOTSUPP;
>>> +
>>>  	/* 1. If sock map has BPF programs those will be inherited by the
>>>  	 * sock being added. If the sock is already attached to BPF programs
>>>  	 * this results in an error.
>>
>> Next question will be then : What happens if syzbot uses tcp_disconnect() and then listen() ?
> 
> Yep we need to fix that as well :( Looks like we can plumb the
> unhash callback and remove it from the sockmap when the socket
> goes through tcp_disconnect().
> 
> This patch should go in as-is though and we can fix the disconnect
> issue with a new patch.
> 
> Adding Dave Watson to the thread as well because I'm guessing
> the disconnect() case is also applicable to TLS. At least I see
> a hw handler for unhash but there does not appear to be a handler
> in the SW case, at least from a quick glance.
> 
> Thanks again!

Given the discussion and fixes weren't resolved resp. ready in time for 4.17,
and last bpf pr for it went out last week, we need to route this via -stable
once all is hashed out.

This fix here therefore needs to be rebased against bpf-next tree, and as far
as I can see another fix for hash map is also needed to address the same issue.

After that, likely also fixes for the disconnect + listen case are needed.

(I can use the one here later on for -stable backport, but given merge window
is open this needs a rebase and a resolution for hash map.)

Thanks,
Daniel

^ permalink raw reply

* Re: [PATCH] net: virtio: simplify the virtnet_find_vqs
From: David Miller @ 2018-06-04 13:33 UTC (permalink / raw)
  To: xiangxia.m.yue; +Cc: netdev
In-Reply-To: <1527776192-26928-1-git-send-email-xiangxia.m.yue@gmail.com>

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Date: Thu, 31 May 2018 07:16:32 -0700

> Use the common free functions while return successfully.
> 
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>

This looks fine, applied, thanks.

^ permalink raw reply

* Re: [PATCH 1/2 net-next] net_failover: fix net_failover_compute_features()
From: David Miller @ 2018-06-04 13:31 UTC (permalink / raw)
  To: dan.carpenter; +Cc: sridhar.samudrala, netdev, kernel-janitors
In-Reply-To: <20180531120124.pc4txiifxnrslbei@kili.mountain>

From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Thu, 31 May 2018 15:01:25 +0300

> @@ -380,7 +380,8 @@ static rx_handler_result_t net_failover_handle_frame(struct sk_buff **pskb)
>  
>  static void net_failover_compute_features(struct net_device *dev)
>  {
> -	u32 vlan_features = FAILOVER_VLAN_FEATURES & NETIF_F_ALL_FOR_ALL;
> +	netdev_features_t vlan_features = FAILOVER_VLAN_FEATURES |
> +					  NETIF_F_ALL_FOR_ALL;

The type does need to be corrected to netdev_features_t, but the
logical operation is correct.

It's a policy operation that was simply by-hand propagated all
over the place where these kinds of calculations are performed.

So vlan_features is starting with a value of 0 intentionally.
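
To illustrate the point above, a minimal userspace sketch.  The bit
values are made up for the example and are NOT the real NETIF_F_*
values; features_t and the DEMO_* names are hypothetical.  It only
shows that when two masks share no bits, as the reply implies for these
two, '&' gives 0 while '|' gives their union, so changing the operator
changes the starting value.

```c
#include <stdint.h>

/* Hypothetical stand-in for netdev_features_t (a 64-bit mask). */
typedef uint64_t features_t;

/* Made-up, deliberately disjoint masks; not the kernel's definitions. */
#define DEMO_VLAN_FEATURES	((features_t)0x0fULL)
#define DEMO_ALL_FOR_ALL	((features_t)0x30ULL)

/* Original form: the intersection of disjoint masks starts at 0. */
static features_t demo_start_with_and(void)
{
	return DEMO_VLAN_FEATURES & DEMO_ALL_FOR_ALL;
}

/* Proposed form: the union starts with every bit from both masks set. */
static features_t demo_start_with_or(void)
{
	return DEMO_VLAN_FEATURES | DEMO_ALL_FOR_ALL;
}
```

With these made-up masks, demo_start_with_and() returns 0 and
demo_start_with_or() returns 0x3f.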

^ permalink raw reply
