* [PATCH 2/2] xfrm: Fix return value check of copy_sec_ctx.
From: Steffen Klassert @ 2017-09-01 7:30 UTC (permalink / raw)
To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <1504251046-17106-1-git-send-email-steffen.klassert@secunet.com>
A recent commit added an output_mark. When copying
this output_mark, the return value of copy_sec_ctx
is overwritten without a check. Fix this by copying
the output_mark before the security context.
Fixes: 077fbac405bf ("net: xfrm: support setting an output mark.")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
net/xfrm/xfrm_user.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index cc3268d..490132d 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -900,13 +900,13 @@ static int copy_to_user_state_extra(struct xfrm_state *x,
 		ret = copy_user_offload(&x->xso, skb);
 	if (ret)
 		goto out;
-	if (x->security)
-		ret = copy_sec_ctx(x->security, skb);
 	if (x->props.output_mark) {
 		ret = nla_put_u32(skb, XFRMA_OUTPUT_MARK, x->props.output_mark);
 		if (ret)
 			goto out;
 	}
+	if (x->security)
+		ret = copy_sec_ctx(x->security, skb);
 out:
 	return ret;
 }
--
2.7.4
* pull request (net-next): ipsec-next 2017-09-01
From: Steffen Klassert @ 2017-09-01 7:30 UTC (permalink / raw)
To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
This should be the last ipsec-next pull request for this
release cycle:
1) Support netdevice ESP trailer removal when decryption
is offloaded. From Yossi Kuperman.
2) Fix overwritten return value of copy_sec_ctx().
Please pull or let me know if there are problems.
Thanks!
The following changes since commit acfb98b99647aa7dc7c111db52d5f4199d2b641f:
liquidio: fix crash in presence of zeroed-out base address regs (2017-08-30 22:07:09 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next.git master
for you to fetch changes up to 8598112d04af21cf6c895670e72dcb8a9f58e74f:
xfrm: Fix return value check of copy_sec_ctx. (2017-08-31 10:37:00 +0200)
----------------------------------------------------------------
Steffen Klassert (1):
xfrm: Fix return value check of copy_sec_ctx.
Yossi Kuperman (1):
xfrm: Add support for network devices capable of removing the ESP trailer
include/net/xfrm.h | 1 +
net/ipv4/esp4.c | 70 ++++++++++++++++++++++++++++++++++-----------------
net/ipv6/esp6.c | 51 ++++++++++++++++++++++++++-----------
net/xfrm/xfrm_input.c | 5 ++++
net/xfrm/xfrm_user.c | 4 +--
5 files changed, 91 insertions(+), 40 deletions(-)
* Re: [RFC PATCH] net: frag limit checks need to use percpu_counter_compare
From: Jesper Dangaard Brouer @ 2017-09-01 7:16 UTC (permalink / raw)
To: liujian (CE)
Cc: Michal Kubecek, netdev@vger.kernel.org, Florian Westphal, brouer
In-Reply-To: <4F88C5DDA1E80143B232E89585ACE27D018F6A9B@DGGEMA502-MBX.china.huawei.com>
On Fri, 1 Sep 2017 02:25:32 +0000 "liujian (CE)" <liujian56@huawei.com> wrote:
> > -----Original Message-----
> > From: Michal Kubecek [mailto:mkubecek@suse.cz]
> > Sent: Friday, September 01, 2017 12:24 AM
> > To: Jesper Dangaard Brouer
> > Cc: liujian (CE); netdev@vger.kernel.org; Florian Westphal
> > Subject: Re: [RFC PATCH] net: frag limit checks need to use
> > percpu_counter_compare
> >
> > On Thu, Aug 31, 2017 at 12:20:19PM +0200, Jesper Dangaard Brouer wrote:
> > > To: Liujian can you please test this patch?
> > > I want to understand if using __percpu_counter_compare() solves the
> > > problem correctness wise (even-though this will be slower than using
> > > a simple atomic_t on your big system).
>
> I have tested the patch, and it works.
Thanks for confirming this.
> 1. make sure frag_mem_limit reaches thresh
> ===>FRAG: inuse 0 memory 0 frag_mem_limit 5386864
> 2. change NIC rx irq's affinity to a fixed CPU
If you pin the NIC RX queue to a single CPU, then the error issue
basically cannot happen. Different CPUs need to have a chance to "own"
part of the percpu_counter. I guess a default setup with irqbalance
could eventually screw up the percpu_counter enough given enough CPUs,
or a network load with enough different L2-headers to hit different RX
queues.
> 3. iperf -u -c 9.83.1.41 -l 10000 -i 1 -t 1000 -P 10 -b 20M
> And check /proc/net/snmp, there are no ReasmFails.
My quick check command is:
nstat > /dev/null && sleep 1 && nstat && grep FRAG /proc/net/sockstat
> And I think adding some counter sync points, as you said,
> is a better way.
I've discussed this offlist with Florian. While it is doable, we would
be adding too much complexity for something that can be solved much
more simply with an atomic_t (as before my patch). Thus, I'm now looking
at reverting my original change (commit 6d7b857d541e ("net: use
lib/percpu_counter API for fragmentation mem accounting")).
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
* Re: tip -ENOBOOT - bisected to locking/refcounts, x86/asm: Implement fast refcount overflow protection
From: Mike Galbraith @ 2017-09-01 6:57 UTC (permalink / raw)
To: Kees Cook
Cc: David S. Miller, Peter Zijlstra, LKML, Ingo Molnar,
Reshetova, Elena, Network Development
In-Reply-To: <CAGXu5j+RPDAP-dK+dizQV4prmWBhqU_G1PccWpME=924-2985w@mail.gmail.com>
On Thu, 2017-08-31 at 11:45 -0700, Kees Cook wrote:
> On Thu, Aug 31, 2017 at 10:19 AM, Mike Galbraith <efault@gmx.de> wrote:
> > On Thu, 2017-08-31 at 10:00 -0700, Kees Cook wrote:
> >>
> >> Oh! So it's gcc-version sensitive? That's alarming. Is this mapping correct:
> >>
> >> 4.8.5: WARN, eventual kernel hang
> >> 6.3.1, 7.0.1: WARN, but continues working
> >
> > Yeah, that's correct. I find that troubling, simply because this gcc
> > version has been through one hell of a lot of kernels with me. Yeah, I
> > know, that doesn't exempt it from having bugs, but color me suspicious.
>
> I still can't hit this with a 4.8.5 build. :(
>
> With _RATELIMIT removed, this should, in theory, report whatever goes
> negative first...
I applied the other patch you posted, and built with gcc-6.3.1 to
remove the gcc-4.8.5 aspect. See the resulting splat below.
[ 1.293962] NET: Registered protocol family 10
[ 1.294635] refcount_t silent saturation at in6_dev_get+0x25/0x104 in swapper/0[1], uid/euid: 0/0
[ 1.295616] ------------[ cut here ]------------
[ 1.296120] WARNING: CPU: 0 PID: 1 at kernel/panic.c:612 refcount_error_report+0x94/0x9e
[ 1.296950] Modules linked in:
[ 1.297276] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0.g152d54a-tip-default #53
[ 1.299179] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
[ 1.300743] task: ffff88013ab84040 task.stack: ffffc9000062c000
[ 1.301825] RIP: 0010:refcount_error_report+0x94/0x9e
[ 1.302804] RSP: 0018:ffffc9000062fc10 EFLAGS: 00010282
[ 1.303791] RAX: 0000000000000055 RBX: ffffffff81a34274 RCX: ffffffff81c605e8
[ 1.304991] RDX: 0000000000000001 RSI: 0000000000000096 RDI: 0000000000000246
[ 1.306189] RBP: ffffc9000062fd58 R08: 0000000000000000 R09: 0000000000000175
[ 1.307392] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88013ab84040
[ 1.308583] R13: 0000000000000000 R14: 0000000000000004 R15: ffffffff81a256c8
[ 1.309768] FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
[ 1.311052] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.312100] CR2: 00007f4631fe8df0 CR3: 0000000137d09003 CR4: 00000000001606f0
[ 1.313301] Call Trace:
[ 1.314012] ex_handler_refcount+0x63/0x70
[ 1.314893] fixup_exception+0x32/0x40
[ 1.315737] do_trap+0x8c/0x170
[ 1.316519] do_error_trap+0x70/0xd0
[ 1.317340] ? in6_dev_get+0x23/0x104
[ 1.318172] ? netlink_broadcast_filtered+0x2bd/0x430
[ 1.319156] ? kmem_cache_alloc_trace+0xce/0x5d0
[ 1.320098] ? set_debug_rodata+0x11/0x11
[ 1.320964] invalid_op+0x1e/0x30
[ 1.322520] RIP: 0010:in6_dev_get+0x25/0x104
[ 1.323631] RSP: 0018:ffffc9000062fe00 EFLAGS: 00010202
[ 1.324614] RAX: ffff880137de2400 RBX: ffff880137df4600 RCX: ffff880137de24f0
[ 1.325793] RDX: ffff88013a5e4000 RSI: 00000000fffffe00 RDI: ffff88013a5e4000
[ 1.326964] RBP: 00000000000000d1 R08: 0000000000000000 R09: ffff880137de7600
[ 1.328150] R10: 0000000000000000 R11: ffff8801398a4df8 R12: 0000000000000000
[ 1.329374] R13: ffffffff82137872 R14: 014200ca00000000 R15: 0000000000000000
[ 1.330547] ? set_debug_rodata+0x11/0x11
[ 1.331392] ip6_route_init_special_entries+0x2a/0x89
[ 1.332369] addrconf_init+0x9e/0x203
[ 1.333173] inet6_init+0x1af/0x365
[ 1.333956] ? af_unix_init+0x4e/0x4e
[ 1.334753] do_one_initcall+0x4e/0x190
[ 1.335555] ? set_debug_rodata+0x11/0x11
[ 1.336369] kernel_init_freeable+0x189/0x20e
[ 1.337230] ? rest_init+0xd0/0xd0
[ 1.337999] kernel_init+0xa/0xf7
[ 1.338744] ret_from_fork+0x25/0x30
[ 1.339500] Code: 48 8b 95 80 00 00 00 41 55 49 8d 8c 24 f0 0a 00 00 45 8b 84 24 10 09 00 00 41 89 c1 48 89 de 48 c7 c7 60 7a a3 81 e8 07 de 05 00 <0f> ff 58 5b 5d 41 5c 41 5d c3 0f 1f 44 00 00 55 48 89 e5 41 56
[ 1.342243] ---[ end trace b5d40c0fccce776c ]---
Back yours out, and...
# tracer: nop
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
       swapper/0-1     [000] ...1     1.974114: in6_dev_getx: refs.counter:-1073741824
       swapper/0-1     [000] ...1     1.974116: in6_dev_getx: refs.counter:-1073741824
---
arch/x86/include/asm/refcount.h | 9 +++++++++
include/net/addrconf.h | 12 ++++++++++++
net/ipv6/route.c | 4 ++--
3 files changed, 23 insertions(+), 2 deletions(-)
--- a/arch/x86/include/asm/refcount.h
+++ b/arch/x86/include/asm/refcount.h
@@ -55,6 +55,15 @@ static __always_inline void refcount_inc
 		: : "cc", "cx");
 }
 
+static __always_inline void refcount_inc_x(refcount_t *r)
+{
+	trace_printk("refs.counter:%d\n", r->refs.counter);
+	asm volatile(LOCK_PREFIX "incl %0\n\t"
+		REFCOUNT_CHECK_LT_ZERO
+		: [counter] "+m" (r->refs.counter)
+		: : "cc", "cx");
+}
+
 static __always_inline void refcount_dec(refcount_t *r)
 {
 	asm volatile(LOCK_PREFIX "decl %0\n\t"
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -321,6 +321,18 @@ static inline struct inet6_dev *in6_dev_
 	return idev;
 }
 
+static inline struct inet6_dev *in6_dev_getx(const struct net_device *dev)
+{
+	struct inet6_dev *idev;
+
+	rcu_read_lock();
+	idev = rcu_dereference(dev->ip6_ptr);
+	if (idev)
+		refcount_inc_x(&idev->refcnt);
+	rcu_read_unlock();
+	return idev;
+}
+
 static inline struct neigh_parms *__in6_dev_nd_parms_get_rcu(const struct net_device *dev)
 {
 	struct inet6_dev *idev = __in6_dev_get(dev);
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -4044,9 +4044,9 @@ void __init ip6_route_init_special_entri
 	init_net.ipv6.ip6_null_entry->rt6i_idev = in6_dev_get(init_net.loopback_dev);
 #ifdef CONFIG_IPV6_MULTIPLE_TABLES
 	init_net.ipv6.ip6_prohibit_entry->dst.dev = init_net.loopback_dev;
-	init_net.ipv6.ip6_prohibit_entry->rt6i_idev = in6_dev_get(init_net.loopback_dev);
+	init_net.ipv6.ip6_prohibit_entry->rt6i_idev = in6_dev_getx(init_net.loopback_dev);
 	init_net.ipv6.ip6_blk_hole_entry->dst.dev = init_net.loopback_dev;
-	init_net.ipv6.ip6_blk_hole_entry->rt6i_idev = in6_dev_get(init_net.loopback_dev);
+	init_net.ipv6.ip6_blk_hole_entry->rt6i_idev = in6_dev_getx(init_net.loopback_dev);
 #endif
}
* Re: netdev watchdog enp1s0 (tg3): transmit queue 0 timed out
From: Frans van Berckel @ 2017-09-01 6:43 UTC (permalink / raw)
To: Michael Chan, Siva Reddy Kallam; +Cc: linux-netdev
In-Reply-To: <CACKFLi=TXmPpHt_XRECHS7OhyyEyWVi=jcps5FNpg01Kjw6vUw@mail.gmail.com>
Dear Michael and Siva,
On Thu, 2017-08-31 at 23:36 -0700, Michael Chan wrote:
> On Thu, Aug 31, 2017 at 11:10 PM, Frans van Berckel
> <fberckel@xs4all.nl> wrote:
> >
> > <snip> a long list of likely the same type of error codes.
> >
>
> Please post the entire register dump.
[ 237.169194] tg3 0000:01:00.0 enp1s0: Link is up at 1000 Mbps, full duplex
[ 237.169335] tg3 0000:01:00.0 enp1s0: Flow control is on for TX and on for RX
[ 237.169375] IPv6: ADDRCONF(NETDEV_CHANGE): enp1s0: link becomes ready
[ 243.683910] tg3 0000:01:00.0 enp1s0: DMA Status error. Resetting chip.
[ 243.759610] hrtimer: interrupt took 9464192 ns
[ 245.317566] tg3 0000:01:00.0 enp1s0: 0x00000000: 0x165914e4,
0x00100406, 0x02000021, 0x00000010
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000010: 0xefcf0004,
0x00000000, 0x00000000, 0x00000000
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000020: 0x00000000,
0x00000000, 0x00000000, 0x01eb1028
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000030: 0x00000000,
0x00000048, 0x00000000, 0x00000105
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000040: 0x00000000,
0x00000000, 0xc0025001, 0x64002000
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000050: 0x00fc5803,
0x80818283, 0x0087d005, 0xfee0300c
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000060: 0x00000000,
0x00004122, 0x42010298, 0x76180000
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000070: 0x000010f2,
0x000000a0, 0x00000000, 0x00000000
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000080: 0x165914e4,
0x3c1d0002, 0x04130034, 0x3c085082
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000090: 0x01009509,
0x00000000, 0x00000000, 0x000000ce
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000000a0: 0x00000000,
0x00000006, 0x00000000, 0x0000000c
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000000b0: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000000c0: 0x00000000,
0x00000000, 0x0000000e, 0x00000000
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000000d0: 0x00010010,
0x00000fa0, 0x00122104, 0x00036c11
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000000e0: 0x10110000,
0x00000000, 0x00000000, 0x00000000
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000100: 0x13c10001,
0x00014000, 0x0011c000, 0x000e3010
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000110: 0x00000000,
0x000011c1, 0x000000a0, 0x00000000
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000130: 0x00000000,
0x00000000, 0x00000000, 0x16010002
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000150: 0x800000ff,
0x00000000, 0x00000000, 0x00000000
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000160: 0x16c10003,
0xfe3b2547, 0x001ec9ff, 0x00010004
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000170: 0x00000000,
0x0007810e, 0x00000001, 0x2c0c3c0e
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000180: 0x3f062304,
0x00000000, 0x00000000, 0x00000000
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000200: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000210: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000220: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000230: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000240: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000250: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000260: 0x00000000,
0x00000001, 0x00000000, 0x000000ce
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000270: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000280: 0x00000000,
0x00000006, 0x00000000, 0x00000006
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000290: 0x00000000,
0x00000006, 0x00000000, 0x00000006
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000002a0: 0x00000000,
0x00000006, 0x00000000, 0x00000006
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000002b0: 0x00000000,
0x00000006, 0x00000000, 0x00000006
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000002c0: 0x00000000,
0x00000006, 0x00000000, 0x00000006
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000002d0: 0x00000000,
0x00000006, 0x00000000, 0x00000006
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000002e0: 0x00000000,
0x00000006, 0x00000000, 0x00000006
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000002f0: 0x00000000,
0x00000006, 0x00000000, 0x00000006
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000300: 0x00000000,
0x0000000f, 0x00000000, 0x0000000f
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000310: 0x00000000,
0x0000000f, 0x00000000, 0x0000000f
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000320: 0x00000000,
0x0000000f, 0x00000000, 0x0000000f
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000330: 0x00000000,
0x0000000f, 0x00000000, 0x0000000f
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000340: 0x00000000,
0x0000000f, 0x00000000, 0x0000000f
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000350: 0x00000000,
0x0000000f, 0x00000000, 0x0000000f
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000360: 0x00000000,
0x0000000f, 0x00000000, 0x0000000f
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000370: 0x00000000,
0x0000000f, 0x00000000, 0x0000000f
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000380: 0x00000000,
0x0000000c, 0x00000000, 0x0000000c
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000390: 0x00000000,
0x0000000c, 0x00000000, 0x0000000c
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000003a0: 0x00000000,
0x0000000c, 0x00000000, 0x0000000c
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000003b0: 0x00000000,
0x0000000c, 0x00000000, 0x0000000c
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000003c0: 0x00000000,
0x0000000c, 0x00000000, 0x0000000c
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000003d0: 0x00000000,
0x0000000c, 0x00000000, 0x0000000c
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000003e0: 0x00000000,
0x0000000c, 0x00000000, 0x0000000c
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x000003f0: 0x00000000,
0x0000000c, 0x00000000, 0x0000000c
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000400: 0x00e04808,
0x00400000, 0x00001000, 0x00000880
[ 245.476295] tg3 0000:01:00.0 enp1s0: 0x00000410: 0x0000001e,
0xc93b2547, 0x0000001e, 0xc93b2547
[ 245.502584] tg3 0000:01:00.0 enp1s0: 0x00000420: 0x0000001e,
0xc93b2547, 0x0000001e, 0xc93b2547
[ 245.504134] tg3 0000:01:00.0 enp1s0: 0x00000430: 0x00000000,
0x00000000, 0x000002f8, 0x000005f2
[ 245.505060] tg3 0000:01:00.0 enp1s0: 0x00000440: 0x00000000,
0x00000000, 0x00000000, 0x08310301
[ 245.506073] tg3 0000:01:00.0 enp1s0: 0x00000450: 0x00000001,
0x000c0000, 0x00000000, 0x00000012
[ 245.508062] tg3 0000:01:00.0 enp1s0: 0x00000460: 0x00000008,
0x00002620, 0x00000006, 0x00000000
[ 245.509057] tg3 0000:01:00.0 enp1s0: 0x00000470: 0xa0000080,
0x00000000, 0x00000000, 0x50000000
[ 245.510120] tg3 0000:01:00.0 enp1s0: 0x00000480: 0x42000000,
0x7fffffff, 0x06000004, 0x7fffffff
[ 245.512072] tg3 0000:01:00.0 enp1s0: 0x00000500: 0x00000008,
0x00000002, 0x00000000, 0x00000000
[ 245.513052] tg3 0000:01:00.0 enp1s0: 0x00000590: 0x01e00000,
0x00000000, 0x00000000, 0x00000000
[ 245.514132] tg3 0000:01:00.0 enp1s0: 0x00000600: 0xffffffff,
0x00f80011, 0x00000000, 0x00001f04
[ 245.516108] tg3 0000:01:00.0 enp1s0: 0x00000610: 0xffffffff,
0x00000000, 0x07c00004, 0x36ecf800
[ 245.517074] tg3 0000:01:00.0 enp1s0: 0x00000620: 0x00000040,
0x00000000, 0x00000000, 0x00000000
[ 245.518121] tg3 0000:01:00.0 enp1s0: 0x00000800: 0x00000000,
0xffffffff, 0x00000000, 0x00000000
[ 245.520060] tg3 0000:01:00.0 enp1s0: 0x00000810: 0x00000000,
0xffffffff, 0x00000000, 0x00000000
[ 245.521077] tg3 0000:01:00.0 enp1s0: 0x00000820: 0x00000000,
0x00000000, 0xffffffff, 0x00000000
[ 245.523047] tg3 0000:01:00.0 enp1s0: 0x00000830: 0x00000000,
0xffffffff, 0xffffffff, 0xffffffff
[ 245.524073] tg3 0000:01:00.0 enp1s0: 0x00000840: 0xffffffff,
0xffffffff, 0xffffffff, 0xffffffff
[ 245.525063] tg3 0000:01:00.0 enp1s0: 0x00000850: 0xffffffff,
0xffffffff, 0xffffffff, 0xffffffff
[ 245.527048] tg3 0000:01:00.0 enp1s0: 0x00000860: 0xffffffff,
0xffffffff, 0xffffffff, 0x00000000
[ 245.528073] tg3 0000:01:00.0 enp1s0: 0x00000880: 0x00000040,
0x000001a8, 0x00000000, 0x00000001
[ 245.529070] tg3 0000:01:00.0 enp1s0: 0x000008f0: 0x00000001,
0x00000000, 0x00000000, 0x00000000
[ 245.530174] tg3 0000:01:00.0 enp1s0: 0x00000c00: 0x0000000a,
0x00000000, 0x00000003, 0x00000001
[ 245.532064] tg3 0000:01:00.0 enp1s0: 0x00000c10: 0x00000000,
0x00000000, 0x00000000, 0x004d0000
[ 245.533072] tg3 0000:01:00.0 enp1s0: 0x00000c80: 0x0000000c,
0x00000000, 0x00000000, 0x00000000
[ 245.534079] tg3 0000:01:00.0 enp1s0: 0x00000ce0: 0x790cb002,
0x00000000, 0x0000004d, 0x00041028
[ 245.536073] tg3 0000:01:00.0 enp1s0: 0x00000cf0: 0x00000000,
0x5000000c, 0x00000000, 0x00000000
[ 245.537070] tg3 0000:01:00.0 enp1s0: 0x00001000: 0x00000002,
0x00000000, 0xa0000618, 0x00000000
[ 245.539096] tg3 0000:01:00.0 enp1s0: 0x00001010: 0x000c00c0,
0x00000618, 0x00000000, 0x00000000
[ 245.540065] tg3 0000:01:00.0 enp1s0: 0x00001400: 0x00000006,
0x00000000, 0x00000000, 0x00000000
[ 245.542048] tg3 0000:01:00.0 enp1s0: 0x00001440: 0x0000000c,
0x0000000c, 0x0000000c, 0x0000000c
[ 245.543056] tg3 0000:01:00.0 enp1s0: 0x00001450: 0x0000000c,
0x0000000c, 0x0000000c, 0x0000000c
[ 245.544064] tg3 0000:01:00.0 enp1s0: 0x00001460: 0x0000000c,
0x0000000c, 0x0000000c, 0x0000000c
[ 245.546056] tg3 0000:01:00.0 enp1s0: 0x00001470: 0x0000000c,
0x0000000c, 0x0000000c, 0x0000000c
[ 245.548047] tg3 0000:01:00.0 enp1s0: 0x00001480: 0x00000001,
0x00000000, 0x00000000, 0x00000000
[ 245.549060] tg3 0000:01:00.0 enp1s0: 0x00001800: 0x00000016,
0x00000000, 0x0000000d, 0x00000000
[ 245.551128] tg3 0000:01:00.0 enp1s0: 0x00001840: 0x00000000,
0x00000000, 0x00000710, 0x00000010
[ 245.552065] tg3 0000:01:00.0 enp1s0: 0x00001850: 0x752260d0,
0x00000000, 0x000040d0, 0x000d000e
[ 245.554053] tg3 0000:01:00.0 enp1s0: 0x00001860: 0x01000100,
0x00000000, 0x00000000, 0x00000000
[ 245.555073] tg3 0000:01:00.0 enp1s0: 0x00001c00: 0x00000002,
0x00000000, 0x00000000, 0x00000000
[ 245.557059] tg3 0000:01:00.0 enp1s0: 0x00002000: 0x00000002,
0x00000000, 0x00000000, 0x00000000
[ 245.559055] tg3 0000:01:00.0 enp1s0: 0x00002010: 0x00000181,
0x00000001, 0x00790807, 0x00000000
[ 245.560071] tg3 0000:01:00.0 enp1s0: 0x00002100: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.562058] tg3 0000:01:00.0 enp1s0: 0x00002110: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.564065] tg3 0000:01:00.0 enp1s0: 0x00002120: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.566127] tg3 0000:01:00.0 enp1s0: 0x00002130: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.567066] tg3 0000:01:00.0 enp1s0: 0x00002140: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.569055] tg3 0000:01:00.0 enp1s0: 0x00002150: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.571063] tg3 0000:01:00.0 enp1s0: 0x00002160: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.573128] tg3 0000:01:00.0 enp1s0: 0x00002170: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.574065] tg3 0000:01:00.0 enp1s0: 0x00002180: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.576057] tg3 0000:01:00.0 enp1s0: 0x00002190: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.578063] tg3 0000:01:00.0 enp1s0: 0x000021a0: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.580060] tg3 0000:01:00.0 enp1s0: 0x000021b0: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.582128] tg3 0000:01:00.0 enp1s0: 0x000021c0: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.583057] tg3 0000:01:00.0 enp1s0: 0x000021d0: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.585055] tg3 0000:01:00.0 enp1s0: 0x000021e0: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.587060] tg3 0000:01:00.0 enp1s0: 0x000021f0: 0x000c6c36,
0x000c6c36, 0x00000000, 0x00000000
[ 245.589126] tg3 0000:01:00.0 enp1s0: 0x00002200: 0x00000009,
0x00000000, 0x00000000, 0x00000000
[ 245.590057] tg3 0000:01:00.0 enp1s0: 0x00002400: 0x00000012,
0x00000000, 0x00000000, 0x00000000
[ 245.592058] tg3 0000:01:00.0 enp1s0: 0x00002450: 0x00000000,
0x74b1c000, 0x02000000, 0x00006000
[ 245.594128] tg3 0000:01:00.0 enp1s0: 0x00002470: 0x00000000,
0x00000007, 0x00000000, 0x00000000
[ 245.595057] tg3 0000:01:00.0 enp1s0: 0x000024c0: 0x08000001,
0x00000000, 0x00000000, 0x00000000
[ 245.597062] tg3 0000:01:00.0 enp1s0: 0x00002800: 0x00000006,
0x00000000, 0x00000000, 0x00000000
[ 245.599128] tg3 0000:01:00.0 enp1s0: 0x00002c00: 0x00000006,
0x00000000, 0x00000000, 0x0000007f
[ 245.600064] tg3 0000:01:00.0 enp1s0: 0x00002c10: 0x00000000,
0x00000000, 0x00000008, 0x00000000
[ 245.602066] tg3 0000:01:00.0 enp1s0: 0x00002c20: 0x00000001,
0x00000000, 0x00000000, 0x00000000
[ 245.603057] tg3 0000:01:00.0 enp1s0: 0x00003000: 0x00000006,
0x00000000, 0x00000000, 0x0000007f
[ 245.605064] tg3 0000:01:00.0 enp1s0: 0x00003c00: 0x00000302,
0x00000000, 0x00000014, 0x00000048
[ 245.606056] tg3 0000:01:00.0 enp1s0: 0x00003c10: 0x00000005,
0x00000035, 0x00000000, 0x00000000
[ 245.608059] tg3 0000:01:00.0 enp1s0: 0x00003c20: 0x00000005,
0x00000005, 0x00000000, 0x00000000
[ 245.610128] tg3 0000:01:00.0 enp1s0: 0x00003c30: 0x00000000,
0x00000000, 0x00000000, 0x73e53000
[ 245.611065] tg3 0000:01:00.0 enp1s0: 0x00003c40: 0x00000000,
0x00000b00, 0x00000000, 0x00000000
[ 245.613128] tg3 0000:01:00.0 enp1s0: 0x00003c50: 0x00000000,
0x00000007, 0x00000000, 0x00000000
[ 245.614060] tg3 0000:01:00.0 enp1s0: 0x00003c80: 0x00000007,
0x00000000, 0x00000000, 0x00000000
[ 245.616070] tg3 0000:01:00.0 enp1s0: 0x00003cc0: 0x0000000c,
0x00000000, 0x00000000, 0x00000000
[ 245.617061] tg3 0000:01:00.0 enp1s0: 0x00004000: 0x00000002,
0x00000000, 0x001ebffc, 0x001635cf
[ 245.619063] tg3 0000:01:00.0 enp1s0: 0x00004010: 0x004186a0,
0x00236012, 0x00800440, 0x01001862
[ 245.620062] tg3 0000:01:00.0 enp1s0: 0x00004020: 0x0020eb11,
0x00000000, 0x00000010, 0x00000000
[ 245.622067] tg3 0000:01:00.0 enp1s0: 0x00004030: 0x00000010,
0x00000030, 0x00000000, 0x00000000
[ 245.623060] tg3 0000:01:00.0 enp1s0: 0x00004040: 0x00000000,
0x00000000, 0x01081620, 0x00000000
[ 245.625060] tg3 0000:01:00.0 enp1s0: 0x00004050: 0x00000000,
0x00000000, 0x00236010, 0x00413002
[ 245.626060] tg3 0000:01:00.0 enp1s0: 0x00004060: 0x00419000,
0x00000000, 0x00000000, 0x00000000
[ 245.628066] tg3 0000:01:00.0 enp1s0: 0x00004400: 0x00000006,
0x00000000, 0x00011080, 0x0000df80
[ 245.630096] tg3 0000:01:00.0 enp1s0: 0x00004410: 0x00000000,
0x00000010, 0x00000060, 0x00000000
[ 245.631066] tg3 0000:01:00.0 enp1s0: 0x00004420: 0x0000003d,
0x00000000, 0x00000000, 0x00000000
[ 245.633129] tg3 0000:01:00.0 enp1s0: 0x00004440: 0x00000000,
0x00000000, 0x00000000, 0x0180601a
[ 245.634062] tg3 0000:01:00.0 enp1s0: 0x00004450: 0x000103be,
0x00160017, 0x00000000, 0x00000000
[ 245.636063] tg3 0000:01:00.0 enp1s0: 0x00004800: 0x080303fe,
0x00000010, 0x01000019, 0x00000020
[ 245.637054] tg3 0000:01:00.0 enp1s0: 0x00004810: 0x00000011,
0x00000004, 0x00000000, 0x00000004
[ 245.639064] tg3 0000:01:00.0 enp1s0: 0x00004820: 0x00100019,
0x00000000, 0xa0a40010, 0x752260c0
[ 245.641128] tg3 0000:01:00.0 enp1s0: 0x00004830: 0x00000010,
0x000000e4, 0x000000e4, 0x00000000
[ 245.642055] tg3 0000:01:00.0 enp1s0: 0x00004840: 0x0000000c,
0x0000000c, 0x030e2200, 0x192a3f66
[ 245.644065] tg3 0000:01:00.0 enp1s0: 0x00004850: 0xffff7da7,
0x83b1004d, 0x13181318, 0x00000000
[ 245.646128] tg3 0000:01:00.0 enp1s0: 0x00004c00: 0x000003fe,
0x00000000, 0x00000000, 0x00000000
[ 245.647075] tg3 0000:01:00.0 enp1s0: 0x00004c10: 0x00000000,
0x00000000, 0x00000006, 0x00000000
[ 245.649061] tg3 0000:01:00.0 enp1s0: 0x00004c20: 0x00000000,
0x00000000, 0x00000000, 0x00000006
[ 245.651079] tg3 0000:01:00.0 enp1s0: 0x00004c30: 0x00000000,
0x00000000, 0x00000056, 0x00000056
[ 245.653047] tg3 0000:01:00.0 enp1s0: 0x00004c40: 0x00000000,
0x73e53000, 0x00010020, 0x00000020
[ 245.654055] tg3 0000:01:00.0 enp1s0: 0x00005000: 0x00009800,
0x80004000, 0x00000000, 0x00000000
[ 245.656071] tg3 0000:01:00.0 enp1s0: 0x00005010: 0x00000000,
0x00000000, 0x00000000, 0x0001081c
[ 245.658101] tg3 0000:01:00.0 enp1s0: 0x00005020: 0x8fb20018,
0x00000000, 0x00000000, 0x40000020
[ 245.659056] tg3 0000:01:00.0 enp1s0: 0x00005030: 0x00000000,
0x0000001d, 0x00000000, 0x00000000
[ 245.661063] tg3 0000:01:00.0 enp1s0: 0x00005040: 0x00000000,
0x00000000, 0x00010dd2, 0x00000000
[ 245.663067] tg3 0000:01:00.0 enp1s0: 0x00005200: 0x00000000,
0xb49a89ab, 0xfffeffff, 0x00000000
[ 245.665055] tg3 0000:01:00.0 enp1s0: 0x00005210: 0x0001ff90,
0x00010820, 0x00000000, 0x0001ff90
[ 245.667060] tg3 0000:01:00.0 enp1s0: 0x00005220: 0xb49a89ab,
0x00000000, 0xb49a89ab, 0x00000000
[ 245.669071] tg3 0000:01:00.0 enp1s0: 0x00005230: 0x00000000,
0x00000000, 0x0001ff90, 0x00000000
[ 245.671066] tg3 0000:01:00.0 enp1s0: 0x00005240: 0x0001ff90,
0x00000000, 0x00000000, 0x00000000
[ 245.673101] tg3 0000:01:00.0 enp1s0: 0x00005250: 0x00000000,
0x0001ff90, 0x00000000, 0x0001ff90
[ 245.675092] tg3 0000:01:00.0 enp1s0: 0x00005260: 0x00000000,
0x0001ff90, 0x00000000, 0xb49a89ab
[ 245.677079] tg3 0000:01:00.0 enp1s0: 0x00005270: 0x00000000,
0x0001ff90, 0x00010820, 0x00000000
[ 245.679067] tg3 0000:01:00.0 enp1s0: 0x00005800: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.681070] tg3 0000:01:00.0 enp1s0: 0x00005810: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.683076] tg3 0000:01:00.0 enp1s0: 0x00005820: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.685072] tg3 0000:01:00.0 enp1s0: 0x00005830: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.687070] tg3 0000:01:00.0 enp1s0: 0x00005840: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.689073] tg3 0000:01:00.0 enp1s0: 0x00005850: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.691074] tg3 0000:01:00.0 enp1s0: 0x00005860: 0x00000000,
0x00000001, 0x00000000, 0x000000ce
[ 245.693070] tg3 0000:01:00.0 enp1s0: 0x00005870: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.695068] tg3 0000:01:00.0 enp1s0: 0x00005880: 0x00000000,
0x00000006, 0x00000000, 0x00000001
[ 245.697060] tg3 0000:01:00.0 enp1s0: 0x00005890: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.699057] tg3 0000:01:00.0 enp1s0: 0x000058a0: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.701053] tg3 0000:01:00.0 enp1s0: 0x000058b0: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.703058] tg3 0000:01:00.0 enp1s0: 0x000058c0: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.705128] tg3 0000:01:00.0 enp1s0: 0x000058d0: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.707128] tg3 0000:01:00.0 enp1s0: 0x000058e0: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.708075] tg3 0000:01:00.0 enp1s0: 0x000058f0: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.710063] tg3 0000:01:00.0 enp1s0: 0x00005900: 0x00000000,
0x0000000f, 0x00000000, 0x00000001
[ 245.712060] tg3 0000:01:00.0 enp1s0: 0x00005910: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.714068] tg3 0000:01:00.0 enp1s0: 0x00005920: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.716073] tg3 0000:01:00.0 enp1s0: 0x00005930: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.718052] tg3 0000:01:00.0 enp1s0: 0x00005940: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.720081] tg3 0000:01:00.0 enp1s0: 0x00005950: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.722047] tg3 0000:01:00.0 enp1s0: 0x00005960: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.723070] tg3 0000:01:00.0 enp1s0: 0x00005970: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.725063] tg3 0000:01:00.0 enp1s0: 0x00005980: 0x00000000,
0x0000000c, 0x00000000, 0x00000001
[ 245.727055] tg3 0000:01:00.0 enp1s0: 0x00005990: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.729047] tg3 0000:01:00.0 enp1s0: 0x000059a0: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.730074] tg3 0000:01:00.0 enp1s0: 0x000059b0: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.732060] tg3 0000:01:00.0 enp1s0: 0x000059c0: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.734095] tg3 0000:01:00.0 enp1s0: 0x000059d0: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.735079] tg3 0000:01:00.0 enp1s0: 0x000059e0: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.737056] tg3 0000:01:00.0 enp1s0: 0x000059f0: 0x00000000,
0x00000001, 0x00000000, 0x00000001
[ 245.738058] tg3 0000:01:00.0 enp1s0: 0x00005a00: 0x00012001,
0x00000000, 0x00010000, 0x00000000
[ 245.740054] tg3 0000:01:00.0 enp1s0: 0x00006000: 0x00000002,
0x00000000, 0x00000000, 0x00000000
[ 245.742127] tg3 0000:01:00.0 enp1s0: 0x00006800: 0x04130034,
0x3c085082, 0x01009509, 0x0c502cdc
[ 245.743051] tg3 0000:01:00.0 enp1s0: 0x00006810: 0x81120000,
0xffffffff, 0x00000000, 0x00000000
[ 245.745067] tg3 0000:01:00.0 enp1s0: 0x00006830: 0xfffc3ccf,
0xfffc0fff, 0x00000000, 0x00000000
[ 245.746066] tg3 0000:01:00.0 enp1s0: 0x00006840: 0x00000024,
0x00000000, 0x00000000, 0x00000000
[ 245.748061] tg3 0000:01:00.0 enp1s0: 0x00006c00: 0x00000040,
0x00000000, 0x00000000, 0x00000000
[ 245.749074] tg3 0000:01:00.0 enp1s0: 0x00006c40: 0x00000000,
0x000f0000, 0x00000000, 0x000000f7
[ 245.751054] tg3 0000:01:00.0 enp1s0: 0x00006c50: 0x00000000,
0x00000000, 0x000000f7, 0x00000000
[ 245.753096] tg3 0000:01:00.0 enp1s0: 0x00007000: 0x00000188,
0x00000000, 0x00000000, 0x000000c0
[ 245.754055] tg3 0000:01:00.0 enp1s0: 0x00007010: 0x0a000064,
0x02008273, 0x00570081, 0x68848353
[ 245.756109] tg3 0000:01:00.0 enp1s0: 0x00007020: 0x00000000,
0x00000000, 0xaf000400, 0x00000000
[ 245.757052] tg3 0000:01:00.0 enp1s0: 0x00007400: 0x00000000,
0x000000aa, 0x00000000, 0x00000000
[ 245.759057] tg3 0000:01:00.0 enp1s0: 0x00007800: 0x00000000,
0x00000000, 0x00000001, 0x00000000
[ 245.760055] tg3 0000:01:00.0 enp1s0: 0x00007810: 0x00000000,
0x00000060, 0x0000000d, 0x00000000
[ 245.762065] tg3 0000:01:00.0 enp1s0: 0: Host status block
[00000005:00000016:(0000:0007:0000):(0007:000c)]
[ 245.764128] tg3 0000:01:00.0 enp1s0: 0: NAPI info
[00000015:00000015:(000f:000c:01ff):0006:(00ce:0000:0000:0000)]
[ 245.867391] tg3 0000:01:00.0: tg3_stop_block timed out, ofs=2c00
enable_bit=2
[ 245.971098] tg3 0000:01:00.0: tg3_stop_block timed out, ofs=4800
enable_bit=2
[ 245.993343] tg3 0000:01:00.0 enp1s0: Link is down
[ 249.731158] tg3 0000:01:00.0 enp1s0: Link is up at 1000 Mbps, full
duplex
[ 249.731336] tg3 0000:01:00.0 enp1s0: Flow control is on for TX and
on for RX
[ 254.944022] ------------[ cut here ]------------
[ 254.945010] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:316
dev_watchdog+0x221/0x230
[ 254.945010] NETDEV WATCHDOG: enp1s0 (tg3): transmit queue 0 timed
out
[ 254.945010] Modules linked in: xt_CHECKSUM ipt_MASQUERADE
nf_nat_masquerade_ipv4 tun fuse nf_conntrack_netbios_ns
nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge
ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
ip6table_mangle ip6table_raw ip6table_security iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter
ebtables ip6table_filter ip6_tables powernow_k8 amd64_edac_mod
edac_mce_amd edac_core kvm_amd kvm irqbypass amdkfd amd_iommu_v2 dcdbas
radeon k8temp ipmi_ssif i2c_algo_bit ttm drm_kms_helper drm ipmi_si
ipmi_devintf tpm_tis ipmi_msghandler tpm_tis_core tpm i2c_piix4 shpchp
nls_utf8 isofs squashfs ata_generic
[ 254.945010] pata_acpi uas usb_storage 8021q garp stp llc mrp
serio_raw tg3 sata_sil24 ptp pps_core pata_serverworks sunrpc
scsi_transport_iscsi loop
[ 254.945010] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.8-
300.fc26.x86_64 #1
[ 254.945010] Hardware name: Dell Inc. PowerEdge SC1435/0YR707, BIOS
2.2.5 03/21/2008
[ 254.945010] Call Trace:
[ 254.945010] <IRQ>
[ 254.945010] dump_stack+0x63/0x84
[ 254.945010] __warn+0xcb/0xf0
[ 254.945010] warn_slowpath_fmt+0x5a/0x80
[ 254.945010] dev_watchdog+0x221/0x230
[ 254.945010] ? qdisc_rcu_free+0x50/0x50
[ 254.945010] call_timer_fn+0x33/0x130
[ 254.945010] run_timer_softirq+0x3ee/0x440
[ 254.945010] ? ktime_get+0x40/0xb0
[ 254.945010] ? lapic_next_event+0x1d/0x30
[ 254.945010] __do_softirq+0xea/0x2e3
[ 254.945010] irq_exit+0xfb/0x100
[ 254.945010] smp_apic_timer_interrupt+0x3d/0x50
[ 254.945010] apic_timer_interrupt+0x93/0xa0
[ 254.945010] RIP: 0010:native_safe_halt+0x6/0x10
[ 254.945010] RSP: 0018:ffffbabf4038be60 EFLAGS: 00000246 ORIG_RAX:
ffffffffffffff10
[ 254.945010] RAX: 6874754100002d40 RBX: ffff926d39d04880 RCX:
0000000000000000
[ 254.945010] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
0000000000000000
[ 254.945010] RBP: ffffbabf4038be60 R08: ffff926d3d052ae0 R09:
0000000000000000
[ 254.945010] R10: 0000000000000000 R11: 0000000000000000 R12:
0000000000000001
[ 254.945010] R13: ffff926d39d04880 R14: 0000000000000000 R15:
0000000000000000
[ 254.945010] </IRQ>
[ 254.945010] default_idle+0x20/0xe0
[ 254.945010] amd_e400_idle+0x3f/0x50
[ 254.945010] arch_cpu_idle+0xf/0x20
[ 254.945010] default_idle_call+0x23/0x30
[ 254.945010] do_idle+0x170/0x200
[ 254.945010] cpu_startup_entry+0x71/0x80
[ 254.945010] start_secondary+0x154/0x190
[ 254.945010] start_cpu+0x14/0x14
[ 255.080137] ---[ end trace 25a535e6d8610c90 ]---
[ 255.080137] tg3 0000:01:00.0 enp1s0: transmit timed out, resetting
* Re: netdev watchdog enp1s0 (tg3): transmit queue 0 timed out
From: Michael Chan @ 2017-09-01 6:36 UTC (permalink / raw)
To: Frans van Berckel, Siva Reddy Kallam; +Cc: linux-netdev
In-Reply-To: <1504246208.2050.15.camel@xs4all.nl>
On Thu, Aug 31, 2017 at 11:10 PM, Frans van Berckel <fberckel@xs4all.nl> wrote:
> Dear NetDev Team,
>
> I am new to this machine, a Dell PowerEdge SC1435 that I bought on a
> marketplace website. Booting a current Fedora (or even Debian) amd64
> Live CD from USB goes fine.
>
> [ 0.000000] Linux version 4.11.8-300.fc26.x86_64 (mockbuild@bkernel0
> 2.phx2.fedoraproject.org) (gcc version 7.1.1 20170622 (Red Hat 7.1.1-
> 3) (GCC) ) #1 SMP Thu Jun 29 20:09:48 UTC 2017
>
> That is, until I plugged in the Ethernet cable for the first time. I got
> console output that worries me a bit. It concerns the driver for the
> BCM95721: the kernel reports a DMA Status error, and then the netdev
> watchdog fires for enp1s0 (tg3): transmit queue 0 timed out.
>
> [ 237.169194] tg3 0000:01:00.0 enp1s0: Link is up at 1000 Mbps, full duplex
> [ 237.169335] tg3 0000:01:00.0 enp1s0: Flow control is on for TX and on for RX
> [ 237.169375] IPv6: ADDRCONF(NETDEV_CHANGE): enp1s0: link becomes ready
> [ 243.683910] tg3 0000:01:00.0 enp1s0: DMA Status error. Resetting chip.
> [ 243.759610] hrtimer: interrupt took 9464192 ns
> [ 245.317566] tg3 0000:01:00.0 enp1s0: 0x00000000: 0x165914e4, 0x00100406, 0x02000021, 0x00000010
>
> <snip> a long list of what are likely the same type of error codes.
>
Please post the entire register dump.
* [PATCH net-next 3/3] bpf: Only set node->ref = 1 if it has not been set
From: Martin KaFai Lau @ 2017-09-01 6:27 UTC (permalink / raw)
To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20170901062713.1842249-1-kafai@fb.com>
This patch writes 'node->ref = 1' only if node->ref is 0.
The number of lookups/s for a ~1M-entry LRU map increased by
~30% (260097 to 343313).
The other writes of 'node->ref = 0' are not changed. In those cases, the
same cache line has to be changed anyway.
First column: Size of the LRU hash
Second column: Number of lookups/s
Before:
> echo "$((2**20+1)): $(./map_perf_test 1024 1 $((2**20+1)) 10000000 | awk '{print $3}')"
1048577: 260097
After:
> echo "$((2**20+1)): $(./map_perf_test 1024 1 $((2**20+1)) 10000000 | awk '{print $3}')"
1048577: 343313
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
kernel/bpf/bpf_lru_list.h | 3 ++-
kernel/bpf/hashtab.c | 7 ++++++-
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/bpf_lru_list.h b/kernel/bpf/bpf_lru_list.h
index 5c35a98d02bf..7d4f89b7cb84 100644
--- a/kernel/bpf/bpf_lru_list.h
+++ b/kernel/bpf/bpf_lru_list.h
@@ -69,7 +69,8 @@ static inline void bpf_lru_node_set_ref(struct bpf_lru_node *node)
/* ref is an approximation on access frequency. It does not
* have to be very accurate. Hence, no protection is used.
*/
- node->ref = 1;
+ if (!node->ref)
+ node->ref = 1;
}
int bpf_lru_init(struct bpf_lru *lru, bool percpu, u32 hash_offset,
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 682f4543fefa..431126f31ea3 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -519,9 +519,14 @@ static u32 htab_lru_map_gen_lookup(struct bpf_map *map,
{
struct bpf_insn *insn = insn_buf;
const int ret = BPF_REG_0;
+ const int ref_reg = BPF_REG_1;
*insn++ = BPF_EMIT_CALL((u64 (*)(u64, u64, u64, u64, u64))__htab_map_lookup_elem);
- *insn++ = BPF_JMP_IMM(BPF_JEQ, ret, 0, 2);
+ *insn++ = BPF_JMP_IMM(BPF_JEQ, ret, 0, 4);
+ *insn++ = BPF_LDX_MEM(BPF_B, ref_reg, ret,
+ offsetof(struct htab_elem, lru_node) +
+ offsetof(struct bpf_lru_node, ref));
+ *insn++ = BPF_JMP_IMM(BPF_JNE, ref_reg, 0, 1);
*insn++ = BPF_ST_MEM(BPF_B, ret,
offsetof(struct htab_elem, lru_node) +
offsetof(struct bpf_lru_node, ref),
--
2.9.5
* [PATCH net-next 2/3] bpf: Inline LRU map lookup
From: Martin KaFai Lau @ 2017-09-01 6:27 UTC (permalink / raw)
To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20170901062713.1842249-1-kafai@fb.com>
Inline the LRU map lookup to save the cost of calling
bpf_map_lookup_elem() and htab_lru_map_lookup_elem().
Different LRU hash sizes are tested. The benefit diminishes when
cache misses start to dominate in the bigger LRU hashes.
Considering that the change is simple, it is still worth optimizing.
First column: Size of the LRU hash
Second column: Number of lookups/s
Before:
> for i in $(seq 9 20); do echo "$((2**i+1)): $(./map_perf_test 1024 1 $((2**i+1)) 10000000 | awk '{print $3}')"; done
513: 1132020
1025: 1056826
2049: 1007024
4097: 853298
8193: 742723
16385: 712600
32769: 688142
65537: 677028
131073: 619437
262145: 498770
524289: 316695
1048577: 260038
After:
> for i in $(seq 9 20); do echo "$((2**i+1)): $(./map_perf_test 1024 1 $((2**i+1)) 10000000 | awk '{print $3}')"; done
513: 1221851
1025: 1144695
2049: 1049902
4097: 884460
8193: 773731
16385: 729673
32769: 721989
65537: 715530
131073: 671665
262145: 516987
524289: 321125
1048577: 260048
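As a rough illustration of what the inlined instruction sequence in this patch does, the following plain-C model may help. It is only a sketch: `struct elem_model`, `struct lru_node_model`, `round_up8()` and `inlined_lookup_model()` are hypothetical stand-ins (not kernel code), and the `found` argument plays the role of the pointer returned by `__htab_map_lookup_elem()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct lru_node_model {
	uint8_t ref;
};

struct elem_model {
	struct lru_node_model lru_node;
	/* Key bytes, followed by the value at the next 8-byte boundary. */
	_Alignas(8) unsigned char key[16];
};

/* Round n up to the next multiple of 8, like round_up(n, 8). */
static size_t round_up8(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/*
 * Model of the emitted sequence:
 *   - skip everything if the lookup returned NULL (the BPF_JEQ jump),
 *   - store 1 into lru_node.ref (the BPF_ST_MEM byte store),
 *   - advance the pointer past the padded key (the BPF_ADD).
 */
static void *inlined_lookup_model(struct elem_model *found, size_t key_size)
{
	if (found == NULL)
		return NULL;

	/* The model only has room for a small key plus a value. */
	assert(round_up8(key_size) < sizeof(found->key));

	found->lru_node.ref = 1;
	return found->key + round_up8(key_size);
}
```

With a 4-byte key, the model returns a pointer 8 bytes past the start of `key` and leaves `ref` set to 1; a NULL lookup result is passed through untouched.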
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
kernel/bpf/hashtab.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index d246905f2bb1..682f4543fefa 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -514,6 +514,24 @@ static void *htab_lru_map_lookup_elem(struct bpf_map *map, void *key)
return NULL;
}
+static u32 htab_lru_map_gen_lookup(struct bpf_map *map,
+ struct bpf_insn *insn_buf)
+{
+ struct bpf_insn *insn = insn_buf;
+ const int ret = BPF_REG_0;
+
+ *insn++ = BPF_EMIT_CALL((u64 (*)(u64, u64, u64, u64, u64))__htab_map_lookup_elem);
+ *insn++ = BPF_JMP_IMM(BPF_JEQ, ret, 0, 2);
+ *insn++ = BPF_ST_MEM(BPF_B, ret,
+ offsetof(struct htab_elem, lru_node) +
+ offsetof(struct bpf_lru_node, ref),
+ 1);
+ *insn++ = BPF_ALU64_IMM(BPF_ADD, ret,
+ offsetof(struct htab_elem, key) +
+ round_up(map->key_size, 8));
+ return insn - insn_buf;
+}
+
/* It is called from the bpf_lru_list when the LRU needs to delete
* older elements from the htab.
*/
@@ -1137,6 +1155,7 @@ const struct bpf_map_ops htab_lru_map_ops = {
.map_lookup_elem = htab_lru_map_lookup_elem,
.map_update_elem = htab_lru_map_update_elem,
.map_delete_elem = htab_lru_map_delete_elem,
+ .map_gen_lookup = htab_lru_map_gen_lookup,
};
/* Called from eBPF program */
--
2.9.5
* [PATCH net-next 1/3] bpf: Add lru_hash_lookup performance test
From: Martin KaFai Lau @ 2017-09-01 6:27 UTC (permalink / raw)
To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20170901062713.1842249-1-kafai@fb.com>
Create a new case to test the LRU lookup performance.
At the beginning, the LRU map is fully loaded (i.e. the number of keys
is equal to map->max_entries). The lookup is done through key 0
to num_map_entries and then repeats from 0 again.
This patch also creates an anonymous struct to properly
name the test params in stress_lru_hmap_alloc() in map_perf_test_kern.c.
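The key walk described above (each event looks up 32 consecutive keys, then the next event starts 32 keys later and wraps back to 0) can be sketched in isolation. This is a minimal model under the assumption that the map holds at least 32 entries; `next_start_key()` and `LOOKUPS_PER_EVENT` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of lookups done per event in the test (32 in this patch). */
#define LOOKUPS_PER_EVENT 32u

/*
 * Return the first key for the next event, given the first key of the
 * current one. The walk advances in steps of 32 and restarts from 0
 * once fewer than 32 untouched keys would remain.
 */
static uint32_t next_start_key(uint32_t cur, uint32_t entries)
{
	/* Assumption of this sketch: the map has at least 32 entries. */
	assert(entries >= LOOKUPS_PER_EVENT);

	if (cur < entries - LOOKUPS_PER_EVENT)
		return cur + LOOKUPS_PER_EVENT;
	return 0;
}
```

For a 96-entry map, the starting keys cycle through 0, 32, 64 and back to 0, so every key is visited once per cycle.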
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
samples/bpf/map_perf_test_kern.c | 44 +++++++++++++++++++----
samples/bpf/map_perf_test_user.c | 77 ++++++++++++++++++++++++++++++++++++++--
2 files changed, 112 insertions(+), 9 deletions(-)
diff --git a/samples/bpf/map_perf_test_kern.c b/samples/bpf/map_perf_test_kern.c
index ca3b22ed577a..098c857f1eda 100644
--- a/samples/bpf/map_perf_test_kern.c
+++ b/samples/bpf/map_perf_test_kern.c
@@ -88,6 +88,13 @@ struct bpf_map_def SEC("maps") array_map = {
.max_entries = MAX_ENTRIES,
};
+struct bpf_map_def SEC("maps") lru_hash_lookup_map = {
+ .type = BPF_MAP_TYPE_LRU_HASH,
+ .key_size = sizeof(u32),
+ .value_size = sizeof(long),
+ .max_entries = MAX_ENTRIES,
+};
+
SEC("kprobe/sys_getuid")
int stress_hmap(struct pt_regs *ctx)
{
@@ -148,12 +155,23 @@ int stress_percpu_hmap_alloc(struct pt_regs *ctx)
SEC("kprobe/sys_connect")
int stress_lru_hmap_alloc(struct pt_regs *ctx)
{
+ char fmt[] = "Failed at stress_lru_hmap_alloc. ret:%d\n";
+ union {
+ u16 dst6[8];
+ struct {
+ u16 magic0;
+ u16 magic1;
+ u16 tcase;
+ u16 unused16;
+ u32 unused32;
+ u32 key;
+ };
+ } test_params;
struct sockaddr_in6 *in6;
- u16 test_case, dst6[8];
+ u16 test_case;
int addrlen, ret;
- char fmt[] = "Failed at stress_lru_hmap_alloc. ret:%d\n";
long val = 1;
- u32 key = bpf_get_prandom_u32();
+ u32 key = 0;
in6 = (struct sockaddr_in6 *)PT_REGS_PARM2(ctx);
addrlen = (int)PT_REGS_PARM3(ctx);
@@ -161,14 +179,18 @@ int stress_lru_hmap_alloc(struct pt_regs *ctx)
if (addrlen != sizeof(*in6))
return 0;
- ret = bpf_probe_read(dst6, sizeof(dst6), &in6->sin6_addr);
+ ret = bpf_probe_read(test_params.dst6, sizeof(test_params.dst6),
+ &in6->sin6_addr);
if (ret)
goto done;
- if (dst6[0] != 0xdead || dst6[1] != 0xbeef)
+ if (test_params.magic0 != 0xdead ||
+ test_params.magic1 != 0xbeef)
return 0;
- test_case = dst6[7];
+ test_case = test_params.tcase;
+ if (test_case != 3)
+ key = bpf_get_prandom_u32();
if (test_case == 0) {
ret = bpf_map_update_elem(&lru_hash_map, &key, &val, BPF_ANY);
@@ -188,6 +210,16 @@ int stress_lru_hmap_alloc(struct pt_regs *ctx)
ret = bpf_map_update_elem(nolocal_lru_map, &key, &val,
BPF_ANY);
+ } else if (test_case == 3) {
+ u32 i;
+
+ key = test_params.key;
+
+#pragma clang loop unroll(full)
+ for (i = 0; i < 32; i++) {
+ bpf_map_lookup_elem(&lru_hash_lookup_map, &key);
+ key++;
+ }
} else {
ret = -EINVAL;
}
diff --git a/samples/bpf/map_perf_test_user.c b/samples/bpf/map_perf_test_user.c
index bccbf8478e43..f388254896f6 100644
--- a/samples/bpf/map_perf_test_user.c
+++ b/samples/bpf/map_perf_test_user.c
@@ -46,6 +46,7 @@ enum test_type {
HASH_LOOKUP,
ARRAY_LOOKUP,
INNER_LRU_HASH_PREALLOC,
+ LRU_HASH_LOOKUP,
NR_TESTS,
};
@@ -60,6 +61,7 @@ const char *test_map_names[NR_TESTS] = {
[HASH_LOOKUP] = "hash_map",
[ARRAY_LOOKUP] = "array_map",
[INNER_LRU_HASH_PREALLOC] = "inner_lru_hash_map",
+ [LRU_HASH_LOOKUP] = "lru_hash_lookup_map",
};
static int test_flags = ~0;
@@ -67,6 +69,8 @@ static uint32_t num_map_entries;
static uint32_t inner_lru_hash_size;
static int inner_lru_hash_idx = -1;
static int array_of_lru_hashs_idx = -1;
+static int lru_hash_lookup_idx = -1;
+static int lru_hash_lookup_test_entries = 32;
static uint32_t max_cnt = 1000000;
static int check_test_flags(enum test_type t)
@@ -86,6 +90,32 @@ static void test_hash_prealloc(int cpu)
cpu, max_cnt * 1000000000ll / (time_get_ns() - start_time));
}
+static int pre_test_lru_hash_lookup(int tasks)
+{
+ int fd = map_fd[lru_hash_lookup_idx];
+ uint32_t key;
+ long val = 1;
+ int ret;
+
+ if (num_map_entries > lru_hash_lookup_test_entries)
+ lru_hash_lookup_test_entries = num_map_entries;
+
+ /* Populate the lru_hash_map for LRU_HASH_LOOKUP perf test.
+ *
+ * It is fine that the user requests for a map with
+ * num_map_entries < 32 and some of the later lru hash lookup
+ * may return not found. For LRU map, we are not interested
+ * in such small map performance.
+ */
+ for (key = 0; key < lru_hash_lookup_test_entries; key++) {
+ ret = bpf_map_update_elem(fd, &key, &val, BPF_NOEXIST);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
static void do_test_lru(enum test_type test, int cpu)
{
static int inner_lru_map_fds[MAX_NR_CPUS];
@@ -135,13 +165,17 @@ static void do_test_lru(enum test_type test, int cpu)
if (test == LRU_HASH_PREALLOC) {
test_name = "lru_hash_map_perf";
- in6.sin6_addr.s6_addr16[7] = 0;
+ in6.sin6_addr.s6_addr16[2] = 0;
} else if (test == NOCOMMON_LRU_HASH_PREALLOC) {
test_name = "nocommon_lru_hash_map_perf";
- in6.sin6_addr.s6_addr16[7] = 1;
+ in6.sin6_addr.s6_addr16[2] = 1;
} else if (test == INNER_LRU_HASH_PREALLOC) {
test_name = "inner_lru_hash_map_perf";
- in6.sin6_addr.s6_addr16[7] = 2;
+ in6.sin6_addr.s6_addr16[2] = 2;
+ } else if (test == LRU_HASH_LOOKUP) {
+ test_name = "lru_hash_lookup_perf";
+ in6.sin6_addr.s6_addr16[2] = 3;
+ in6.sin6_addr.s6_addr32[3] = 0;
} else {
assert(0);
}
@@ -150,6 +184,11 @@ static void do_test_lru(enum test_type test, int cpu)
for (i = 0; i < max_cnt; i++) {
ret = connect(-1, (const struct sockaddr *)&in6, sizeof(in6));
assert(ret == -1 && errno == EBADF);
+ if (in6.sin6_addr.s6_addr32[3] <
+ lru_hash_lookup_test_entries - 32)
+ in6.sin6_addr.s6_addr32[3] += 32;
+ else
+ in6.sin6_addr.s6_addr32[3] = 0;
}
printf("%d:%s pre-alloc %lld events per sec\n",
cpu, test_name,
@@ -171,6 +210,11 @@ static void test_inner_lru_hash_prealloc(int cpu)
do_test_lru(INNER_LRU_HASH_PREALLOC, cpu);
}
+static void test_lru_hash_lookup(int cpu)
+{
+ do_test_lru(LRU_HASH_LOOKUP, cpu);
+}
+
static void test_percpu_hash_prealloc(int cpu)
{
__u64 start_time;
@@ -243,6 +287,11 @@ static void test_array_lookup(int cpu)
cpu, max_cnt * 1000000000ll * 64 / (time_get_ns() - start_time));
}
+typedef int (*pre_test_func)(int tasks);
+const pre_test_func pre_test_funcs[] = {
+ [LRU_HASH_LOOKUP] = pre_test_lru_hash_lookup,
+};
+
typedef void (*test_func)(int cpu);
const test_func test_funcs[] = {
[HASH_PREALLOC] = test_hash_prealloc,
@@ -255,8 +304,25 @@ const test_func test_funcs[] = {
[HASH_LOOKUP] = test_hash_lookup,
[ARRAY_LOOKUP] = test_array_lookup,
[INNER_LRU_HASH_PREALLOC] = test_inner_lru_hash_prealloc,
+ [LRU_HASH_LOOKUP] = test_lru_hash_lookup,
};
+static int pre_test(int tasks)
+{
+ int i;
+
+ for (i = 0; i < NR_TESTS; i++) {
+ if (pre_test_funcs[i] && check_test_flags(i)) {
+ int ret = pre_test_funcs[i](tasks);
+
+ if (ret)
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
static void loop(int cpu)
{
cpu_set_t cpuset;
@@ -277,6 +343,8 @@ static void run_perf_test(int tasks)
pid_t pid[tasks];
int i;
+ assert(!pre_test(tasks));
+
for (i = 0; i < tasks; i++) {
pid[i] = fork();
if (pid[i] == 0) {
@@ -344,6 +412,9 @@ static void fixup_map(struct bpf_map_data *map, int idx)
array_of_lru_hashs_idx = idx;
}
+ if (!strcmp("lru_hash_lookup_map", map->name))
+ lru_hash_lookup_idx = idx;
+
if (num_map_entries <= 0)
return;
--
2.9.5
* [PATCH net-next 0/3] bpf: Improve LRU map lookup performance
From: Martin KaFai Lau @ 2017-09-01 6:27 UTC (permalink / raw)
To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team
This patchset improves the lookup performance of the LRU map.
Please see individual patch for details.
Martin KaFai Lau (3):
bpf: Add lru_hash_lookup performance test
bpf: Inline LRU map lookup
bpf: Only set node->ref = 1 if it has not been set
kernel/bpf/bpf_lru_list.h | 3 +-
kernel/bpf/hashtab.c | 24 +++++++++++++
samples/bpf/map_perf_test_kern.c | 44 +++++++++++++++++++----
samples/bpf/map_perf_test_user.c | 77 ++++++++++++++++++++++++++++++++++++++--
4 files changed, 138 insertions(+), 10 deletions(-)
--
2.9.5
* netdev watchdog enp1s0 (tg3): transmit queue 0 timed out
From: Frans van Berckel @ 2017-09-01 6:10 UTC (permalink / raw)
To: linux-netdev
Dear NetDev Team,
I am new to this machine, a Dell PowerEdge SC1435 that I bought on a
marketplace website. Booting a current Fedora (or even Debian) amd64
Live CD from USB goes fine.
[ 0.000000] Linux version 4.11.8-300.fc26.x86_64 (mockbuild@bkernel0
2.phx2.fedoraproject.org) (gcc version 7.1.1 20170622 (Red Hat 7.1.1-
3) (GCC) ) #1 SMP Thu Jun 29 20:09:48 UTC 2017
That is, until I plugged in the Ethernet cable for the first time. I got
console output that worries me a bit. It concerns the driver for the
BCM95721: the kernel reports a DMA Status error, and then the netdev
watchdog fires for enp1s0 (tg3): transmit queue 0 timed out.
[ 237.169194] tg3 0000:01:00.0 enp1s0: Link is up at 1000 Mbps, full duplex
[ 237.169335] tg3 0000:01:00.0 enp1s0: Flow control is on for TX and on for RX
[ 237.169375] IPv6: ADDRCONF(NETDEV_CHANGE): enp1s0: link becomes ready
[ 243.683910] tg3 0000:01:00.0 enp1s0: DMA Status error. Resetting chip.
[ 243.759610] hrtimer: interrupt took 9464192 ns
[ 245.317566] tg3 0000:01:00.0 enp1s0: 0x00000000: 0x165914e4, 0x00100406, 0x02000021, 0x00000010
<snip> a long list of what are likely the same type of error codes.
[ 245.760055] tg3 0000:01:00.0 enp1s0: 0x00007810: 0x00000000,
0x00000060, 0x0000000d, 0x00000000
[ 245.762065] tg3 0000:01:00.0 enp1s0: 0: Host status block
[00000005:00000016:(0000:0007:0000):(0007:000c)]
[ 245.764128] tg3 0000:01:00.0 enp1s0: 0: NAPI info
[00000015:00000015:(000f:000c:01ff):0006:(00ce:0000:0000:0000)]
[ 245.867391] tg3 0000:01:00.0: tg3_stop_block timed out, ofs=2c00
enable_bit=2
[ 245.971098] tg3 0000:01:00.0: tg3_stop_block timed out, ofs=4800
enable_bit=2
[ 245.993343] tg3 0000:01:00.0 enp1s0: Link is down
[ 249.731158] tg3 0000:01:00.0 enp1s0: Link is up at 1000 Mbps, full
duplex
[ 249.731336] tg3 0000:01:00.0 enp1s0: Flow control is on for TX and
on for RX
[ 254.944022] ------------[ cut here ]------------
[ 254.945010] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:316
dev_watchdog+0x221/0x230
[ 254.945010] NETDEV WATCHDOG: enp1s0 (tg3): transmit queue 0 timed
out
[ 254.945010] Modules linked in: xt_CHECKSUM ipt_MASQUERADE
nf_nat_masquerade_ipv4 tun fuse nf_conntrack_netbios_ns
nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge
ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
ip6table_mangle ip6table_raw ip6table_security iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter
ebtables ip6table_filter ip6_tables powernow_k8 amd64_edac_mod
edac_mce_amd edac_core kvm_amd kvm irqbypass amdkfd amd_iommu_v2 dcdbas
radeon k8temp ipmi_ssif i2c_algo_bit ttm drm_kms_helper drm ipmi_si
ipmi_devintf tpm_tis ipmi_msghandler tpm_tis_core tpm i2c_piix4 shpchp
nls_utf8 isofs squashfs ata_generic
[ 254.945010] pata_acpi uas usb_storage 8021q garp stp llc mrp
serio_raw tg3 sata_sil24 ptp pps_core pata_serverworks sunrpc
scsi_transport_iscsi loop
[ 254.945010] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.8-
300.fc26.x86_64 #1
[ 254.945010] Hardware name: Dell Inc. PowerEdge SC1435/0YR707, BIOS
2.2.5 03/21/2008
[ 254.945010] Call Trace:
[ 254.945010] <IRQ>
[ 254.945010] dump_stack+0x63/0x84
[ 254.945010] __warn+0xcb/0xf0
[ 254.945010] warn_slowpath_fmt+0x5a/0x80
[ 254.945010] dev_watchdog+0x221/0x230
[ 254.945010] ? qdisc_rcu_free+0x50/0x50
[ 254.945010] call_timer_fn+0x33/0x130
[ 254.945010] run_timer_softirq+0x3ee/0x440
[ 254.945010] ? ktime_get+0x40/0xb0
[ 254.945010] ? lapic_next_event+0x1d/0x30
[ 254.945010] __do_softirq+0xea/0x2e3
[ 254.945010] irq_exit+0xfb/0x100
[ 254.945010] smp_apic_timer_interrupt+0x3d/0x50
[ 254.945010] apic_timer_interrupt+0x93/0xa0
[ 254.945010] RIP: 0010:native_safe_halt+0x6/0x10
[ 254.945010] RSP: 0018:ffffbabf4038be60 EFLAGS: 00000246 ORIG_RAX:
ffffffffffffff10
[ 254.945010] RAX: 6874754100002d40 RBX: ffff926d39d04880 RCX:
0000000000000000
[ 254.945010] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
0000000000000000
[ 254.945010] RBP: ffffbabf4038be60 R08: ffff926d3d052ae0 R09:
0000000000000000
[ 254.945010] R10: 0000000000000000 R11: 0000000000000000 R12:
0000000000000001
[ 254.945010] R13: ffff926d39d04880 R14: 0000000000000000 R15:
0000000000000000
[ 254.945010] </IRQ>
[ 254.945010] default_idle+0x20/0xe0
[ 254.945010] amd_e400_idle+0x3f/0x50
[ 254.945010] arch_cpu_idle+0xf/0x20
[ 254.945010] default_idle_call+0x23/0x30
[ 254.945010] do_idle+0x170/0x200
[ 254.945010] cpu_startup_entry+0x71/0x80
[ 254.945010] start_secondary+0x154/0x190
[ 254.945010] start_cpu+0x14/0x14
[ 255.080137] ---[ end trace 25a535e6d8610c90 ]---
[ 255.080137] tg3 0000:01:00.0 enp1s0: transmit timed out, resetting
And resetting is what it does again and again, until you take out the
UTP cable. The same goes for the second Ethernet port. Is there
something in the machine's BIOS, or a setting that I can change? Or did
I find a driver bug? And do I need to find out what's wrong?
Thanks,
Frans van Berckel
* Re: [PATCH 2/3] security: bpf: Add eBPF LSM hooks and security field to eBPF map
From: Jeffrey Vander Stoep @ 2017-09-01 5:50 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Chenbo Feng, Daniel Borkmann, LSM List, netdev, SELinux,
Lorenzo Colitti, Chenbo Feng
In-Reply-To: <20170901020520.uifv6b7tvelgxumf@ast-mbp>
On Thu, Aug 31, 2017 at 7:05 PM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Thu, Aug 31, 2017 at 01:56:34PM -0700, Chenbo Feng wrote:
>> From: Chenbo Feng <fengc@google.com>
>>
>> Introduce a pointer into struct bpf_map to hold the security information
>> about the map. The actual security struct varies based on the security
>> models implemented. Place the LSM hooks before each of the unrestricted
>> eBPF operations, the map_update_elem and map_delete_elem operations are
>> checked by security_map_modify. The map_lookup_elem and map_get_next_key
> >> operations are checked by security_map_read.
>>
>> Signed-off-by: Chenbo Feng <fengc@google.com>
>
> ...
>
>> @@ -410,6 +418,10 @@ static int map_lookup_elem(union bpf_attr *attr)
>> if (IS_ERR(map))
>> return PTR_ERR(map);
>>
>> + err = security_map_read(map);
>> + if (err)
>> + return -EACCES;
>> +
>> key = memdup_user(ukey, map->key_size);
>> if (IS_ERR(key)) {
>> err = PTR_ERR(key);
>> @@ -490,6 +502,10 @@ static int map_update_elem(union bpf_attr *attr)
>> if (IS_ERR(map))
>> return PTR_ERR(map);
>>
>> + err = security_map_modify(map);
>
> I don't feel these extra hooks are really thought through.
> With such hook you'll disallow map_update for given map. That's it.
> The key/values etc won't be used in such security decision.
> In such case you don't need such hooks in update/lookup at all.
> Only in map_creation and object_get calls where FD can be received.
> In other words I suggest to follow standard unix practices:
> Do permissions checks in open() and allow read/write() if FD is valid.
> Same here. Do permission checks in prog_load/map_create/obj_pin/get
> and that will be enough to jail bpf subsystem.
> bpf cmds that need to be fast (like lookup and update) should not
> have security hooks.
>
I do think we want to distinguish between read/write (or read/modify)
for these objects. Essentially, we want to implement the example
described in patch 0/3 where eBPF objects can be passed to less
privileged processes which can read, but not modify the map. What
would be the best way to do this? Add a mode field to the bpf_map
struct?
* RE: netdev carrier changes is one even after ethernet link up.
From: Bhadram Varka @ 2017-09-01 5:49 UTC (permalink / raw)
To: Florian Fainelli, andrew@lunn.ch; +Cc: linux-netdev
In-Reply-To: <25c3b4ca-02d7-e217-fe27-ca3b7d2cbf07@gmail.com>
Thanks for responding. Now responding inline
> -----Original Message-----
> From: Florian Fainelli [mailto:f.fainelli@gmail.com]
> Sent: Friday, September 01, 2017 5:53 AM
> To: Bhadram Varka <vbhadram@nvidia.com>; andrew@lunn.ch
> Cc: linux-netdev <netdev@vger.kernel.org>
> Subject: Re: netdev carrier changes is one even after ethernet link up.
>
> On 08/30/2017 10:53 PM, Bhadram Varka wrote:
> > Hi,
> >
> >
> >
> > I have observed that carrier_changes is one even in case of the
> > ethernet link is up.
> >
> >
> >
> > After investigating the code below is my observation –
> >
> >
> >
> > ethernet_driver_probe()
> >
> > +--->phy_connect()
> >
> > | +--->phy_attach_direct()
> >
> > | +---> netif_carrier_off() : which increments
> > carrier_changes to one.
> >
> > +--->register_netdevice() : will the carrier_changes becomes zero here ?
> >
> > +--->netif_carrier_off(): not increment the carrier_changes since
> > __LINK_STATE_NOCARRIER already set.
> >
> >
> >
> > From ethernet driver open will start the PHY and trigger the
> > phy_state_machine.
> >
> > Phy_state_machine workqueue calling netif_carrier_on() once the link is
> UP.
> >
> > netif_carrier_on() increments the carrier_changes by one.
>
> If the call trace is correct, then there is at least two problems here:
>
> - phy_connect() does start the PHY machine which means that as soon as it
> detects a link state of any kind (up or down) it can call
> netif_carrier_off() respectively netif_carrier_on()
>
> - as soon as you call register_netdevice() notifiers run and other parts of the
> kernel or user-space programs can see an inconsistent link state
>
> I would suggest doing the following sequence instead:
>
> netif_carrier_off()
> register_netdevice()
> phy_connect()
>
> Which should result in a consistent link state and carrier value.
>
Yes, it will address the issue.
If we do the phy_connect() in ndo_open, carrier_changes becomes two. But if we do it in the probe function, it does not work.
In the ethernet driver probe (the sequence below is not working):
phy_connect()
register_netdevice()
netif_carrier_off()
working sequence:
In probe():
register_netdevice()
ndo_open:
phy_connect()
After reverting - https://lkml.org/lkml/2016/1/9/173 this works if we do phy_connect in probe as well.
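The difference between the two orderings can be reproduced with a small user-space model. This is only a sketch under stated assumptions: it assumes that a carrier transition is counted only when the no-carrier flag actually changes, and (an assumption about the kernel, not something confirmed in this thread) that transitions on a not-yet-registered device are not counted. All names (`struct netdev_model`, `model_carrier_off()`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the relevant net_device state. */
struct netdev_model {
	bool registered;     /* false until the device is registered */
	bool no_carrier;     /* models __LINK_STATE_NOCARRIER */
	int carrier_changes; /* models the carrier_changes counter */
};

static void model_carrier_off(struct netdev_model *d)
{
	if (d->no_carrier)
		return;               /* flag already set: nothing changes */
	d->no_carrier = true;
	if (d->registered)        /* assumption: unregistered devices do not count */
		d->carrier_changes++;
}

static void model_carrier_on(struct netdev_model *d)
{
	if (!d->no_carrier)
		return;               /* carrier already on: nothing changes */
	d->no_carrier = false;
	if (d->registered)
		d->carrier_changes++;
}

static void model_register(struct netdev_model *d)
{
	assert(!d->registered);   /* registering twice would be a bug */
	d->registered = true;
}

/* Probe does phy_connect() (carrier off), registers, then carrier off. */
static int probe_attach_sequence(void)
{
	struct netdev_model d = { 0 };

	model_carrier_off(&d);    /* from the PHY attach path */
	model_register(&d);
	model_carrier_off(&d);    /* no effect: flag already set */
	model_carrier_on(&d);     /* link comes up later */
	return d.carrier_changes;
}

/* Probe only registers; phy_connect() happens in ndo_open. */
static int open_attach_sequence(void)
{
	struct netdev_model d = { 0 };

	model_register(&d);
	model_carrier_off(&d);    /* from the PHY attach path, now counted */
	model_carrier_on(&d);     /* link comes up later */
	return d.carrier_changes;
}
```

Under these assumptions, `probe_attach_sequence()` ends with a count of one and `open_attach_sequence()` with a count of two, which matches the values observed in /sys/class/net/eth0/carrier_changes.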
Thanks,
Bhadram.
> >
> >
> >
> > After link is UP if we check the carrier_changes sysfs node - it will
> > be one only.
> >
> >
> >
> > $ cat /sys/class/net/eth0/carrier_changes
> >
> > 1
> >
> >
> >
> > After reverting the change - https://lkml.org/lkml/2016/1/9/173 (net:
> > phy: turn carrier off on phy attach) then I could see the carrier
> > changes incremented to 2 after Link UP.
> >
> > $ cat /sys/class/net/eth0/carrier_changes
> >
> > 2
> >
> >
> >
> > Thanks,
> >
> > Bhadram.
> >
> > ----------------------------------------------------------------------
> > -- This email message is for the sole use of the intended recipient(s)
> > and may contain confidential information. Any unauthorized review,
> > use, disclosure or distribution is prohibited. If you are not the
> > intended recipient, please contact the sender by reply email and
> > destroy all copies of the original message.
> > ----------------------------------------------------------------------
> > --
>
>
> --
> Florian
* RE: netdev carrier changes is one even after ethernet link up.
From: Bhadram Varka @ 2017-09-01 5:41 UTC (permalink / raw)
To: Florian Fainelli, andrew@lunn.ch; +Cc: linux-netdev
In-Reply-To: <25c3b4ca-02d7-e217-fe27-ca3b7d2cbf07@gmail.com>
Thanks for responding.
-----Original Message-----
From: Florian Fainelli [mailto:f.fainelli@gmail.com]
Sent: Friday, September 01, 2017 5:53 AM
To: Bhadram Varka <vbhadram@nvidia.com>; andrew@lunn.ch
Cc: linux-netdev <netdev@vger.kernel.org>
Subject: Re: netdev carrier changes is one even after ethernet link up.
On 08/30/2017 10:53 PM, Bhadram Varka wrote:
> Hi,
>
>
>
> I have observed that carrier_changes is one even in case of the
> ethernet link is up.
>
>
>
> After investigating the code below is my observation –
>
>
>
> ethernet_driver_probe()
>
> +--->phy_connect()
>
> | +--->phy_attach_direct()
>
> | +---> netif_carrier_off() : which increments
> carrier_changes to one.
>
> +--->register_netdevice() : will the carrier_changes becomes zero here ?
>
> +--->netif_carrier_off(): not increment the carrier_changes since
> __LINK_STATE_NOCARRIER already set.
>
>
>
> From ethernet driver open will start the PHY and trigger the
> phy_state_machine.
>
> Phy_state_machine workqueue calling netif_carrier_on() once the link is UP.
>
> netif_carrier_on() increments the carrier_changes by one.
If the call trace is correct, then there are at least two problems here:
- phy_connect() does start the PHY state machine, which means that as soon as it detects a link state of any kind (up or down) it can call netif_carrier_on() or netif_carrier_off(), respectively
- as soon as you call register_netdevice(), notifiers run and other parts of the kernel or user-space programs can see an inconsistent link state
I would suggest doing the following sequence instead:
netif_carrier_off()
register_netdevice()
phy_connect()
Which should result in a consistent link state and carrier value.
Yes, it will address the issue.
If we call phy_connect() in ndo_open, carrier_changes becomes one. But if we call it in the probe function, it does not work.
In the Ethernet driver probe (the sequence below does not work):
phy_connect()
register_netdevice()
netif_carrier_off()
working sequence:
In probe():
register_netdevice()
ndo_open:
phy_connect()
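The effect of the two orderings on the counter can be explored with a toy userspace model. This is only a sketch: the struct and the toy_* helpers are hypothetical stand-ins for the kernel objects, and it assumes that a carrier transition on a device that is not registered yet flips the link state bit but is not counted in carrier_changes. That assumption should be checked against the kernel tree in use.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not the kernel's structs. */
struct toy_netdev {
	bool registered;              /* set by toy_register() */
	bool nocarrier;               /* stands in for __LINK_STATE_NOCARRIER */
	unsigned int carrier_changes; /* what the sysfs node would report */
};

/* ASSUMPTION: transitions before registration are not counted. */
static void toy_count_transition(struct toy_netdev *dev)
{
	if (dev->registered)
		dev->carrier_changes++;
}

static void toy_carrier_off(struct toy_netdev *dev)
{
	if (dev->nocarrier)
		return; /* bit already set: no transition */
	dev->nocarrier = true;
	toy_count_transition(dev);
}

static void toy_carrier_on(struct toy_netdev *dev)
{
	if (!dev->nocarrier)
		return; /* bit already clear: no transition */
	dev->nocarrier = false;
	toy_count_transition(dev);
}

static void toy_register(struct toy_netdev *dev)
{
	assert(!dev->registered);
	dev->registered = true;
}

/* PHY attached in probe, before registration (the reported sequence). */
static unsigned int toy_attach_in_probe(void)
{
	struct toy_netdev dev = { 0 };

	toy_carrier_off(&dev); /* phy_connect() -> phy_attach_direct() */
	toy_register(&dev);
	toy_carrier_off(&dev); /* driver's own call: bit already set */
	toy_carrier_on(&dev);  /* link comes up after open */
	return dev.carrier_changes;
}

/* PHY attached from ndo_open, after registration. */
static unsigned int toy_attach_in_open(void)
{
	struct toy_netdev dev = { 0 };

	toy_register(&dev);
	toy_carrier_off(&dev); /* phy_connect() from open */
	toy_carrier_on(&dev);  /* link comes up */
	return dev.carrier_changes;
}
```

Under the stated assumption, toy_attach_in_probe() returns 1 and toy_attach_in_open() returns 2, which lines up with the two sysfs readings reported in this thread.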
Thanks,
Bhadram.
>
> After link is UP if we check the carrier_changes sysfs node - it will
> be one only.
>
> $ cat /sys/class/net/eth0/carrier_changes
> 1
>
> After reverting the change - https://lkml.org/lkml/2016/1/9/173 (net:
> phy: turn carrier off on phy attach) then I could see the carrier
> changes incremented to 2 after Link UP.
>
> $ cat /sys/class/net/eth0/carrier_changes
> 2
>
> Thanks,
> Bhadram.
>
--
Florian
^ permalink raw reply
* Re: [PATCH 13/31] timer: Remove meaningless .data/.function assignments
From: Greg Kroah-Hartman @ 2017-09-01 5:09 UTC (permalink / raw)
To: Kees Cook
Cc: Thomas Gleixner, Krzysztof Halasa, Aditya Shankar, Ganesh Krishna,
Jens Axboe, netdev, linux-wireless, devel, linux-kernel
In-Reply-To: <1504222183-61202-14-git-send-email-keescook@chromium.org>
On Thu, Aug 31, 2017 at 04:29:25PM -0700, Kees Cook wrote:
> Several timer users needlessly reset their .function/.data fields during
> their timer callback, but nothing else changes them. Some users do not
> use their .data field at all. Each instance is removed here.
>
> Cc: Krzysztof Halasa <khc@pm.waw.pl>
> Cc: Aditya Shankar <aditya.shankar@microchip.com>
> Cc: Ganesh Krishna <ganesh.krishna@microchip.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Jens Axboe <axboe@fb.com>
> Cc: netdev@vger.kernel.org
> Cc: linux-wireless@vger.kernel.org
> Cc: devel@driverdev.osuosl.org
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
> drivers/block/amiflop.c | 3 +--
> drivers/net/wan/hdlc_cisco.c | 2 --
> drivers/net/wan/hdlc_fr.c | 2 --
> drivers/staging/wilc1000/wilc_wfi_cfgoperations.c | 4 +---
> 4 files changed, 2 insertions(+), 9 deletions(-)
For the staging driver:
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
^ permalink raw reply
* Re: [PATCH v3 net-next 0/7] bpf: Add option to set mark and priority in cgroup sock programs
From: David Miller @ 2017-09-01 5:06 UTC (permalink / raw)
To: dsahern; +Cc: netdev, daniel, ast
In-Reply-To: <1504217150-16151-1-git-send-email-dsahern@gmail.com>
From: David Ahern <dsahern@gmail.com>
Date: Thu, 31 Aug 2017 15:05:43 -0700
> Add option to set mark and priority in addition to bound device for newly
> created sockets. Also, allow the bpf programs to use the get_current_uid_gid
> helper meaning socket marks, priority and device can be set based on the
> uid/gid of the running process.
>
> Sample programs are updated to demonstrate the new options.
>
> v3
> - no changes to Patches 1 and 2 which Alexei acked in previous versions
> - dropped change related to recursive programs in a cgroup
> - updated tests per dropped patch
>
> v2
> - added flag to control recursive behavior as requested by Alexei
> - added comment to sock_filter_func_proto regarding use of
> get_current_uid_gid helper
> - updated test programs for recursive option
Series applied, please follow up to Daniel's feedback.
Thanks.
^ permalink raw reply
* [PATCH] net: ethernet: ibm-emac: Add 5482 PHY init for OpenBlocks 600
From: Benjamin Herrenschmidt @ 2017-09-01 4:44 UTC (permalink / raw)
To: netdev
The vendor patches initialize those registers to get the
PHY working properly.
Sadly I don't have that PHY spec and whatever Broadcom PHY
code we already have doesn't seem to document these two shadow
registers (unless I miscalculated the address) so I'm keeping
this as "vendor magic for that board". The vendor has long
abandoned that product, but I find it handy to test ppc405
kernels and so would like to keep it alive upstream :-)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
Note: Ideally, the whole driver should switch over to the
generic PHY layer. However this is a much bigger undertaking
which requires access to a bunch of HW to test, and for which
I have neither the time nor the HW available these days.
(Some of the HW could prove hard to find ...)
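For anyone trying to identify the two shadow registers, the values written to register 0x1c can be split into fields. This is a sketch that assumes the common Broadcom layout for register 0x1c (bit 15 as write enable, bits 14:10 as the shadow selector, bits 9:0 as data). The struct and helper names are hypothetical and not part of the emac driver, and the layout should be confirmed against a datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not part of the emac driver. */
struct shadow_1c {
	unsigned int write_enable; /* bit 15 */
	unsigned int selector;     /* bits 14:10 */
	unsigned int data;         /* bits 9:0 */
};

/* ASSUMPTION: field layout as described in the text above. */
static struct shadow_1c decode_shadow_1c(uint16_t val)
{
	struct shadow_1c s;

	s.write_enable = (val >> 15) & 0x1;
	s.selector = (val >> 10) & 0x1f;
	s.data = val & 0x3ff;
	return s;
}

/* Build the value for a shadow write under the same assumed layout. */
static uint16_t encode_shadow_1c_write(unsigned int selector, unsigned int data)
{
	assert(selector <= 0x1f && data <= 0x3ff);
	return (uint16_t)(0x8000 | (selector << 10) | data);
}
```

With that layout, 0xa410 would be a write of 0x010 to shadow selector 0x09, and 0x8804 a write of 0x004 to shadow selector 0x02.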
---
drivers/net/ethernet/ibm/emac/phy.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/drivers/net/ethernet/ibm/emac/phy.c b/drivers/net/ethernet/ibm/emac/phy.c
index 35865d05fccd..daa10de542fb 100644
--- a/drivers/net/ethernet/ibm/emac/phy.c
+++ b/drivers/net/ethernet/ibm/emac/phy.c
@@ -24,6 +24,7 @@
#include <linux/mii.h>
#include <linux/ethtool.h>
#include <linux/delay.h>
+#include <linux/of.h>
#include "emac.h"
#include "phy.h"
@@ -363,6 +364,34 @@ static struct mii_phy_def bcm5248_phy_def = {
.ops = &generic_phy_ops
};
+static int bcm5482_init(struct mii_phy *phy)
+{
+ if (!of_machine_is_compatible("plathome,obs600"))
+ return 0;
+
+ /* Magic inits from vendor original patches */
+ phy_write(phy, 0x1c, 0xa410);
+ phy_write(phy, 0x1c, 0x8804);
+
+ return 0;
+}
+
+static const struct mii_phy_ops bcm5482_phy_ops = {
+ .init = bcm5482_init,
+ .setup_aneg = genmii_setup_aneg,
+ .setup_forced = genmii_setup_forced,
+ .poll_link = genmii_poll_link,
+ .read_link = genmii_read_link
+};
+
+static struct mii_phy_def bcm5482_phy_def = {
+
+ .phy_id = 0x0143bcb0,
+ .phy_id_mask = 0x0ffffff0,
+ .name = "BCM5482 Gigabit Ethernet",
+ .ops = &bcm5482_phy_ops
+};
+
static int m88e1111_init(struct mii_phy *phy)
{
pr_debug("%s: Marvell 88E1111 Ethernet\n", __func__);
@@ -499,6 +528,7 @@ static struct mii_phy_def *mii_phy_table[] = {
&et1011c_phy_def,
&cis8201_phy_def,
&bcm5248_phy_def,
+ &bcm5482_phy_def,
&m88e1111_phy_def,
&m88e1112_phy_def,
&ar8035_phy_def,
--
2.13.5
^ permalink raw reply related
* Re: [RFC net-next 0/8] net: dsa: Multi-queue awareness
From: Florian Fainelli @ 2017-09-01 4:10 UTC (permalink / raw)
To: Andrew Lunn, jiri, jhs; +Cc: netdev, davem, xiyou.wangcong, vivien.didelot
In-Reply-To: <20170901000502.GB28960@lunn.ch>
On 08/31/2017 05:05 PM, Andrew Lunn wrote:
> On Wed, Aug 30, 2017 at 05:18:44PM -0700, Florian Fainelli wrote:
>> This patch series is sent as reference, especially because the last patch
>> is trying not to be creating too many layer violations, but clearly there
>> are a little bit being created here anyways.
>>
>> Essentially what I am trying to achieve is that you have a stacked device which
>> is multi-queue aware, that applications will be using, and for which they can
>> control the queue selection (using mq) the way they want. One of these stacked
>> network devices is created for each port of the switch (this is what DSA
>> does). When a skb is submitted from say net_device X, we can derive its port
>> number and look at the queue_mapping value to determine which port of the
>> switch and queue we should be sending this to. The information is embedded in a
>> tag (4 bytes) and is used by the switch to steer the transmission.
>>
>> These stacked devices will actually transmit using a "master" or conduit
>> network device which has a number of queues as well. In one version of the
>> hardware that I work with, we have up to 4 ports, each with 8 queues, and the
>> master device has a total of 32 hardware queues, so a 1:1 mapping is easy. With
>> another version of the hardware, same number of ports and queues, but only 16
>> hardware queues, so only a 2:1 mapping is possible.
>>
>> In order for congestion information to work properly, I need to establish a
>> mapping, preferably before transmission starts (but reconfiguration while
>> interfaces are running would be possible too) between these stacked device's
>> queue and the conduit interface's queue.
>>
>> Comments, flames, rotten tomatoes, anything!
>
> Right, i think i understand.
>
> This works just for traffic between the host and ports. The host can
> set the egress queue. And i assume the queues are priorities, either
> absolute or weighted round robin, etc.
>
> But this has no effect on traffic going from port to port. At some
> point, i expect you will want to offload TC for that.
You are absolutely right, this patch series aims at having the host be
able to steer traffic towards particular switch port egress queues which
are configured with specific priorities. At the moment it really is
mapping one priority value (in the 802.1p sense) to one queue number and
letting the switch scheduler figure things out.
With this patch set you can now use the multiq filter of tc and do
exactly what is documented under Documentation/networking/multiqueue.txt
and get the desired matches to be steered towards the queue you defined.
>
> How will the two interact? Could the TC rules also act on traffic from
> the host to a port? Would it be simpler in the long run to just
> implement TC rules?
I suppose that you could somehow use TC to influence how the traffic
from host to CPU works, but without a "CPU" port representor the
question is how do we get that done? If we used "eth0" we need to
callback into the switch driver for programming..
Regarding the last patch in this series, what I would ideally like to replace
it with is something along the lines of:
tc bind dev sw0p0 queue 0 dev eth0 queue 16
I am not sure if this is an action, or a filter, or something else...
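The port and queue arithmetic from the cover letter quoted above can be sketched as follows. This is an illustration only: the function name, the linear numbering, and the idea of folding neighbouring queues together in the 2:1 case are assumptions, not what the patch series implements.

```c
#include <assert.h>

/*
 * Hypothetical mapping from (switch port, port queue) to a conduit queue.
 * With 4 ports x 8 queues and 32 conduit queues this is 1:1; with 16
 * conduit queues, two port queues share one conduit queue (2:1).
 */
static unsigned int toy_conduit_queue(unsigned int port, unsigned int queue,
				      unsigned int queues_per_port,
				      unsigned int num_ports,
				      unsigned int conduit_queues)
{
	unsigned int total = num_ports * queues_per_port;
	unsigned int ratio;

	assert(port < num_ports && queue < queues_per_port);
	assert(conduit_queues > 0 && total % conduit_queues == 0);

	ratio = total / conduit_queues; /* 1 for 32 queues, 2 for 16 */
	return (port * queues_per_port + queue) / ratio;
}
```

A fixed formula like this keeps congestion feedback simple, because each conduit queue backs a known set of port queues. A configurable binding, as in the tc example above, would replace the formula with a lookup table.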
--
Florian
^ permalink raw reply
* Re: [RFC net-next 1/8] net: dsa: Allow switch drivers to indicate number of RX/TX queues
From: Florian Fainelli @ 2017-09-01 4:00 UTC (permalink / raw)
To: Andrew Lunn; +Cc: netdev, jiri, jhs, davem, xiyou.wangcong, vivien.didelot
In-Reply-To: <20170831234412.GA28960@lunn.ch>
On 08/31/2017 04:44 PM, Andrew Lunn wrote:
> On Wed, Aug 30, 2017 at 05:18:45PM -0700, Florian Fainelli wrote:
>> Let switch drivers indicate how many RX and TX queues they support. Some
>> switches, such as Broadcom Starfighter 2 are designed with 8 egress
>> queues.
>
> Marvell switches also have egress queues.
>
> Does the SF2 have ingress queues? Marvell doesn't as far as I know. So
> I wonder if num_rx_queues is useful?
At the moment probably not, since we are not doing anything useful other
than creating the network devices with the indicated number of queues.
>
> Do switches in general have ingress queues?
They do, at least the Starfighter 2 has, and from the Broadcom tag you
can get such information (BRCM_EG_TC_SHIFT) and you could presumably
record that queue on the SKB. I don't have a use case for that (yet?).
--
Florian
^ permalink raw reply
* [RFC] tools: selftests: psock_tpacket: skip un-supported tpacket_v3 test
From: Orson Zhai @ 2017-09-01 3:53 UTC (permalink / raw)
To: David S . Miller, Shuah Khan
Cc: netdev, linux-kselftest, linux-kernel, Orson Zhai
The TPACKET_V3 test of PACKET_TX_RING will fail on kernel versions
lower than v4.11. Support for the TX ring was added by commit
<7f953ab2ba46: af_packet: TX_RING support for TPACKET_V3> on Jan. 3,
2017.
So skip this test item instead of reporting a failure on old kernels.
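The skip decision described above can be sketched as a small classifier. The enum and function names are hypothetical and not part of psock_tpacket.c, and treating EINVAL as "ring type not supported" is the same assumption this patch makes.

```c
#include <assert.h>
#include <errno.h>

enum ring_setup_result {
	RING_SETUP_OK,
	RING_SETUP_SKIP, /* feature missing: report [SKIP], keep going */
	RING_SETUP_FAIL, /* real error: abort the test */
};

/*
 * Hypothetical helper: classify the outcome of the ring setsockopt() call.
 * ret and err are the return value and the errno seen right after it.
 */
static enum ring_setup_result classify_ring_setup(int ret, int err)
{
	if (ret != -1)
		return RING_SETUP_OK;
	/* ASSUMPTION: EINVAL means the kernel lacks this ring type. */
	if (err == EINVAL)
		return RING_SETUP_SKIP;
	return RING_SETUP_FAIL;
}
```

Passing errno in as a saved value, captured before any call such as perror(), avoids depending on that call leaving errno untouched.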
Signed-off-by: Orson Zhai <orson.zhai@linaro.org>
---
tools/testing/selftests/net/psock_tpacket.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/net/psock_tpacket.c b/tools/testing/selftests/net/psock_tpacket.c
index 7f6cd9fdacf3..f0cfc18c3726 100644
--- a/tools/testing/selftests/net/psock_tpacket.c
+++ b/tools/testing/selftests/net/psock_tpacket.c
@@ -57,6 +57,7 @@
#include <net/if.h>
#include <inttypes.h>
#include <poll.h>
+#include <errno.h>
#include "psock_lib.h"
@@ -676,7 +677,7 @@ static void __v3_fill(struct ring *ring, unsigned int blocks, int type)
ring->flen = ring->req3.tp_block_size;
}
-static void setup_ring(int sock, struct ring *ring, int version, int type)
+static int setup_ring(int sock, struct ring *ring, int version, int type)
{
int ret = 0;
unsigned int blocks = 256;
@@ -703,7 +704,11 @@ static void setup_ring(int sock, struct ring *ring, int version, int type)
if (ret == -1) {
perror("setsockopt");
- exit(1);
+ if (errno == EINVAL) {
+ printf("[SKIP] This type seems un-supported in current kernel, skipped.\n");
+ return -1;
+ } else
+ exit(1);
}
ring->rd_len = ring->rd_num * sizeof(*ring->rd);
@@ -715,6 +720,7 @@ static void setup_ring(int sock, struct ring *ring, int version, int type)
total_packets = 0;
total_bytes = 0;
+ return 0;
}
static void mmap_ring(int sock, struct ring *ring)
@@ -830,7 +836,12 @@ static int test_tpacket(int version, int type)
sock = pfsocket(version);
memset(&ring, 0, sizeof(ring));
- setup_ring(sock, &ring, version, type);
+ if (setup_ring(sock, &ring, version, type)) {
+ /* skip test when error of invalid argument */
+ close(sock);
+ return 0;
+ }
+
mmap_ring(sock, &ring);
bind_ring(sock, &ring);
walk_ring(sock, &ring);
--
2.12.2
^ permalink raw reply related
* Re: virtio_net: ethtool supported link modes
From: Jason Wang @ 2017-09-01 3:36 UTC (permalink / raw)
To: Radu Rendec, virtualization, netdev, linux-kernel; +Cc: Michael S. Tsirkin
In-Reply-To: <1504199044.22080.11.camel@arista.com>
On 2017年09月01日 01:04, Radu Rendec wrote:
> Hello,
>
> Looking at the code in virtnet_set_link_ksettings, it seems the speed
> and duplex can be set to any valid value. The driver will "remember"
> them and report them back in virtnet_get_link_ksettings.
>
> However, the supported link modes (link_modes.supported in struct
> ethtool_link_ksettings) is always 0, indicating that no speed/duplex
> setting is supported.
>
> Does it make more sense to set (at least a few of) the supported link
> modes, such as 10baseT_Half ... 10000baseT_Full?
>
> I would expect to see consistency between what is reported in
> link_modes.supported and what can actually be set. Could you please
> share your opinion on this?
I think that may make sense only if there's a hardware implementation for
virtio. And we probably need to extend virtio spec for adding new commands.
Thanks
>
> Thank you,
> Radu Rendec
>
^ permalink raw reply
* Re: [PATCH net-next] doc: document MSG_ZEROCOPY
From: Willem de Bruijn @ 2017-09-01 3:31 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: Network Development, David Miller, Willem de Bruijn
In-Reply-To: <20170901031009.2p4vv6fncma2y2l7@ast-mbp>
On Thu, Aug 31, 2017 at 11:10 PM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Thu, Aug 31, 2017 at 11:04:41PM -0400, Willem de Bruijn wrote:
>> On Thu, Aug 31, 2017 at 10:10 PM, Alexei Starovoitov
>> <alexei.starovoitov@gmail.com> wrote:
>> > On Thu, Aug 31, 2017 at 05:00:13PM -0400, Willem de Bruijn wrote:
>> >> From: Willem de Bruijn <willemb@google.com>
>> >>
>> >> Documentation for this feature was missing from the patchset.
>> >> Copied a lot from the netdev 2.1 paper, addressing some small
>> >> interface changes since then.
>> >>
>> >> Signed-off-by: Willem de Bruijn <willemb@google.com>
>> > ...
>> >> +Notification Batching
>> >> +~~~~~~~~~~~~~~~~~~~~~
>> >> +
>> >> +Multiple outstanding packets can be read at once using the recvmmsg
>> >> +call. This is often not needed. In each message the kernel returns not
>> >> +a single value, but a range. It coalesces consecutive notifications
>> >> +while one is outstanding for reception on the error queue.
>> >> +
>> >> +When a new notification is about to be queued, it checks whether the
>> >> +new value extends the range of the notification at the tail of the
>> >> +queue. If so, it drops the new notification packet and instead increases
>> >> +the range upper value of the outstanding notification.
>> >
>> > Would it make sense to mention that max notification range is 32-bit?
>> > So each 4Gbyte of xmit bytes there will be a notification.
>> > In modern 40Gbps NICs it's not a lot. Means that there will be
>> > at least one notification every second.
>> > Or I misread the code?
>>
>> You're right. The doc does mention that the counter and range
>> are 32-bit. I can state more explicitly that that bounds the working
>> set size to 4GB. Do you expect this to be problematic? Processing
>> a single notification per 4GB of data should not be a significant
>> cost in itself.
>
> I think 4GB is fine. Just there was an idea that in cases when
> notification of transmission can be known by other means
Some kind of unspoofable response from the peer (i.e., not just
a tcp ack), or a kernel mechanism independent from the error
queue? The first does not guarantee that a retransmit is
not in progress.
> the user space
> could have skipped reading errqueue completely, but looks like it
> still needs to poll.
If a process has no need to see the notification, say because
it is sending out a buffer that is constant for the process lifetime,
then it could conceivably skip the recv, and poll with it. The code
as written will not coalesce more than 4GB of data, but that could
be revised.
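The coalescing rule from the quoted documentation can be sketched as below. This is a userspace illustration, not the kernel's code: the struct, the function name, and the exact form of the 32-bit bound are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the notification at the tail of the error queue. */
struct toy_notif {
	uint32_t lo; /* first value covered by the range */
	uint32_t hi; /* last value covered by the range, inclusive */
};

/*
 * Try to fold the new range [lo, lo + len - 1] into the tail notification.
 * Returns true if it was merged, false if a new notification is needed.
 */
static bool toy_notif_extend(struct toy_notif *tail, uint32_t lo, uint32_t len)
{
	uint64_t span;

	assert(len > 0);

	if (lo != tail->hi + 1)
		return false; /* not contiguous with the tail */

	/* ASSUMPTION: the merged range must stay below 2^32 entries. */
	span = (uint64_t)(tail->hi - tail->lo) + 1 + len;
	if (span >= (1ULL << 32))
		return false;

	tail->hi += len;
	return true;
}
```

As a rough check on the figures discussed above, 2^32 bytes at 40 Gbit/s is about 0.86 seconds of traffic, so a bound of that size would mean roughly one notification per second at line rate.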
^ permalink raw reply
* Re: [PATCH v2 net-next 1/8] bpf: Add support for recursively running cgroup sock filters
From: Alexei Starovoitov @ 2017-09-01 3:27 UTC (permalink / raw)
To: Tejun Heo; +Cc: David Ahern, netdev, daniel, ast, davem
In-Reply-To: <20170831142201.GB1599492@devbig577.frc2.facebook.com>
On Thu, Aug 31, 2017 at 07:22:01AM -0700, Tejun Heo wrote:
> Hello, David, Alexei.
>
> Sorry about late reply.
>
> On Sun, Aug 27, 2017 at 08:49:23AM -0600, David Ahern wrote:
> > On 8/25/17 8:49 PM, Alexei Starovoitov wrote:
> > >
> > >> + if (prog && curr_recursive && !new_recursive)
> > >> + /* if a parent has recursive prog attached, only
> > >> + * allow recursive programs in descendent cgroup
> > >> + */
> > >> + return -EINVAL;
> > >> +
> > >> old_prog = cgrp->bpf.prog[type];
> > >
> > > ... I'm struggling to completely understand how it interacts
> > > with BPF_F_ALLOW_OVERRIDE.
> >
> > The 2 flags are completely independent. The existing override logic is
> > unchanged. If a program can not be overridden, then the new recursive
> > flag is irrelevant.
>
> I'm not sure all four combo of the two flags makes sense. Can't we
> have something simpler like the following?
>
> 1. None: No further bpf programs allowed in the subtree.
>
> 2. Overridable: If a sub-cgroup installs the same bpf program, this
> one yields to that one.
>
> 3. Recursive: If a sub-cgroup installs the same bpf program, that
> cgroup program gets run in addition to this one.
>
> Note that we can have combinations of overridables and recursives -
> both allow further programs in the sub-hierarchy and the only
> distinction is how that specific program behaves when that
> happens.
If I understand the proposal correctly in case of:
A (with recurse) -> B (with override) -> C (with recurse) -> D (with override)
when something happens in D, you propose to run D, C, A?
With the order of execution from children to parent?
That's a bit different than what I was proposing with the 'multi-prog' flag,
but the more I think about it the more I like it.
In your case detach is sort of transparent to everything around.
And you would also allow to say 'None' to one of the subtrees too, right?
So something like:
A (with recurse) -> B (with override) -> C (with recurse) -> D (None) -> E
would mean that E cannot attach anything and events in E will
call D->C->A, right?
I will work on a patch for the above and see how it looks.
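One reading of the rules discussed above is that the nearest attached program always runs and, above it, only recursive programs run. The sketch below encodes that reading. The enum, the function name, and the leaf-to-root array layout are all hypothetical, and this is not the eventual patch.

```c
#include <assert.h>
#include <stddef.h>

enum toy_mode {
	TOY_NO_PROG,  /* nothing attached at this cgroup */
	TOY_NONE,     /* attached, no further programs allowed below */
	TOY_OVERRIDE, /* attached, yields to a program attached below */
	TOY_RECURSE,  /* attached, runs in addition to programs below */
};

/*
 * path[0] is the cgroup where the event happens, path[n - 1] is the root.
 * Writes the indices of the cgroups whose programs would run, in
 * child-to-parent order, into out[] and returns how many were written.
 */
static size_t toy_effective_progs(const enum toy_mode *path, size_t n,
				  size_t *out)
{
	size_t count = 0;
	size_t i;

	assert(path != NULL && out != NULL);

	for (i = 0; i < n; i++) {
		if (path[i] == TOY_NO_PROG)
			continue;
		/* Nearest program always runs; above it, only recursive ones. */
		if (count == 0 || path[i] == TOY_RECURSE)
			out[count++] = i;
	}
	return count;
}
```

For the first chain above, with the path given as D, C, B, A, this selects D, C and A. For the second chain, with the path given as E, D, C, B, A, it also selects D, C and A.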
^ permalink raw reply
* Re: [PATCH net-next] virtio-net: invoke zerocopy callback on xmit path if no tx napi
From: Jason Wang @ 2017-09-01 3:25 UTC (permalink / raw)
To: Willem de Bruijn, Michael S. Tsirkin
Cc: Koichiro Den, virtualization, Network Development
In-Reply-To: <CAF=yD-+AjQLLUKdvnrwd2tqFtw4Hm81cR7WUJd65oLnziNGM8A@mail.gmail.com>
On 2017年08月31日 22:30, Willem de Bruijn wrote:
>> Incomplete results at this stage, but I do see this correlation between
>> flows. It occurs even while not running out of zerocopy descriptors,
>> which I cannot yet explain.
>>
>> Running two threads in a guest, each with a udp socket, each
>> sending up to 100 datagrams, or until EAGAIN, every msec.
>>
>> Sender A sends 1B datagrams.
>> Sender B sends VHOST_GOODCOPY_LEN, which is enough
>> to trigger zcopy_used in vhost net.
>>
>> A local receive process on the host receives both flows. To avoid
>> a deep copy when looping the packet onto the receive path,
>> changed skb_orphan_frags_rx to always return false (gross hack).
>>
>> The flow with the larger packets is redirected through netem on ifb0:
>>
>> modprobe ifb
>> ip link set dev ifb0 up
>> tc qdisc add dev ifb0 root netem limit $LIMIT rate 1MBit
>>
>> tc qdisc add dev tap0 ingress
>> tc filter add dev tap0 parent ffff: protocol ip \
>> u32 match ip dport 8000 0xffff \
>> action mirred egress redirect dev ifb0
>>
>> For 10 second run, packet count with various ifb0 queue lengths $LIMIT:
>>
>> no filter
>> rx.A: ~840,000
>> rx.B: ~840,000
>>
>> limit 1
>> rx.A: ~500,000
>> rx.B: ~3100
>> ifb0: 3273 sent, 371141 dropped
>>
>> limit 100
>> rx.A: ~9000
>> rx.B: ~4200
>> ifb0: 4630 sent, 1491 dropped
>>
>> limit 1000
>> rx.A: ~6800
>> rx.B: ~4200
>> ifb0: 4651 sent, 0 dropped
>>
>> Sender B is always correctly rate limited to 1 MBps or less. With a
>> short queue, it ends up dropping a lot and sending even less.
>>
>> When a queue builds up for sender B, sender A throughput is strongly
>> correlated with queue length. With queue length 1, it can send almost
>> at unthrottled speed. But even at limit 100 its throughput is on the
>> same order as sender B.
>>
>> What is surprising to me is that this happens even though the number
>> of ubuf_info in use at limit 100 is around 100 at all times. In other words,
>> it does not exhaust the pool.
>>
>> When forcing zcopy_used to be false for all packets, this effect of
>> sender A throughput being correlated with sender B does not happen.
>>
>> no filter
>> rx.A: ~850,000
>> rx.B: ~850,000
>>
>> limit 100
>> rx.A: ~850,000
>> rx.B: ~4200
>> ifb0: 4518 sent, 876182 dropped
>>
>> Also relevant is that with zerocopy, the sender processes back off
>> and report the same count as the receiver. Without zerocopy,
>> both senders send at full speed, even if only 4200 packets from flow
>> B arrive at the receiver.
>>
>> This is with the default virtio_net driver, so without napi-tx.
>>
>> It appears that the zerocopy notifications are pausing the guest.
>> Will look at that now.
> It was indeed as simple as that. With 256 descriptors, queuing even
> a hundred or so packets causes the guest to stall the device as soon
> as the qdisc is installed.
>
> Adding this check
>
> + in_use = nvq->upend_idx - nvq->done_idx;
> + if (nvq->upend_idx < nvq->done_idx)
> + in_use += UIO_MAXIOV;
> +
> + if (in_use > (vq->num >> 2))
> + zcopy_used = false;
>
> Has the desired behavior of reverting zerocopy requests to copying.
>
> Without this change, the result is, as previously reported, throughput
> dropping to hundreds of packets per second on both flows.
>
> With the change, pps as observed for a few seconds at handle_tx is
>
> zerocopy=165 copy=168435
> zerocopy=0 copy=168500
> zerocopy=65 copy=168535
>
> Both flows continue to send at more or less normal rate, with only
> sender B observing massive drops at the netem.
>
> With the queue removed the rate reverts to
>
> zerocopy=58878 copy=110239
> zerocopy=58833 copy=110207
>
> This is not a 50/50 split, which implies that some packets from the large
> packet flow are still converted to copying. Without the change the rate
> without queue was 80k zerocopy vs 80k copy, so this choice of
> (vq->num >> 2) appears too conservative.
>
> However, testing with (vq->num >> 1) was not as effective at mitigating
> stalls. I did not save that data, unfortunately. Can run more tests on fine
> tuning this variable, if the idea sounds good.
Looks like there are still two cases left:
1) sndbuf is not INT_MAX
2) tx napi is used for virtio-net
1) could be a corner case, and for 2) what you suggest here may not
solve the issue since it still does in-order completion.
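The in-flight check quoted above can be modelled in userspace as follows. This is a sketch: TOY_MAXIOV stands in for UIO_MAXIOV with an assumed value of 1024, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAXIOV 1024 /* ASSUMPTION: stands in for UIO_MAXIOV */

/* Number of zerocopy buffers still in flight, allowing for index wraparound. */
static int toy_in_use(int upend_idx, int done_idx)
{
	int in_use = upend_idx - done_idx;

	assert(upend_idx >= 0 && upend_idx < TOY_MAXIOV);
	assert(done_idx >= 0 && done_idx < TOY_MAXIOV);

	if (upend_idx < done_idx)
		in_use += TOY_MAXIOV;
	return in_use;
}

/* Fall back to copying once more than a quarter of the ring is in flight. */
static bool toy_allow_zerocopy(int upend_idx, int done_idx, int ring_size)
{
	return toy_in_use(upend_idx, done_idx) <= (ring_size >> 2);
}
```

With a 256-entry ring the threshold is 64 buffers, so the roughly 100 buffers reported in flight at limit 100 would already trigger the fallback to copying.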
Thanks
^ permalink raw reply