Netdev List
 help / color / mirror / Atom feed
* Re: [Intel-wired-lan] [PATCH 1/2] e1000e: Don't return uninitialized stats
From: Benjamin Poirier @ 2017-04-24 19:01 UTC (permalink / raw)
  To: Paul Menzel; +Cc: Jeff Kirsher, netdev, intel-wired-lan, Stefan Priebe
In-Reply-To: <71d8032d-5ed1-8242-83e6-16ce65718966@molgen.mpg.de>

On 2017/04/24 10:23, Paul Menzel wrote:
> Dear Benjamin,
> 
> 
> Thank you for your fix.
> 
> On 04/21/17 23:20, Benjamin Poirier wrote:
> > Some statistics passed to ethtool are garbage because e1000e_get_stats64()
> > doesn't write them, for example: tx_heartbeat_errors. This leaks kernel
> > memory to userspace and confuses users.
> 
> Could you please give specific examples to reproduce the issue? That way
> your fix can also be tested.
> 

Some fields in e1000_get_ethtool_stats()'s net_stats are not initialized
by e1000e_get_stats64(). The structure is allocated on the stack,
therefore, the value of those fields depends on previous stack content;
that in turns depends on kernel version, compiler and previous execution
path. I've tried on 8 machines with different kernel versions and it
reproduced on 3.

root@linux-zxe0:/usr/local/src/linux# git log -n1 --oneline
fc1f8f4f310a net: ipv6: send unsolicited NA if enabled for all interfaces
root@linux-zxe0:/usr/local/src/linux# ethtool -i eth0
driver: e1000e
[...]
root@linux-zxe0:/usr/local/src/linux# ethtool -S eth0
NIC statistics:
     rx_packets: 217
     tx_packets: 153
     rx_bytes: 23091
     tx_bytes: 20533
     rx_broadcast: 0
     tx_broadcast: 6
     rx_multicast: 0
     tx_multicast: 10
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 18446683600612146192
     multicast: 0
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 70364470214850
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 0
     rx_missed_errors: 0
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 18446744072101618112
     tx_heartbeat_errors: 18446612150964469760
[...]

(gdb) p /x 18446683600612146192
$1 = 0xffffc9000282bc10
(gdb) p /x 18446744072101618112
$2 = 0xffffffffa028e1c0
(gdb) p /x 18446612150964469760
$3 = 0xffff880457a44000
... a bunch of kernel addresses

Inserting a dummy memset is a reliable way to show the issue:

--- a/drivers/net/ethernet/intel/e1000e/ethtool.c
+++ b/drivers/net/ethernet/intel/e1000e/ethtool.c
@@ -2061,6 +2061,8 @@ static void e1000_get_ethtool_stats(struct net_device *netdev,
        int i;
        char *p = NULL;

+       memset(&net_stats, 0xff, sizeof(net_stats));
+
        pm_runtime_get_sync(netdev->dev.parent);

        e1000e_get_stats64(netdev, &net_stats);

root@linux-zxe0:/usr/local/src/linux# ethtool -S eth0
NIC statistics:
     rx_packets: 30
     tx_packets: 29
     rx_bytes: 2924
     tx_bytes: 3012
     rx_broadcast: 0
     tx_broadcast: 6
     rx_multicast: 0
     tx_multicast: 7
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 18446744073709551615
     multicast: 0
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 18446744073709551615
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 0
     rx_missed_errors: 0
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 18446744073709551615
     tx_heartbeat_errors: 18446744073709551615
[...]

(gdb) p /x 18446744073709551615
$1 = 0xffffffffffffffff

^ permalink raw reply

* Re: [PATCH] net: bridge: suppress broadcast when multicast flood is disabled
From: Stephen Hemminger @ 2017-04-24 19:00 UTC (permalink / raw)
  To: Mike Manning; +Cc: netdev, Nikolay Aleksandrov
In-Reply-To: <1493042944-4005-1-git-send-email-mmanning@brocade.com>

On Mon, 24 Apr 2017 15:09:04 +0100
Mike Manning <mmanning@brocade.com> wrote:

> Flood suppression for packets that are not unicast needs to be handled
> consistently by also not flooding broadcast packets. As broadcast is a
> special case of multicast, the same kernel parameter should be used to
> suppress flooding for both of these packet types.
> 
> Fixes: b6cb5ac8331b ("net: bridge: add per-port multicast flood flag")
> Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> Signed-off-by: Mike Manning <mmanning@brocade.com>
> ---

Makes sense.

Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>

^ permalink raw reply

* Re: [PATCH v2] brcmfmac: Make skb header writable before use
From: Arend Van Spriel @ 2017-04-24 18:59 UTC (permalink / raw)
  To: Eric Dumazet, James Hughes
  Cc: Franky Lin, Hante Meuleman, Kalle Valo,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	brcm80211-dev-list.pdl-dY08KVG/lbpWk0Htik3J/w,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1493057382.6453.39.camel-XN9IlZ5yJG9HTL0Zs8A6p+yfmBU6pStAUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>



On 24-4-2017 20:09, Eric Dumazet wrote:
> On Mon, 2017-04-24 at 14:03 +0100, James Hughes wrote:
>> The driver was making changes to the skb_header without
>> ensuring it was writable (i.e. uncloned).
>> This patch also removes some boiler plate header size
>> checking/adjustment code as that is also handled by the
>> skb_cow_header function used to make header writable.
>>
>> This patch depends on
>> brcmfmac: Ensure pointer correctly set if skb data location changes
>>
>> Signed-off-by: James Hughes <james.hughes-FnsA7b+Nu9XbIbC87yuRow@public.gmane.org>
>> ---
>> Changes in v2
>>   Makes the _cow_ call at the entry point of the skb in to the
>>   stack, means only needs to be done once, and error handling
>>   is easier.
>>   Split a separate minor bug fix off to a separate patch (which
>>   this patch depends on)
> 
> Best way to handle dependencies is to send a patch series.
> 
> 1/2 brcmfmac: Ensure pointer correctly set if skb data location changes
> 2/2 brcmfmac: Make skb header writable before use

There is actually no real dependency. Both are valid fixes of course but
either one applies to wireless-drivers-next/master on its own.

Regards,
Arend

^ permalink raw reply

* Re: [iproute PATCH] man: ip-rule.8: Further clarify how to interpret priority value
From: Stephen Hemminger @ 2017-04-24 18:44 UTC (permalink / raw)
  To: Phil Sutter; +Cc: netdev
In-Reply-To: <20170424153537.15072-1-phil@nwl.cc>

On Mon, 24 Apr 2017 17:35:37 +0200
Phil Sutter <phil@nwl.cc> wrote:

> Despite the past changes, users seemed to get confused by the seemingly
> contradictory relation of priority value and actual rule priority.
> 
> Signed-off-by: Phil Sutter <phil@nwl.cc>

Makes sense. Applied.

^ permalink raw reply

* Re: PROBLEM: IPVS incorrectly reverse-NATs traffic to LVS host
From: Julian Anastasov @ 2017-04-24 18:42 UTC (permalink / raw)
  To: Nick Moriarty; +Cc: Wensong Zhang, Simon Horman, netdev, lvs-devel
In-Reply-To: <CAEo=OUmQb0FGQ2Lm+H4RNKTZD7hVKBJ+LTe2Sjy+BgW84yJ_ug@mail.gmail.com>


	Hello,

On Mon, 24 Apr 2017, Nick Moriarty wrote:

> Many thanks for your prompt fix!  I've now tested this patch against
> the 4.11.0-rc8 kernel on Ubuntu, and I can confirm that my check
> script is no longer seeing incorrect addresses in its responses.
> 
> Could you please keep me posted as this is merged?

	Sure. Thanks for the confirmation! I'll do some
tests and will post official patch in few days.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [PATCH net-next v3 0/5] nfp: DMA flags, adjust head and fixes
From: David Miller @ 2017-04-24 18:36 UTC (permalink / raw)
  To: jakub.kicinski; +Cc: netdev, kubakici, oss-drivers
In-Reply-To: <20170423031756.94429-1-jakub.kicinski@netronome.com>

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Sat, 22 Apr 2017 20:17:51 -0700

> This series takes advantage of Alex's DMA_ATTR_SKIP_CPU_SYNC to make 
> XDP packet modifications "correct" from DMA API point of view.  It 
> also allows us to parse the metadata before we run XDP at no additional
> DMA sync cost.  That way we can get rid of the metadata memcpy, and 
> remove the last upstream user of bpf_prog->xdp_adjust_head.
> 
> David's patch adds a way to read capabilities from the management
> firmware.
> 
> There are also two net-next fixes.  Patch 4 which fixes what seems to
> be a result of a botched rebase on my part.  Patch 5 corrects locking
> when state of ethernet ports is being refreshed.
> 
> ---
> v3: move the sync from alloc func to the actual give to hw func
> v2: sync rx buffers before giving them to the card (Alex)

Series applied, thanks Jakub.

^ permalink raw reply

* Re: [PATCH net-next 0/2] flower: add MPLS matching support
From: David Miller @ 2017-04-24 18:32 UTC (permalink / raw)
  To: benjamin.lahaise; +Cc: netdev, bcrl
In-Reply-To: <1492894367-11637-1-git-send-email-benjamin.lahaise@netronome.com>

From: Benjamin LaHaise <benjamin.lahaise@netronome.com>
Date: Sat, 22 Apr 2017 16:52:45 -0400

> From: Benjamin LaHaise <bcrl@kvack.org>
> 
> This patch series adds support for parsing MPLS flows in the flow dissector
> and the flower classifier.  Each of the MPLS TTL, BOS, TC and Label fields
> can be used for matching.
> 
> v2: incorporate style feedback, move #defines to linux/include/mpls.h
> Note: this omits Jiri's request to remove tabs between the type and 
> field names in struct declarations.  This would be inconsistent with 
> numerous other struct definitions.

Series applied, but in the future:

1) Put the "v2", "v3", whatever in the inital [PATCH xxx] bracketed
   part of the subject line, otherwise it ends up in the GIT commit
   message header line and that's not desired.

2) Please cut it with these double signoffs, and add an appropriate
   entry to the email aliases file.

Thanks.

^ permalink raw reply

* Re: [PATCH net-next 0/3] net/tcp_fastopen: Fix for various TFO firewall issues
From: David Miller @ 2017-04-24 18:27 UTC (permalink / raw)
  To: weiwan; +Cc: netdev, ycheng, edumazet
In-Reply-To: <20170420214548.23666-1-tracywwnj@gmail.com>

From: Wei Wang <weiwan@google.com>
Date: Thu, 20 Apr 2017 14:45:45 -0700

> Currently there are still some firewall issues in the middlebox
> which make the middlebox drop packets silently for TFO sockets.
> This kind of issue is hard to be detected by the end client.
> 
> This patch series tries to detect such issues in the kernel and disable
> TFO temporarily.
> More details about the issues and the fixes are included in the following
> patches.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
From: David Miller @ 2017-04-24 18:24 UTC (permalink / raw)
  To: jslaby
  Cc: alexei.starovoitov, mingo, mingo, tglx, hpa, x86, jpoimboe,
	linux-kernel, netdev, daniel, edumazet
In-Reply-To: <26f5afd7-d70b-effa-e528-8d0a2b95717a@suse.cz>

From: Jiri Slaby <jslaby@suse.cz>
Date: Mon, 24 Apr 2017 19:51:54 +0200

> For example what's the point of making the sk_load_word_positive_offset
> label a global, callable function? Note that this is exactly the reason
> why this particular two hunks look weird to you even though the
> annotations only mechanically paraphrase what is in the current code.

So that it can be referenced by the eBPF JIT, because these are
helpers for eBPF JIT generated code.  Every architecture implementing
an eBPF JIT has this "mess".

You can't even put a tracepoint or kprobe on these things and expect
to see "arguments" or "return PC" values in the usual spots.  This
code has special calling conventions and register usage as Alexei
explained.

I would suggest that you read and understand how this assembler is
designed, how it is called from the generated JIT code, and what it's
semantics and register usage are, before trying to annotating it.

Thank you.

^ permalink raw reply

* Re: cpsw regression in mainline with "cpsw/netcp: cpts depends on posix_timers"
From: Tony Lindgren @ 2017-04-24 18:23 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Grygorii Strashko, Networking, linux-omap
In-Reply-To: <CAK8P3a2ybR29txotehP5uKtstucxW_ZYZXArMroSEbmV=Y--7A@mail.gmail.com>

* Arnd Bergmann <arnd@arndb.de> [170424 11:14]:
> On Mon, Apr 24, 2017 at 7:44 PM, Tony Lindgren <tony@atomide.com> wrote:
> > * Arnd Bergmann <arnd@arndb.de> [170424 10:38]:
> >> On Mon, Apr 24, 2017 at 6:51 PM, Tony Lindgren <tony@atomide.com> wrote:
> >> > Hi,
> >> >
> >> > Looks like commit 07fef3623407 ("cpsw/netcp: cpts depends on posix_timers")
> >> > in mainline started triggering the following oops at least on j5eco-evm.
> >> >
> >> > Adding CONFIG_PTP_1588_CLOCK to .config solves it, but the oops hints
> >> > something is wrong with the dependencies.. CONFIG_TI_CPTS defaults to N
> >> > and not selecting it causes the oops.
> >> >
> >> > Any ideas what's needed to properly fix this?
> >>
> >> Does your configuration have POSIX_TIMERS enabled? If not, then CPTS
> >> is now also disabled. There are two issues here that we need to solve:
> >>
> >> a) find out why POSIX_TIMERS got disabled, and make sure it's always
> >>     turned on for normal users
> >
> > $ make omap2plus_defconfig
> > ...
> > CONFIG_POSIX_TIMERS=y
> > # CONFIG_PTP_1588_CLOCK is not set
> >
> > So CONFIG_TI_CPTS is unselectable.
> 
> Ok, so at least this one is easy to understand, it's a direct consequence
> of my change, as nothing else will select CONFIG_PTP_1588_CLOCK
> now.
> 
> My first try would be this one:
> 
> diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
> index 9e631952b86f..8042a7626056 100644
> --- a/drivers/net/ethernet/ti/Kconfig
> +++ b/drivers/net/ethernet/ti/Kconfig
> @@ -76,7 +76,7 @@ config TI_CPSW
>  config TI_CPTS
>   bool "TI Common Platform Time Sync (CPTS) Support"
>   depends on TI_CPSW || TI_KEYSTONE_NETCP
> - depends on PTP_1588_CLOCK
> + depends on POSIX_TIMERS
>   ---help---
>    This driver supports the Common Platform Time Sync unit of
>    the CPSW Ethernet Switch and Keystone 2 1g/10g Switch Subsystem.
> @@ -87,6 +87,7 @@ config TI_CPTS_MOD
>   tristate
>   depends on TI_CPTS
>   default y if TI_CPSW=y || TI_KEYSTONE_NETCP=y
> + imply PTP_1588_CLOCK
>   default m
> 
>  config TI_KEYSTONE_NETCP

Yup this produces a booting .config for me:

Tested-by: Tony Lindgren <tony@atomide.com>

Regards,

Tony

^ permalink raw reply

* Re: PATCH drivers:net:cris/eth_v10: alternate string char arrary
From: David Miller @ 2017-04-24 18:18 UTC (permalink / raw)
  To: karim.eshapa
  Cc: felipe.balbi, paul.gortmaker, mugunthanvnm, jarod, fw, netdev,
	linux-kernel
In-Reply-To: <1493056179-6460-1-git-send-email-karim.eshapa@gmail.com>

From: Karim Eshapa <karim.eshapa@gmail.com>
Date: Mon, 24 Apr 2017 19:49:39 +0200

> static char pointer creates two variables in final assembly.
> static string and pointer to it according to
> Jeff Garzik janitors TODO.
> 
> Signed-off-by: Karim Eshapa <karim.eshapa@gmail.com>

Instead of trusting some document written more than 10 years ago on
the internet, why don't you build the source file in question and take
a look at what actually happens?

I'm not applying this, sorry.

^ permalink raw reply

* Re: cpsw regression in mainline with "cpsw/netcp: cpts depends on posix_timers"
From: Arnd Bergmann @ 2017-04-24 18:12 UTC (permalink / raw)
  To: Tony Lindgren; +Cc: Grygorii Strashko, Networking, linux-omap
In-Reply-To: <20170424174413.GD3780@atomide.com>

On Mon, Apr 24, 2017 at 7:44 PM, Tony Lindgren <tony@atomide.com> wrote:
> * Arnd Bergmann <arnd@arndb.de> [170424 10:38]:
>> On Mon, Apr 24, 2017 at 6:51 PM, Tony Lindgren <tony@atomide.com> wrote:
>> > Hi,
>> >
>> > Looks like commit 07fef3623407 ("cpsw/netcp: cpts depends on posix_timers")
>> > in mainline started triggering the following oops at least on j5eco-evm.
>> >
>> > Adding CONFIG_PTP_1588_CLOCK to .config solves it, but the oops hints
>> > something is wrong with the dependencies.. CONFIG_TI_CPTS defaults to N
>> > and not selecting it causes the oops.
>> >
>> > Any ideas what's needed to properly fix this?
>>
>> Does your configuration have POSIX_TIMERS enabled? If not, then CPTS
>> is now also disabled. There are two issues here that we need to solve:
>>
>> a) find out why POSIX_TIMERS got disabled, and make sure it's always
>>     turned on for normal users
>
> $ make omap2plus_defconfig
> ...
> CONFIG_POSIX_TIMERS=y
> # CONFIG_PTP_1588_CLOCK is not set
>
> So CONFIG_TI_CPTS is unselectable.

Ok, so at least this one is easy to understand, it's a direct consequence
of my change, as nothing else will select CONFIG_PTP_1588_CLOCK
now.

My first try would be this one:

diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
index 9e631952b86f..8042a7626056 100644
--- a/drivers/net/ethernet/ti/Kconfig
+++ b/drivers/net/ethernet/ti/Kconfig
@@ -76,7 +76,7 @@ config TI_CPSW
 config TI_CPTS
  bool "TI Common Platform Time Sync (CPTS) Support"
  depends on TI_CPSW || TI_KEYSTONE_NETCP
- depends on PTP_1588_CLOCK
+ depends on POSIX_TIMERS
  ---help---
   This driver supports the Common Platform Time Sync unit of
   the CPSW Ethernet Switch and Keystone 2 1g/10g Switch Subsystem.
@@ -87,6 +87,7 @@ config TI_CPTS_MOD
  tristate
  depends on TI_CPTS
  default y if TI_CPSW=y || TI_KEYSTONE_NETCP=y
+ imply PTP_1588_CLOCK
  default m

 config TI_KEYSTONE_NETCP

>> b) find out what's wrong with the driver when CPTS is disabled. This would
>>     be an existing problem that you just never saw because CPTS was
>>     always enabled so far.
>
> See a) above, yes seems there are two problems here.

Ok.

       Arnd

^ permalink raw reply related

* Re: [pull request][net-next 0/5] Mellanox, mlx5 updates 2017-04-22
From: David Miller @ 2017-04-24 18:11 UTC (permalink / raw)
  To: saeedm; +Cc: netdev, ogerlitz, roid, stephen
In-Reply-To: <20170422184507.26569-1-saeedm@mellanox.com>

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Sat, 22 Apr 2017 21:45:02 +0300

> This series contains some updates to mlx5 driver.
> 
> Sparse and compiler warnings fixes from Stephen Hemminger.
> 
> From Roi Dayan and Or Gerlitz, Add devlink and mlx5 support for controlling
> E-Switch encapsulation mode, this knob will enable HW support for applying
> encapsulation/decapsulation to VF traffic as part of SRIOV e-switch offloading.
> 
> Please pull and let me know if there's any problem.

Applied, thank you.

^ permalink raw reply

* Re: [PATCH v2] brcmfmac: Make skb header writable before use
From: Eric Dumazet @ 2017-04-24 18:09 UTC (permalink / raw)
  To: James Hughes
  Cc: Arend van Spriel, Franky Lin, Hante Meuleman, Kalle Valo,
	linux-wireless, brcm80211-dev-list.pdl, netdev
In-Reply-To: <20170424130322.476-1-james.hughes@raspberrypi.org>

On Mon, 2017-04-24 at 14:03 +0100, James Hughes wrote:
> The driver was making changes to the skb_header without
> ensuring it was writable (i.e. uncloned).
> This patch also removes some boiler plate header size
> checking/adjustment code as that is also handled by the
> skb_cow_header function used to make header writable.
> 
> This patch depends on
> brcmfmac: Ensure pointer correctly set if skb data location changes
> 
> Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
> ---
> Changes in v2
>   Makes the _cow_ call at the entry point of the skb in to the
>   stack, means only needs to be done once, and error handling
>   is easier.
>   Split a separate minor bug fix off to a separate patch (which
>   this patch depends on)

Best way to handle dependencies is to send a patch series.

1/2 brcmfmac: Ensure pointer correctly set if skb data location changes
2/2 brcmfmac: Make skb header writable before use

^ permalink raw reply

* Re: [PATCH net-next] net: add rcu locking when changing early demux
From: David Miller @ 2017-04-24 18:08 UTC (permalink / raw)
  To: dsa; +Cc: netdev, subashab
In-Reply-To: <1492878796-31328-1-git-send-email-dsa@cumulusnetworks.com>

From: David Ahern <dsa@cumulusnetworks.com>
Date: Sat, 22 Apr 2017 09:33:16 -0700

> systemd-sysctl is triggering a suspicious RCU usage message when
> net.ipv4.tcp_early_demux or net.ipv4.udp_early_demux is changed via
> a sysctl config file:
> 
> [   33.896184] ===============================
> [   33.899558] [ ERR: suspicious RCU usage.  ]
> [   33.900624] 4.11.0-rc7+ #104 Not tainted
> [   33.901698] -------------------------------
> [   33.903059] /home/dsa/kernel-2.git/net/ipv4/sysctl_net_ipv4.c:305 suspicious rcu_dereference_check() usage!
> [   33.905724]
> other info that might help us debug this:
> 
> [   33.907656]
> rcu_scheduler_active = 2, debug_locks = 0
> [   33.909288] 1 lock held by systemd-sysctl/143:
> [   33.910373]  #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff8123a370>] file_start_write+0x45/0x48
> [   33.912407]
> stack backtrace:
> [   33.914018] CPU: 0 PID: 143 Comm: systemd-sysctl Not tainted 4.11.0-rc7+ #104
> [   33.915631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
> [   33.917870] Call Trace:
> [   33.918431]  dump_stack+0x81/0xb6
> [   33.919241]  lockdep_rcu_suspicious+0x10f/0x118
> [   33.920263]  proc_configure_early_demux+0x65/0x10a
> [   33.921391]  proc_udp_early_demux+0x3a/0x41
> 
> add rcu locking to proc_configure_early_demux.
> 
> Fixes: dddb64bcb3461 ("net: Add sysctl to toggle early demux for tcp and udp")
> Signed-off-by: David Ahern <dsa@cumulusnetworks.com>

Applied, thanks David.

^ permalink raw reply

* Re: [PATCH v2 net-next] net: ipv6: send unsolicited NA if enabled for all interfaces
From: David Miller @ 2017-04-24 18:07 UTC (permalink / raw)
  To: dsa; +Cc: netdev, hannes
In-Reply-To: <1492877413-21906-1-git-send-email-dsa@cumulusnetworks.com>

From: David Ahern <dsa@cumulusnetworks.com>
Date: Sat, 22 Apr 2017 09:10:13 -0700

> When arp_notify is set to 1 for either a specific interface or for 'all'
> interfaces, gratuitous arp requests are sent. Since ndisc_notify is the
> ipv6 equivalent to arp_notify, it should follow the same semantics.
> Commit 4a6e3c5def13 ("net: ipv6: send unsolicited NA on admin up") sends
> the NA on admin up. The final piece is checking devconf_all->ndisc_notify
> in addition to the per device setting. Add it.
> 
> Fixes: 5cb04436eef6 ("ipv6: add knob to send unsolicited ND on link-layer address change")
> Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
> ---
> v2
> - update commit message with subject of commit 4a6e3c5def13 per comment
>   from Sergei

Applied, thanks David.

^ permalink raw reply

* Re: pull request: bluetooth-next 2017-04-22
From: David Miller @ 2017-04-24 18:06 UTC (permalink / raw)
  To: johan.hedberg; +Cc: linux-bluetooth, netdev
In-Reply-To: <20170422131302.GA17928@x1c>

From: Johan Hedberg <johan.hedberg@gmail.com>
Date: Sat, 22 Apr 2017 16:13:02 +0300

> Here are some more Bluetooth patches (and one 802.15.4 patch) in the
> bluetooth-next tree targeting the 4.12 kernel. Most of them are pure
> fixes.
> 
> Please let me know if there are any issues pulling. Thanks.

Pulled, thank you.

^ permalink raw reply

* Re: [PATCH v2] net: natsemi: ns83820: add checks for dma mapping error
From: David Miller @ 2017-04-24 18:04 UTC (permalink / raw)
  To: khoroshilov; +Cc: netdev, linux-kernel, ldv-project
In-Reply-To: <1492866365-5422-1-git-send-email-khoroshilov@ispras.ru>

From: Alexey Khoroshilov <khoroshilov@ispras.ru>
Date: Sat, 22 Apr 2017 16:06:05 +0300

> +		free_idx = (free_idx + NR_TX_DESC - 1) % NR_TX_DESC;

This is more simply stated as "(free_idx - 1) % NR_TX_DESC.

^ permalink raw reply

* Re: [PATCH] net: netcp: fix spelling mistake: "memomry" -> "memory"
From: David Miller @ 2017-04-24 17:59 UTC (permalink / raw)
  To: colin.king; +Cc: w-kwok2, m-karicheri2, netdev, kernel-janitors, linux-kernel
In-Reply-To: <20170422122201.7979-1-colin.king@canonical.com>

From: Colin King <colin.king@canonical.com>
Date: Sat, 22 Apr 2017 13:22:01 +0100

> From: Colin Ian King <colin.king@canonical.com>
> 
> Trivial fix to spelling mistake in dev_err message and rejoin
> line.
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Applied.

^ permalink raw reply

* Re: [PATCH net] ravb: Double free on error in ravb_start_xmit()
From: David Miller @ 2017-04-24 17:59 UTC (permalink / raw)
  To: dan.carpenter
  Cc: sergei.shtylyov, kazuya.mizuguchi.ks, horms+renesas, ykaneko0929,
	masaru.nagai.vx, geert+renesas, niklas.soderlund+renesas, tremyfr,
	netdev, linux-renesas-soc, kernel-janitors
In-Reply-To: <20170422104656.jjnooh3or4x3lbw3@mwanda>

From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Sat, 22 Apr 2017 13:46:56 +0300

> If skb_put_padto() fails then it frees the skb.  I shifted that code
> up a bit to make my error handling a little simpler.
> 
> Fixes: a0d2f20650e8 ("Renesas Ethernet AVB PTP clock driver")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

Applied.

^ permalink raw reply

* Re: [PATCH] net: atheros: atl1: use offset_in_page() macro
From: David Miller @ 2017-04-24 17:58 UTC (permalink / raw)
  To: geliangtang; +Cc: jcliburn, chris.snook, edumazet, netdev, linux-kernel
In-Reply-To: <c8746e014d4f7121ab4602548e199f8e8f2fc083.1492758007.git.geliangtang@gmail.com>

From: Geliang Tang <geliangtang@gmail.com>
Date: Sat, 22 Apr 2017 09:21:10 +0800

> Use offset_in_page() macro instead of open-coding.
> 
> Signed-off-by: Geliang Tang <geliangtang@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next 0/5] bnxt_en: Updates for net-next.
From: David Miller @ 2017-04-24 17:55 UTC (permalink / raw)
  To: michael.chan; +Cc: netdev
In-Reply-To: <1492819886-19625-1-git-send-email-michael.chan@broadcom.com>

From: Michael Chan <michael.chan@broadcom.com>
Date: Fri, 21 Apr 2017 20:11:21 -0400

> Miscellaneous updates include passing DCBX RoCE VLAN priority to firmware,
> checking one more new firmware flag before allowing DCBX to run on the host,
> adding 100Gbps speed support, adding check to disallow speed settings on
> Multi-host NICs, and a minor fix for reporting VF attributes.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next v2 2/2] openvswitch: Add eventmask support to CT action.
From: David Miller @ 2017-04-24 17:53 UTC (permalink / raw)
  To: jarno; +Cc: netdev
In-Reply-To: <1492818486-57292-2-git-send-email-jarno@ovn.org>

From: Jarno Rajahalme <jarno@ovn.org>
Date: Fri, 21 Apr 2017 16:48:06 -0700

> Add a new optional conntrack action attribute OVS_CT_ATTR_EVENTMASK,
> which can be used in conjunction with the commit flag
> (OVS_CT_ATTR_COMMIT) to set the mask of bits specifying which
> conntrack events (IPCT_*) should be delivered via the Netfilter
> netlink multicast groups.  Default behavior depends on the system
> configuration, but typically a lot of events are delivered.  This can be
> very chatty for the NFNLGRP_CONNTRACK_UPDATE group, even if only some
> types of events are of interest.
> 
> Netfilter core init_conntrack() adds the event cache extension, so we
> only need to set the ctmask value.  However, if the system is
> configured without support for events, the setting will be skipped due
> to extension not being found.
> 
> Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
> Reviewed-by: Greg Rose <gvrose8192@gmail.com>
> Acked-by: Joe Stringer <joe@ovn.org>

Applied.

^ permalink raw reply

* Re: [PATCH net-next v2 1/2] openvswitch: Typo fix.
From: David Miller @ 2017-04-24 17:53 UTC (permalink / raw)
  To: jarno; +Cc: netdev
In-Reply-To: <1492818486-57292-1-git-send-email-jarno@ovn.org>

From: Jarno Rajahalme <jarno@ovn.org>
Date: Fri, 21 Apr 2017 16:48:05 -0700

> Fix typo in a comment.
> 
> Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
> Acked-by: Greg Rose <gvrose8192@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next v2 2/5] virtio-net: transmit napi
From: Willem de Bruijn @ 2017-04-24 17:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Network Development, virtualization, David Miller,
	Willem de Bruijn
In-Reply-To: <20170424201346-mutt-send-email-mst@kernel.org>

On Mon, Apr 24, 2017 at 1:14 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Mon, Apr 24, 2017 at 01:05:45PM -0400, Willem de Bruijn wrote:
>> On Mon, Apr 24, 2017 at 12:40 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > On Fri, Apr 21, 2017 at 10:50:12AM -0400, Willem de Bruijn wrote:
>> >> >>> Maybe I was wrong, but according to Michael's comment it looks like he
>> >> >>> want
>> >> >>> check affinity_hint_set just for speculative tx polling on rx napi
>> >> >>> instead
>> >> >>> of disabling it at all.
>> >> >>>
>> >> >>> And I'm not convinced this is really needed, driver only provide affinity
>> >> >>> hint instead of affinity, so it's not guaranteed that tx and rx interrupt
>> >> >>> are in the same vcpus.
>> >> >>
>> >> >> You're right. I made the restriction broader than the request, to really
>> >> >> err
>> >> >> on the side of caution for the initial merge of napi tx. And enabling
>> >> >> the optimization is always a win over keeping it off, even without irq
>> >> >> affinity.
>> >> >>
>> >> >> The cycle cost is significant without affinity regardless of whether the
>> >> >> optimization is used.
>> >> >
>> >> >
>> >> > Yes, I noticed this in the past too.
>> >> >
>> >> >> Though this is not limited to napi-tx, it is more
>> >> >> pronounced in that mode than without napi.
>> >> >>
>> >> >> 1x TCP_RR for affinity configuration {process, rx_irq, tx_irq}:
>> >> >>
>> >> >> upstream:
>> >> >>
>> >> >> 1,1,1: 28985 Mbps, 278 Gcyc
>> >> >> 1,0,2: 30067 Mbps, 402 Gcyc
>> >> >>
>> >> >> napi tx:
>> >> >>
>> >> >> 1,1,1: 34492 Mbps, 269 Gcyc
>> >> >> 1,0,2: 36527 Mbps, 537 Gcyc (!)
>> >> >> 1,0,1: 36269 Mbps, 394 Gcyc
>> >> >> 1,0,0: 34674 Mbps, 402 Gcyc
>> >> >>
>> >> >> This is a particularly strong example. It is also representative
>> >> >> of most RR tests. It is less pronounced in other streaming tests.
>> >> >> 10x TCP_RR, for instance:
>> >> >>
>> >> >> upstream:
>> >> >>
>> >> >> 1,1,1: 42267 Mbps, 301 Gcyc
>> >> >> 1,0,2: 40663 Mbps, 445 Gcyc
>> >> >>
>> >> >> napi tx:
>> >> >>
>> >> >> 1,1,1: 42420 Mbps, 303 Gcyc
>> >> >> 1,0,2:  42267 Mbps, 431 Gcyc
>> >> >>
>> >> >> These numbers were obtained with the virtqueue_enable_cb_delayed
>> >> >> optimization after xmit_skb, btw. It turns out that moving that before
>> >> >> increases 1x TCP_RR further to ~39 Gbps, at the cost of reducing
>> >> >> 100x TCP_RR a bit.
>> >> >
>> >> >
>> >> > I see, so I think we can leave the affinity hint optimization/check for
>> >> > future investigation:
>> >> >
>> >> > - to avoid endless optimization (e.g we may want to share a single
>> >> > vector/napi for tx/rx queue pairs in the future) for this series.
>> >> > - tx napi is disabled by default which means we can do optimization on top.
>> >>
>> >> Okay. I'll drop the vi->affinity_hint_set from the patch set for now.
>> >
>> > I kind of like it, let's be conservative. But I'd prefer a comment
>> > near it explaining why it's there.
>>
>> I don't feel strongly. Was minutes away from sending a v3 with this
>> code reverted, but I'll reinstate it and add a comment. Other planned
>> changes based on Jason's feedback to v2:
>>
>>   v2 -> v3:
>>     - convert __netif_tx_trylock to __netif_tx_lock on tx napi poll
>>           ensure that the handler always cleans, to avoid deadlock
>>     - unconditionally clean in start_xmit
>>           avoid adding an unnecessary "if (use_napi)" branch
>>     - remove virtqueue_disable_cb in patch 5/5
>>           a noop in the common event_idx based loop
>
> Makes sense, thanks!

Great. Sent that, thanks.

The actual diff to v2 is quite small:

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b107ae011632..003143835766 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -986,6 +986,9 @@ static void virtnet_napi_tx_enable(struct virtnet_info *vi,
        if (!napi->weight)
                return;

+       /* Tx napi touches cachelines on the cpu handling tx interrupts. Only
+        * enable the feature if this is likely affine with the transmit path.
+        */
        if (!vi->affinity_hint_set) {
                napi->weight = 0;
                return;
@@ -1131,10 +1134,9 @@ static int virtnet_poll_tx(struct napi_struct
*napi, int budget)
        struct virtnet_info *vi = sq->vq->vdev->priv;
        struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, vq2txq(sq->vq));

-       if (__netif_tx_trylock(txq)) {
-               free_old_xmit_skbs(sq);
-               __netif_tx_unlock(txq);
-       }
+       __netif_tx_lock(txq, raw_smp_processor_id());
+       free_old_xmit_skbs(sq);
+       __netif_tx_unlock(txq);

        virtqueue_napi_complete(napi, sq->vq, 0);

@@ -1196,14 +1198,10 @@ static netdev_tx_t start_xmit(struct sk_buff
*skb, struct net_device *dev)
        bool use_napi = sq->napi.weight;

        /* Free up any pending old buffers before queueing new ones. */
-       if (use_napi) {
-               if (kick)
-                       virtqueue_enable_cb_delayed(sq->vq);
-               else
-                       virtqueue_disable_cb(sq->vq);
-       } else {
-               free_old_xmit_skbs(sq);
-       }
+       free_old_xmit_skbs(sq);
+
+       if (use_napi && kick)
+               virtqueue_enable_cb_delayed(sq->vq);

(gmail will munge the identation, sorry)

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox