Netdev List
* Re: [PATCH 2/2] net/smsc911x: Fix delays in the PHY enable/disable routines
From: David Miller @ 2014-11-13 19:38 UTC (permalink / raw)
  To: al.kochet; +Cc: netdev, steve.glendinning
In-Reply-To: <1415841980-14250-2-git-send-email-al.kochet@gmail.com>

From: Alexander Kochetkov <al.kochet@gmail.com>
Date: Thu, 13 Nov 2014 05:26:20 +0400

> Increased delay in the smsc911x_phy_disable_energy_detect (from 1ms to 2ms).
> Dropped delays in the smsc911x_phy_enable_energy_detect (100ms and 1ms).
> 
> The patch affects SMSC LAN generation 4 chips with integrated PHY (LAN9221).
> 
> I saw problems with soft reset due to wrong udelay timings.
> After I fixed udelay, I measured the time needed to bring integrated PHY
> from power-down to operational mode (the time between clearing EDPWRDOWN
> bit and soft reset complete event). I got 1ms (measured using ktime_get).
> The value is equal to the current value (1ms) used in the
> smsc911x_phy_disable_energy_detect. It is near the upper bound and in order
> to avoid rare soft reset faults it is doubled (2ms).
> 
> I don't know the official timing for bringing up the integrated PHY, as the
> specs don't clarify this (or maybe I didn't find it).
> 
> It looks safe to drop delays before and after setting EDPWRDOWN bit
> (enable PHY power-down mode). I didn't see any regressions with the patch.
> 
> The patch was reviewed by Steve Glendinning and Microchip Team.
> 
> Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
> Acked-by: Steve Glendinning <steve.glendinning@shawell.net>

Applied.


* Re: lib: rhashtable - Remove weird non-ASCII characters from comments
From: David Miller @ 2014-11-13 19:39 UTC (permalink / raw)
  To: herbert; +Cc: netdev, tgraf
In-Reply-To: <20141113051048.GA1801@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 13 Nov 2014 13:10:48 +0800

> My editor spewed garbage that looked like memory corruption on
> my screen.  It turns out that a number of occurrences of "fi" got
> turned into a ligature.
>     
> This patch replaces these ligatures with the ASCII letters "fi".
>     
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, thanks :)


* Re: [patch v2 -next] amd-xgbe: fix ->rss_hash_type
From: David Miller @ 2014-11-13 19:40 UTC (permalink / raw)
  To: dan.carpenter; +Cc: thomas.lendacky, netdev, kernel-janitors
In-Reply-To: <20141113061905.GA1280@mwanda>

From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Thu, 13 Nov 2014 09:19:06 +0300

> There was a missing break statement so we set everything to
> PKT_HASH_TYPE_L3 even when we intended to use PKT_HASH_TYPE_L4.
> 
> Fixes: 5b9dfe299e55 ('amd-xgbe: Provide support for receive side scaling')
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> ---
> v2: remove blank line

Applied, thanks a lot Dan.
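
A minimal sketch of the bug class described in the commit message, in plain C. The enum names and the packet classes are hypothetical stand-ins chosen for illustration; this is not the amd-xgbe code, only the shape of a switch statement where a missing break lets the L4 case run on into the L3 assignment.

```c
#include <assert.h>

/* Hypothetical stand-ins for PKT_HASH_TYPE_L3 / PKT_HASH_TYPE_L4. */
enum toy_hash_type { TOY_HASH_NONE, TOY_HASH_L3, TOY_HASH_L4 };

/* Hypothetical packet classes; not the real amd-xgbe descriptor values. */
enum toy_pkt_class { TOY_PKT_OTHER, TOY_PKT_IPV4, TOY_PKT_IPV4_TCP };

/* Buggy shape: the L4 case runs on into the L3 assignment. */
static enum toy_hash_type toy_hash_type_buggy(enum toy_pkt_class c)
{
	enum toy_hash_type type = TOY_HASH_NONE;

	switch (c) {
	case TOY_PKT_IPV4_TCP:
		type = TOY_HASH_L4;
		/* fall through - the missing break is the bug */
	case TOY_PKT_IPV4:
		type = TOY_HASH_L3;
		break;
	default:
		break;
	}
	return type;
}

/* Fixed shape: each case ends with its own break. */
static enum toy_hash_type toy_hash_type_fixed(enum toy_pkt_class c)
{
	enum toy_hash_type type = TOY_HASH_NONE;

	switch (c) {
	case TOY_PKT_IPV4_TCP:
		type = TOY_HASH_L4;
		break;
	case TOY_PKT_IPV4:
		type = TOY_HASH_L3;
		break;
	default:
		break;
	}
	return type;
}

/* Small self-check showing the difference between the two shapes. */
static void toy_hash_type_check(void)
{
	assert(toy_hash_type_buggy(TOY_PKT_IPV4_TCP) == TOY_HASH_L3);
	assert(toy_hash_type_fixed(TOY_PKT_IPV4_TCP) == TOY_HASH_L4);
	assert(toy_hash_type_fixed(TOY_PKT_IPV4) == TOY_HASH_L3);
}
```

In the buggy shape every TCP packet ends up classified as L3, which matches the symptom the commit message describes.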


* Re: arm64 allmodconfig failures in nft_reject_bridge.c
From: Mark Brown @ 2014-11-13 19:47 UTC (permalink / raw)
  To: David Miller
  Cc: pablo, linux, kaber, kadlec, stephen, linaro-kernel,
	kernel-build-reports, netfilter-devel, coreteam, bridge, netdev
In-Reply-To: <20141113.143513.785778562781935404.davem@davemloft.net>


On Thu, Nov 13, 2014 at 02:35:13PM -0500, David Miller wrote:

> I hold changes in my tree for a week or more, because I want them to
> "cook" there before they go to Linus.

Hrm.  Guess there must've been some other change in -next that pulled
the header in implicitly here :(

> So if it takes a week or two for a bug fix like this to propagate into
> Linus's tree, that's just what sometimes happens.

> In the meantime you can apply the fix locally if you absolutely have
> to have it right at this moment, that is the freedom that everyone
> has.

This can be a bit problematic for build (or widespread boot) breaks in
common configs since it takes out all the automated runtime testing that
people have running for the time the build break is in place.  In this
case it was just allmodconfig so it doesn't really get non-build testing
and makes little difference but for defconfigs it can be a more
substantial impact.  Applying fixes locally doesn't really work for this
case.

> FWIW, I plan to push my tree to Linus some time today, so this will be
> resolved in the next day or so.

Great, thanks.



* Re: [PATCH net-next v2] net: generic dev_disable_lro() stacked device handling
From: David Miller @ 2014-11-13 19:49 UTC (permalink / raw)
  To: mkubecek; +Cc: netdev, linux-kernel, j.vosburgh, vfalico, andy, jiri
In-Reply-To: <20141113065450.1645FA0BEF@unicorn.suse.cz>

From: Michal Kubecek <mkubecek@suse.cz>
Date: Thu, 13 Nov 2014 07:54:50 +0100 (CET)

> Large receive offloading is known to cause problems if received packets
> are passed to another host. Therefore the kernel disables it by calling
> dev_disable_lro() whenever a network device is enslaved in a bridge or
> forwarding is enabled for it (or globally). For virtual devices we need
> to disable LRO on the underlying physical device (which is actually
> receiving the packets).
> 
> Current dev_disable_lro() code handles this propagation for a vlan
> (including 802.1ad nested vlan), macvlan or a vlan on top of a macvlan.
> It doesn't handle other stacked devices and their combinations, in
> particular propagation from a bond to its slaves which often causes
> problems in virtualization setups.
> 
> As we now have generic data structures describing the upper-lower device
> relationship, dev_disable_lro() can be generalized to disable LRO also
> for all lower devices (if any) once it is disabled for the device
> itself.
> 
> For bonding and teaming devices, it is necessary to disable LRO not only
> on current slaves at the moment when dev_disable_lro() is called but
> also on any slave (port) added later.
> 
> v2: use lower device links for all devices (including vlan and macvlan)
> 
> Signed-off-by: Michal Kubecek <mkubecek@suse.cz>

Applied, thanks a lot.
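
The propagation described in the commit message can be sketched with a toy device graph. Everything below is a hypothetical user-space model: struct toy_dev, its lower[] array and toy_disable_lro() are illustration-only names, not the kernel's struct net_device or its upper/lower link lists. The sketch only shows the idea of disabling LRO on a device and then on every device stacked beneath it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_LOWER 4

/* Hypothetical toy model of a network device; not struct net_device. */
struct toy_dev {
	const char *name;
	bool lro;				/* is LRO currently enabled? */
	struct toy_dev *lower[TOY_MAX_LOWER];	/* devices stacked below this one */
	size_t n_lower;
};

/* Disable LRO on dev itself, then on every lower device, recursively. */
static void toy_disable_lro(struct toy_dev *dev)
{
	size_t i;

	assert(dev != NULL);
	dev->lro = false;
	for (i = 0; i < dev->n_lower; i++)
		toy_disable_lro(dev->lower[i]);
}

/*
 * Build vlan0 -> bond0 -> { eth0, eth1 }, disable LRO at the top, and
 * report whether it ended up disabled on every device in the stack.
 */
static bool toy_lro_demo(void)
{
	struct toy_dev eth0 = { .name = "eth0", .lro = true };
	struct toy_dev eth1 = { .name = "eth1", .lro = true };
	struct toy_dev bond0 = { .name = "bond0", .lro = true,
				 .lower = { &eth0, &eth1 }, .n_lower = 2 };
	struct toy_dev vlan0 = { .name = "vlan0", .lro = true,
				 .lower = { &bond0 }, .n_lower = 1 };

	toy_disable_lro(&vlan0);
	return !vlan0.lro && !bond0.lro && !eth0.lro && !eth1.lro;
}
```

The sketch does not cover the second point of the commit message, namely slaves added to a bond or team after the call; that needs a separate hook at enslave time.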


* Re: [PATCH] sh_eth: r8a779x: Enable automatically fetch receive descriptor
From: David Miller @ 2014-11-13 20:03 UTC (permalink / raw)
  To: ykaneko0929; +Cc: netdev, horms, magnus.damm, linux-sh, grant.likely
In-Reply-To: <1415861819-27812-1-git-send-email-ykaneko0929@gmail.com>

From: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Date: Thu, 13 Nov 2014 15:56:59 +0900

> From: Kouei Abe <kouei.abe.cp@renesas.com>
> 
> HDMAC automatically fetches the receive descriptor and receives frames.
> Continuous reception of multiple frames is possible.
> 
> Signed-off-by: Kouei Abe <kouei.abe.cp@renesas.com>
> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
> ---
> 
> This patch is based on net-next tree.

This doesn't even compile, or it depends upon another patch which you have
not mentioned, because sh_eth_cpu_data does not have an rmcr_value field.


* Re: Device Tree Binding for Marvell DSA Switch on imx28 board over Mdio Interface
From: Florian Fainelli @ 2014-11-13 20:03 UTC (permalink / raw)
  To: Oliver Graute; +Cc: netdev
In-Reply-To: <CA+KjHfbwtwPFXFEVDDijRNeTcHcSn0BZ4_yGuwcgyFfubNFCQA@mail.gmail.com>

On 11/13/2014 07:15 AM, Oliver Graute wrote:
> Hello Florian,
> 
> On Wed, Nov 12, 2014 at 8:19 PM, Florian Fainelli <f.fainelli@gmail.com> wrote:
>> On 11/12/2014 05:07 AM, Oliver Graute wrote:
>>> Hello,
>>>
>>> how do I specify the DSA node and the MDIO node in the Device Tree
>>> Binding to integrate a Marvell 88e6071 switch with an imx28 board?
>>>
>>> On my board the Marvell switch 88e6071 is connected via phy1 (on a
>>> imx28 PCB) to phy5 on the Marvell switch (on a Switch PCB). All phys
>>> are connected via the same MDIO Bus.
>>>
>>> I enabled the Marvell DSA Support Driver, Gianfar Ethernet Driver and
>>> Freescale PQ MDIO Driver in the Kernel (I am not sure if this is the
>>> right choice for the imx28 fec ethernet controller, is it?)
>>>
> 
> I changed my DeviceTree according to your proposal. Now I get ENODEV (19)
> in dsa_of_probe, because of_find_device_by_node(ethernet) is returning 0.
> Is my ethernet setting still wrong?

Is your ethernet driver also modular? If so, you will need it to be
loaded *before* dsa. of_find_device_by_node() also needs the ethernet
driver to be a platform_driver.

NB: I have a patch that looks up a net_device based on the struct
device_node that might be better to use, since it makes no assumption
about whether that is a platform_device/pci_device etc...

> 
> dsa@0 {
>         compatible = "marvell,dsa";
>         #address-cells = <2>;
>         #size-cells = <0>;
> 
>         interrupts = <10>;
>         dsa,ethernet = <&eth1>;
>         dsa,mii-bus = <&mdio_bus>;
> 
>         switch@0 {
>             #address-cells = <1>;
>             #size-cells = <0>;
>             reg = <5 0>;   /* MDIO address 5, switch 0 in tree */
> 
>             port@0 {
>                 reg = <0>;
>                 label = "lan1";
>                 phy-handle = <&ethphy1>;
>             };
> 
>             port@1 {
>                 reg = <1>;
>                 label = "lan2";
>             };
> 
>             port@2 {
>                 reg = <2>;
>                 label = "lan3";
>             };
> 
>             port@3 {
>                 reg = <3>;
>                 label = "lan4";
>             };
> 
>             port@4 {
>                 reg = <4>;
>                 label = "lan5";
>             };
> 
>             port@5 {
>                 reg = <5>;
>                 label = "cpu";
>             };
> 
>         };
>     };
> 
> eth1: eth1 {
>     status = "okay";
>     ethernet1-port@1 {
>         phy-handle = <&ethphy1>;
> 
>         fixed-link {
>                 speed = <1000>;
>                 full-duplex;
>             };
>     };
> };
> 
> mdio_bus: mdio@800f0040 {
>         #address-cells = <1>;
>         #size-cells = <0>;
>         device_type = "mdio";
>         //compatible = "fsl,gianfar-mdio";
>         compatible = "fsl,mpc875-fec-mdio", "fsl,pq1-fec-mdio";
>         reg = <0x800f0040 0x188>;
>         status = "okay";
> 
>          ethphy0: ethernet-phy@0 {
>                 compatible = "fsl,gianfar-mdio";
>                 device_type = "network";
>                 model = "FEC";
>                 reg = <0x00>;
> 
>          };
> 
>          ethphy1: ethernet-phy@1 {
>                  compatible = "fsl,gianfar-mdio";
>                  device_type = "network";
>                  model = "FEC";
>                  reg = <0x01>;
>                 };
>                  //reg = <0xff>; */ /* No PHY attached */
>                  //speed = <1000>;
>                  //duple = <1>;
>        };
> 
> modprobe dsa_core
> [  151.720180] !!!!!enter dsa_init_module!!!!!
> [  151.724713] !!!!Enter dsa Probe!!!!!
> [  151.728321] Distributed Switch Architecture driver version 0.1
> [  151.739026] !!!!!Enter dsa_of_probe!!!!!
> [  151.744515] !!!!!mdio->name=mdio mdio->type=mdio
> mdio->full_name=/mdio@800f0040 !!!!!
> [  151.753559] !!!!!np->name=dsa np->type=<NULL> np->full_name=/dsa@0 !!!!!
> [  151.761419] !!!!before of_mdio_find_bus!!!!!
> [  151.765732] !!!!!enter of_mdio_find_bus!!!!!
> [  151.772418] !!!!!!enter class_find_device!!!!!
> [  151.776908] !!!!!!enter class_dev_iter_init!!!!!
> [  151.783512] !!!!!!iter->type->name=(null) !!!!!
> [  151.788085] !!!!!!leave class_dev_iter_init!!!!!
> [  151.794439] !!!!!enter of_mdio_bus_match!!!!!
> [  151.798845] !!!!!enter of_mdio_bus_match!!!!!
> [  151.804967] !!!!!enter of_mdio_bus_match!!!!!
> [  151.809381] !!!!!!leave class_find_device return dev=!!!!!
> [  151.816668] !!!!Leave of_mdio_find_bus !!!!!
> [  151.822057] !!!!before of_parse_phandle dsa,ethernet!!!!!
> [  151.827553] !!!!before of find_device_by_node!!!!!
> [  151.834819] !!!!!ethernet->name=eth1 ethernet->type=<NULL>
> ethernet->full_name=/eth1 !!!!!
> [  151.844540] !!!!! enter of_find_device_by_node !!!!!
> [  151.850692] !!!!! Leave of_find_device_by_node dev=0 !!!!!
> [  151.856221] !!!! return  of_find-device_by_node =19 !!!!!
> [  151.863419] dsa_of_probe returns=-19
> [  151.873509] !!!!!leave dsa_init_module!!!!!
> 
> 
> Best Regards,
> 
> Oliver
> 


* Re: [PATCH] net: sh_eth: Add RMII mode setting in probe
From: David Miller @ 2014-11-13 20:04 UTC (permalink / raw)
  To: ykaneko0929; +Cc: netdev, horms, magnus.damm, linux-sh
In-Reply-To: <1415861645-27685-1-git-send-email-ykaneko0929@gmail.com>

From: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Date: Thu, 13 Nov 2014 15:54:05 +0900

> From: Hisashi Nakamura <hisashi.nakamura.ak@renesas.com>
> 
> When using RMII mode, it is necessary to change the setting in probe.
> 
> Signed-off-by: Hisashi Nakamura <hisashi.nakamura.ak@renesas.com>
> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>

Applied.


* Re: [PATCH] net: sh_eth: Add r8a7793 support
From: David Miller @ 2014-11-13 20:04 UTC (permalink / raw)
  To: ykaneko0929; +Cc: netdev, horms, magnus.damm, linux-sh, grant.likely
In-Reply-To: <1415861947-27867-1-git-send-email-ykaneko0929@gmail.com>

From: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Date: Thu, 13 Nov 2014 15:59:07 +0900

> From: Hisashi Nakamura <hisashi.nakamura.ak@renesas.com>
> 
> The device tree probing for R-Car M2N (r8a7793) is added.
> 
> Signed-off-by: Hisashi Nakamura <hisashi.nakamura.ak@renesas.com>
> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>

Applied.


* Re: [PATCH] sh_eth: Optimization for RX excess judgement
From: David Miller @ 2014-11-13 20:04 UTC (permalink / raw)
  To: ykaneko0929; +Cc: netdev, horms, magnus.damm, linux-sh
In-Reply-To: <1415862031-27925-1-git-send-email-ykaneko0929@gmail.com>

From: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Date: Thu, 13 Nov 2014 16:00:31 +0900

> From: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> 
> Both 'boguscnt' and 'quota' have nearly the same meaning as the condition of
> the reception loop.
> In order to cut down redundant processing, this patch changes the excess judgement.
> 
> Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
> ---
> 
> This patch is based on net tree.

On what basis is an optimization like this appropriate for 'net'?  I really
don't think it is.


* Re: [PATCH 0/2] Fix sleeping function called from invalid context
From: David Miller @ 2014-11-13 20:05 UTC (permalink / raw)
  To: ykaneko0929; +Cc: netdev, horms, magnus.damm, linux-sh
In-Reply-To: <1415862135-27972-1-git-send-email-ykaneko0929@gmail.com>

From: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Date: Thu, 13 Nov 2014 16:02:13 +0900

> This series is based on net tree.

A header posting is supposed to explain things, at a high level,
about the patch series.

For example, what is the patch series doing, and why.  What is the
context of the changes, is it meant to fix a specific bug report,
and if so which one?

Both of your sh_eth patch series have this problem.


* Re: [RFC PATCH 00/16] Replace smp_read_barrier_depends() with lockless_dereference()
From: Paul E. McKenney @ 2014-11-13 20:07 UTC (permalink / raw)
  To: Pranith Kumar
  Cc: Herbert Xu, David S. Miller, Cristian Stoica, Horia Geanta,
	Ruchika Gupta, Michael Neuling, Wolfram Sang,
	open list:CRYPTO API, open list, Vinod Koul, Dan Williams,
	Bartlomiej Zolnierkiewicz, Kyungmin Park, Manuel Schölling,
	Dave Jiang, Rashika, open list:DMA GENERIC OFFLO...
In-Reply-To: <1415906662-4576-1-git-send-email-bobby.prani@gmail.com>

On Thu, Nov 13, 2014 at 02:24:06PM -0500, Pranith Kumar wrote:
> Recently lockless_dereference() was added which can be used in place of
> hard-coding smp_read_barrier_depends(). 
> 
> http://lkml.iu.edu/hypermail/linux/kernel/1410.3/04561.html
> 
> The following series tries to do this.
> 
> There are still some hard-coded locations that I was not sure how to
> replace. I will send separate patches/questions regarding them.

Thank you for taking this on!  Some questions and comments in response
to the individual patches.

							Thanx, Paul

> Pranith Kumar (16):
>   crypto: caam - Remove unnecessary smp_read_barrier_depends()
>   doc: memory-barriers.txt: Document use of lockless_dereference()
>   drivers: dma: Replace smp_read_barrier_depends() with
>     lockless_dereference()
>   dcache: Replace smp_read_barrier_depends() with lockless_dereference()
>   overlayfs: Replace smp_read_barrier_depends() with
>     lockless_dereference()
>   assoc_array: Replace smp_read_barrier_depends() with
>     lockless_dereference()
>   hyperv: Replace smp_read_barrier_depends() with lockless_dereference()
>   rcupdate: Replace smp_read_barrier_depends() with
>     lockless_dereference()
>   percpu: Replace smp_read_barrier_depends() with lockless_dereference()
>   perf: Replace smp_read_barrier_depends() with lockless_dereference()
>   seccomp: Replace smp_read_barrier_depends() with
>     lockless_dereference()
>   task_work: Replace smp_read_barrier_depends() with
>     lockless_dereference()
>   ksm: Replace smp_read_barrier_depends() with lockless_dereference()
>   slab: Replace smp_read_barrier_depends() with lockless_dereference()
>   netfilter: Replace smp_read_barrier_depends() with
>     lockless_dereference()
>   rxrpc: Replace smp_read_barrier_depends() with lockless_dereference()
> 
>  Documentation/memory-barriers.txt |  2 +-
>  drivers/crypto/caam/jr.c          |  3 ---
>  drivers/dma/ioat/dma_v2.c         |  3 +--
>  drivers/dma/ioat/dma_v3.c         |  3 +--
>  fs/dcache.c                       |  7 ++-----
>  fs/overlayfs/super.c              |  4 +---
>  include/linux/assoc_array_priv.h  | 11 +++++++----
>  include/linux/hyperv.h            |  9 ++++-----
>  include/linux/percpu-refcount.h   |  4 +---
>  include/linux/rcupdate.h          | 10 +++++-----
>  kernel/events/core.c              |  3 +--
>  kernel/events/uprobes.c           |  8 ++++----
>  kernel/seccomp.c                  |  7 +++----
>  kernel/task_work.c                |  3 +--
>  lib/assoc_array.c                 |  7 -------
>  mm/ksm.c                          |  7 +++----
>  mm/slab.h                         |  6 +++---
>  net/ipv4/netfilter/arp_tables.c   |  3 +--
>  net/ipv4/netfilter/ip_tables.c    |  3 +--
>  net/ipv6/netfilter/ip6_tables.c   |  3 +--
>  net/rxrpc/ar-ack.c                | 22 +++++++++-------------
>  security/keys/keyring.c           |  6 ------
>  22 files changed, 50 insertions(+), 84 deletions(-)
> 
> -- 
> 1.9.1
> 



* Re: [PATCHv2] smsc911x: power-up phydev before doing a software reset.
From: David Miller @ 2014-11-13 20:09 UTC (permalink / raw)
  To: eballetbo; +Cc: netdev, steve.glendinning, javier, ebutera
In-Reply-To: <1415866474-11626-1-git-send-email-eballetbo@iseebcn.com>

From: Enric Balletbo i Serra <eballetbo@iseebcn.com>
Date: Thu, 13 Nov 2014 09:14:34 +0100

> With commit be9dad1f9f26604fb ("net: phy: suspend phydev when going
> to HALTED"), the PHY device will be put in a low-power mode using
> BMCR_PDOWN if the interface is set down. The smsc911x driver does
> a software_reset when opening the device (ndo_open). In such a case,
> the PHY must be powered up before any register is accessed and before
> the software_reset function is called. Otherwise, as the PHY is powered
> down, the software reset fails and the interface cannot be enabled
> again.
> 
> This patch fixes this scenario, which is easy to reproduce by setting the
> network interface down and then up again.
> 
>     $ ifconfig eth0 down
>     $ ifconfig eth0 up
>     ifconfig: SIOCSIFFLAGS: Input/output error
> 
> Signed-off-by: Enric Balletbo i Serra <eballetbo@iseebcn.com>

Applied and queued up for -stable, thanks.


* Re: [PATCH 0/4] rhashtable: Allow local locks to be used and tested
From: David Miller @ 2014-11-13 20:13 UTC (permalink / raw)
  To: herbert; +Cc: netdev, tgraf
In-Reply-To: <20141113101025.GA3728@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 13 Nov 2014 18:10:25 +0800

> This series moves mutex_is_held entirely under PROVE_LOCKING so
> there is zero footprint when we're not debugging.  More importantly
> it adds a parent argument to mutex_is_held so that we can test
> local locks rather than global ones (e.g., per-namespace locks).

Series applied, thanks Herbert.


* Re: [PATCH net-next 0/7] mlx4: Flexible (asymmetric) allocation of EQs and MSI-X vectors
From: David Miller @ 2014-11-13 20:16 UTC (permalink / raw)
  To: ogerlitz; +Cc: netdev, matanb, amirv, jackm
In-Reply-To: <1415882733-3084-1-git-send-email-ogerlitz@mellanox.com>

From: Or Gerlitz <ogerlitz@mellanox.com>
Date: Thu, 13 Nov 2014 14:45:26 +0200

> This series from Matan Barak is built as follows:
> 
> The 1st two patches fix small bugs w.r.t firmware spec. Next
> are two patches which do more re-factoring of the init/fini flow
> and a patch that adds support for the QUERY_FUNC firmware command,
> these are all pre-steps for the major patch of the series. In this
> patch (#6) we change the order of talking/querying the firmware
> and enabling SRIOV. This allows us to remove the worst-case assumption
> w.r.t the number of available MSI-X vectors and EQs per function.
> 
> The last patch easily benefits from this ordering change, to enable
> support for more than 64 VFs on firmware that allows for that.

Series applied, thank you.


* Re: [PATCH 16/16] rxrpc: Replace smp_read_barrier_depends() with lockless_dereference()
From: David Howells @ 2014-11-13 20:17 UTC (permalink / raw)
  To: Pranith Kumar
  Cc: dhowells, David S. Miller, Dan Carpenter,
	open list:NETWORKING [GENERAL], open list, paulmck
In-Reply-To: <1415906662-4576-17-git-send-email-bobby.prani@gmail.com>

Pranith Kumar <bobby.prani@gmail.com> wrote:

>  	     loop != call->acks_head || stop;
>  	     loop = (loop + 1) &  (call->acks_winsz - 1)
>  	     ) {
> -		p_txb = call->acks_window + loop;
> -		smp_read_barrier_depends();
> +		p_txb = lockless_dereference(call)->acks_window + loop;

Nack.  You've stuck an implicit barrier on a dereference that doesn't matter.
And similar for other hunks of this patch.

David
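
One reading of the objection is that the ordering has to be attached to the load of the pointer that was actually published, not to an unrelated dereference earlier in the expression. The sketch below is a user-space analogue using C11 atomics, not the kernel macro: the names slot, toy_publish() and toy_read_payload() are hypothetical, and memory_order_acquire is used here as a stronger, simpler substitute for the kernel's dependency barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical message type published through a shared slot. */
struct toy_msg {
	int payload;
};

/* Slot that a writer publishes into and a lockless reader polls. */
static _Atomic(struct toy_msg *) slot;

/* Writer: fill in the message first, then publish the pointer. */
static void toy_publish(struct toy_msg *m, int value)
{
	assert(m != NULL);
	m->payload = value;
	atomic_store_explicit(&slot, m, memory_order_release);
}

/*
 * Reader: the ordering is attached to the load of the published pointer,
 * so the later read of m->payload sees the writer's initialisation.
 */
static int toy_read_payload(int fallback)
{
	struct toy_msg *m = atomic_load_explicit(&slot, memory_order_acquire);

	return m ? m->payload : fallback;
}

/* Single-threaded walk through the two states of the slot. */
static int toy_lockless_demo(void)
{
	static struct toy_msg msg;

	if (toy_read_payload(-1) != -1)	/* nothing published yet */
		return 0;
	toy_publish(&msg, 7);
	return toy_read_payload(-1) == 7;
}
```

Placing the ordered load on some other pointer, such as the one used only to reach the slot, would leave the slot-to-contents dependency uncovered.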


* Re: [PATCH net-next] rhashtable: Drop gfp_flags arg in insert/remove functions
From: David Miller @ 2014-11-13 20:18 UTC (permalink / raw)
  To: tgraf; +Cc: netdev, linux-kernel, ebiederm, eric.dumazet
In-Reply-To: <17c3262027e643a0826c6ac5dd2d14cda0822a0b.1415879747.git.tgraf@suug.ch>

From: Thomas Graf <tgraf@suug.ch>
Date: Thu, 13 Nov 2014 13:45:46 +0100

> Reallocation is only required for shrinking and expanding and both rely
> on a mutex for synchronization and callers of rhashtable_init() are in
> non atomic context. Therefore, no reason to continue passing allocation
> hints through the API.
> 
> Instead, use GFP_KERNEL and add __GFP_NOWARN | __GFP_NORETRY to allow
> for silent fall back to vzalloc() without the OOM killer jumping in as
> pointed out by Eric Dumazet and Eric W. Biederman.
> 
> Signed-off-by: Thomas Graf <tgraf@suug.ch>

Applied, thanks Thomas.


* Re: [PATCH net v3] vxlan: Do not reuse sockets for a different address family
From: David Miller @ 2014-11-13 20:20 UTC (permalink / raw)
  To: mleitner; +Cc: netdev, stephen, sergei.shtylyov
In-Reply-To: <54ab998424b7b06d892eec6df6a8ec38db3cb2bb.1415896616.git.mleitner@redhat.com>

From: Marcelo Ricardo Leitner <mleitner@redhat.com>
Date: Thu, 13 Nov 2014 14:43:08 -0200

> Currently, we only match against the local port number in order to reuse a
> socket. But if this new vxlan wants an IPv6 socket and there is an IPv4 one
> bound to that port, vxlan will reuse the IPv4 socket as IPv6 and a panic will
> follow. The following steps reproduce it:
> 
>    # ip link add vxlan6 type vxlan id 42 group 229.10.10.10 \
>        srcport 5000 6000 dev eth0
>    # ip link add vxlan7 type vxlan id 43 group ff0e::110 \
>        srcport 5000 6000 dev eth0
>    # ip link set vxlan6 up
>    # ip link set vxlan7 up
>    <panic>
 ...
> So address family must also match in order to reuse a socket.
> 
> Reported-by: Jean-Tsung Hsiao <jhsiao@redhat.com>
> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>

Hey, this one actually compiles :-)

Applied and queued up for -stable, thanks!
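
The matching rule from the commit message, sketched as a stand-alone lookup. struct toy_vxlan_sock and toy_find_sock() are hypothetical names for illustration, and the port number is arbitrary; the real driver keeps its sockets in its own data structures. The point is only that both the port and the address family must match before a socket is reused.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>	/* AF_INET, AF_INET6 */

/* Hypothetical toy record of an open tunnel socket. */
struct toy_vxlan_sock {
	int family;		/* AF_INET or AF_INET6 */
	unsigned short port;	/* local UDP port, host byte order */
};

/* Return a reusable socket only if port AND address family both match. */
static const struct toy_vxlan_sock *
toy_find_sock(const struct toy_vxlan_sock *socks, size_t n,
	      int family, unsigned short port)
{
	size_t i;

	assert(socks != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (socks[i].port == port && socks[i].family == family)
			return &socks[i];
	}
	return NULL;
}

/*
 * Returns 1 if an IPv6 request refuses to reuse an IPv4 socket on the
 * same port while an IPv4 request still finds it.
 */
static int toy_reuse_demo(void)
{
	const struct toy_vxlan_sock socks[] = { { AF_INET, 5000 } };

	return toy_find_sock(socks, 1, AF_INET6, 5000) == NULL &&
	       toy_find_sock(socks, 1, AF_INET, 5000) == &socks[0];
}
```

Dropping the family comparison from the loop reproduces the faulty behaviour: the IPv6 request would be handed the IPv4 socket.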


* Re: [PATCH net-next] tcp: limit GSO packets to half cwnd
From: David Miller @ 2014-11-13 20:22 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, ncardwell, ycheng, nanditad
In-Reply-To: <1415900722.17262.22.camel@edumazet-glaptop2.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 13 Nov 2014 09:45:22 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> In DC world, GSO packets initially cooked by tcp_sendmsg() are usually
> big, as sk_pacing_rate is high.
> 
> When network is congested, cwnd can be smaller than the GSO packets
> found in socket write queue. tcp_write_xmit() splits GSO packets
> using the available cwnd, and we end up sending a single GSO packet,
> consuming all available cwnd.
> 
> With GRO aggregation on the receiver, we might handle a single GRO
> packet, sending back a single ACK.
> 
> 1) This single ACK might be lost
>    TLP or RTO are forced to attempt a retransmit.
> 2) This ACK releases a full cwnd, sender sends another big GSO packet,
>    in a ping pong mode.
> 
> This behavior does not fill the pipes in the best way, because of
> scheduling artifacts.
> 
> Make sure we always have at least two GSO packets in flight.
> 
> This allows us to safely increase GRO efficiency without risking
> spurious retransmits.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

This looks fantastic, applied, thanks Eric!
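
The arithmetic behind "at least two GSO packets in flight" can be sketched as a quota function. This is an illustration under assumptions, not the patch itself: toy_cwnd_quota() is a hypothetical name, and the units are segments. It caps what a single send may consume at half the congestion window (but never below one segment), and never hands out more than the window still has free.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many segments may the next GSO packet carry?
 * cwnd and in_flight are both counted in segments.
 */
static unsigned int toy_cwnd_quota(unsigned int cwnd, unsigned int in_flight)
{
	unsigned int half, avail;

	assert(cwnd > 0);
	if (in_flight >= cwnd)
		return 0;		/* window is full, nothing may be sent */

	half = cwnd / 2;
	if (half == 0)
		half = 1;		/* always allow at least one segment */

	avail = cwnd - in_flight;
	return avail < half ? avail : half;
}
```

With a window of 10 segments and nothing in flight, the quota is 5, so a second packet of up to 5 segments can follow the first instead of one packet consuming the whole window.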


* Re: arm64 allmodconfig failures in nft_reject_bridge.c
From: David Miller @ 2014-11-13 20:23 UTC (permalink / raw)
  To: broonie
  Cc: pablo, linux, kaber, kadlec, stephen, linaro-kernel,
	kernel-build-reports, netfilter-devel, coreteam, bridge, netdev
In-Reply-To: <20141113194752.GQ3815@sirena.org.uk>

From: Mark Brown <broonie@kernel.org>
Date: Thu, 13 Nov 2014 19:47:52 +0000

> On Thu, Nov 13, 2014 at 02:35:13PM -0500, David Miller wrote:
> 
>> I hold changes in my tree for a week or more, because I want them to
>> "cook" there before they go to Linus.
> 
> Hrm.  Guess there must've been some other change in -next that pulled
> the header in implicitly here :(

-next pulls in my 'net' tree, so got the fix.


* [GIT] Networking
From: David Miller @ 2014-11-13 20:35 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) sunhme driver lacks DMA mapping error checks, based upon a report
   by Meelis Roos.

2) Fix memory leak in mvpp2 driver, from Sudip Mukherjee.

3) DMA memory allocation sizes are wrong in systemport ethernet
   driver, fix from Florian Fainelli.

4) Fix use after free in mac80211 defragmentation code, from Johannes
   Berg.

5) Some networking uapi headers missing from Kbuild file, from Stephen
   Hemminger.

6) TUN driver gets csum_start offset wrong when VLAN accel is enabled,
   and macvtap has a similar bug, from Herbert Xu.

7) Adjust several tunneling drivers to set dev->iflink after registration,
   because registration sets that to -1 overwriting whatever we did.  From
   Steffen Klassert.

8) Geneve forgets to set inner tunneling type, causing GSO
   segmentation to fail on some NICs.  From Jesse Gross.

9) Fix several locking bugs in stmmac driver, from Fabrice Gasnier and
   Giuseppe CAVALLARO.

10) Fix spurious timeouts with NewReno on low traffic connections, from
    Marcelo Leitner.

11) Fix descriptor updates in enic driver, from Govindarajulu Varadarajan.

12) PPP calls bpf_prog_create() with locks held, which isn't kosher.  Fix
    from Takashi Iwai.

13) Fix NULL deref in SCTP with malformed INIT packets, from Daniel
    Borkmann.

14) psock_fanout selftest accesses past the end of the mmap ring, fix
    from Shuah Khan.

15) Fix PTP timestamping for VLAN packets, from Richard Cochran.

16) netlink_unbind() calls in netlink pass wrong initial argument, from
    Hiroaki SHIMODA.

17) vxlan socket reuse accidentally reuses a socket when the address family
    is different, so we have to explicitly check this, from Marcelo
    Leitner.

18) Fix missing include in nft_reject_bridge.c breaking the build on ppc
    and other architectures, from Guenter Roeck.

Please pull, thanks a lot.

The following changes since commit 9f935675d41aa51ebf929fc977cf530ff7d1a7fc:

  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input (2014-10-31 19:51:11 -0700)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master

for you to fetch changes up to 19ca9fc1445b76b60d34148f7ff837b055f5dcf3:

  vxlan: Do not reuse sockets for a different address family (2014-11-13 15:19:59 -0500)

----------------------------------------------------------------
Alexander Kochetkov (2):
      net/smsc911x: Fix rare soft reset timeout issue due to PHY power-down mode
      net/smsc911x: Fix delays in the PHY enable/disable routines

Andrew Lunn (1):
      net: dsa: slave: Fix autoneg for phys on switch MDIO bus

Anish Bhatt (2):
      cxgb4 : Fix bug in DCB app deletion
      cxgb4 : dcb open-lldp interop fixes

Brian Hill (1):
      net: phy: Correctly handle MII ioctl which changes autonegotiation.

Charles Keepax (1):
      asix: Do full reset during ax88772_bind

Chen Gang (2):
      drivers: net: ethernet: xilinx: xilinx_emaclite: Compatible with 'xlnx, xps-ethernetlite-2.00.b' for QEMU using
      drivers: net: ethernet: xilinx: xilinx_emaclite: revert the original commit "1db3ddff1602edf2390b7667dcbaa0f71512e3ea"

Daniel Borkmann (3):
      net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet
      net: sctp: fix memory leak in auth key management
      ixgbe: phy: fix uninitialized status in ixgbe_setup_phy_link_tnx

David S. Miller (10):
      sunhme: Add DMA mapping error checks.
      Merge branch 'systemport-net'
      Merge branch 'tun-net'
      Merge branch 'ipv6_tunnel_iflink_init'
      Merge branch 'xgene-net'
      Merge branch 'stmmac-net'
      Merge branch 'mlx5-net'
      Merge tag 'master-2014-11-04' of git://git.kernel.org/.../linville/wireless
      Merge branch 'cxgb4-net'
      Merge branch 'bcmgenet-net'

Edward Cree (1):
      sfc: don't BUG_ON efx->max_channels == 0 in probe

Eli Cohen (2):
      net/mlx5_core: Fix race in create EQ
      net/mlx5_core: Fix race on driver load

Emmanuel Grumbach (2):
      iwlwifi: mvm: initialize the cur_ucode upon boot
      iwlwifi: fix RFkill while calibrating

Enric Balletbo i Serra (1):
      smsc911x: power-up phydev before doing a software reset.

Eric Dumazet (1):
      ipv6: fix IPV6_PKTINFO with v4 mapped

Fabrice Gasnier (2):
      stmmac: fix stmmac_tx_avail should be called with TX locked
      stmmac: release tx lock, in case of dma mapping error.

Felix Fietkau (1):
      mac80211: flush keys for AP mode on ieee80211_do_stop

Florian Fainelli (4):
      net: systemport: fix DMA allocation/freeing sizes
      net: systemport: do not crash freeing an unitialized TX ring
      net: bcmgenet: connect and disconnect from the PHY state machine
      net: bcmgenet: apply MII configuration in bcmgenet_open()

Giuseppe CAVALLARO (3):
      stmmac: fix lock in stmmac_set_rx_mode
      stmmac: fix concurrency in eee initialization.
      stmmac: fix atomicity in pm routines

Govindarajulu Varadarajan (2):
      enic: handle error condition properly in enic_rq_indicate_buf
      enic: update desc properly in rx_copybreak

Gregory Fong (1):
      bridge: include in6.h in if_bridge.h for struct in6_addr

Guenter Roeck (1):
      netfilter: nft_reject_bridge: Fix powerpc build error

Hariprasad Shenai (3):
      cxgb4vf: Move fl_starv_thres into adapter->sge data structure
      cxgb4/cxgb4vf: For T5 use Packing and Padding Boundaries for SGE DMA transfers
      cxgb4vf: FL Starvation Threshold needs to be larger than the SGE's Egress Congestion Threshold

Herbert Xu (4):
      tun: Fix csum_start with VLAN acceleration
      tun: Fix TUN_PKT_STRIP setting
      macvtap: Fix csum_start when VLAN tags are present
      lib: rhashtable - Remove weird non-ASCII characters from comments

Hiroaki SHIMODA (1):
      netlink: Properly unbind in error conditions.

Iyappan Subramanian (3):
      dtb: xgene: fix: Backward compatibility with older firmware
      drivers: net: xgene: Backward compatibility with older firmware
      drivers: net: xgene: fix: Use separate resources

Jesse Gross (3):
      geneve: Set GSO type on transmit.
      geneve: Unregister pernet subsys on module unload.
      udptunnel: Add SKB_GSO_UDP_TUNNEL during gro_complete.

Johannes Berg (2):
      mac80211: properly flush delayed scan work on interface removal
      mac80211: fix use-after-free in defragmentation

John W. Linville (2):
      Merge tag 'iwlwifi-for-john-2014-11-03' of git://git.kernel.org/.../iwlwifi/iwlwifi-fixes
      Merge tag 'mac80211-for-john-2014-11-04' of git://git.kernel.org/.../jberg/mac80211

Junjie Mao (1):
      mac80211_hwsim: release driver when ieee80211_register_hw fails

Karl Beldan (1):
      net: mv643xx_eth: reclaim TX skbs only when released by the HW

Linus Walleij (1):
      smc91x: retrieve IRQ and trigger flags in a modern way

Loganaden Velvindron (1):
      net: Add missing descriptions for fwmark_reflect for ipv4 and ipv6.

Lothar Waßmann (1):
      net: fec: fix regression on i.MX28 introduced by rx_copybreak support

Luciano Coelho (2):
      mac80211: use secondary channel offset IE also beacons during CSA
      mac80211: schedule the actual switch of the station before CSA count 0

Manish Chopra (1):
      netxen: Fix link event handling.

Marcelo Leitner (2):
      tcp: zero retrans_stamp if all retrans were acked
      vxlan: Do not reuse sockets for a different address family

Mugunthan V N (1):
      drivers: net: cpsw: remove cpsw_ale_stop from cpsw_ale_destroy

Nimrod Andy (1):
      net: fec: fix suspend broken on multiple MACs sillicons

Or Gerlitz (1):
      net/mlx4_en: Advertize encapsulation offloads features only when VXLAN tunnel is set

Rasmus Villemoes (1):
      include/linux/socket.h: Fix comment

Richard Cochran (1):
      net: ptp: fix time stamp matching logic for VLAN packets.

Ryo Munakata (1):
      net/9p: remove a comment about pref member which doesn't exist

Shuah Khan (1):
      selftests/net: psock_fanout seg faults in sock_fanout_read_ring()

Stefan Wahren (1):
      net: qualcomm: Fix dependency

Steffen Klassert (4):
      ip6_tunnel: Use ip6_tnl_dev_init as the ndo_init function.
      vti6: Use vti6_dev_init as the ndo_init function.
      sit: Use ipip6_tunnel_init as the ndo_init function.
      gre6: Move the setting of dev->iflink into the ndo_init functions.

Sudip Mukherjee (1):
      net: mvpp2: fix possible memory leak

Takashi Iwai (1):
      net: ppp: Don't call bpf_prog_create() in ppp_lock

stephen hemminger (1):
      uapi: add missing network related headers to kbuild

 Documentation/networking/ip-sysctl.txt               |  14 ++++++++
 arch/arm64/boot/dts/apm-storm.dtsi                   |  10 +++---
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c       |  18 +++++++++-
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.h       |   4 +++
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c     |  11 +++---
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h     |   5 ++-
 drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c    |   7 +++-
 drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c    |   7 +++-
 drivers/net/ethernet/broadcom/bcmsysport.c           |  13 +++++--
 drivers/net/ethernet/broadcom/genet/bcmgenet.c       |  11 +++++-
 drivers/net/ethernet/broadcom/genet/bcmgenet.h       |   3 +-
 drivers/net/ethernet/broadcom/genet/bcmmii.c         |   9 ++---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c       |  31 +++++++++++------
 drivers/net/ethernet/chelsio/cxgb4/sge.c             |  30 ++++++++++++++--
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c           |  51 ++++++++++++++++++++++++----
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h         |  10 ++++++
 drivers/net/ethernet/chelsio/cxgb4vf/adapter.h       |   8 +++++
 drivers/net/ethernet/chelsio/cxgb4vf/sge.c           | 136 ++++++++++++++++++++++++++++++++++++++++++++++++-------------------------
 drivers/net/ethernet/chelsio/cxgb4vf/t4vf_common.h   |   2 ++
 drivers/net/ethernet/chelsio/cxgb4vf/t4vf_hw.c       |  28 ++++++++++++++-
 drivers/net/ethernet/cisco/enic/enic_main.c          |  20 +++++------
 drivers/net/ethernet/freescale/fec_main.c            |  39 ++++++++++++++-------
 drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c         |   4 +--
 drivers/net/ethernet/marvell/mv643xx_eth.c           |  18 +++++-----
 drivers/net/ethernet/marvell/mvpp2.c                 |  27 ++++++++++-----
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c       |  22 +++++++-----
 drivers/net/ethernet/mellanox/mlx5/core/eq.c         |   7 ++--
 drivers/net/ethernet/mellanox/mlx5/core/main.c       |   4 +--
 drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c |   3 +-
 drivers/net/ethernet/qualcomm/Kconfig                |   3 +-
 drivers/net/ethernet/sfc/ef10.c                      |   3 +-
 drivers/net/ethernet/smsc/smc91x.c                   |  20 ++++++-----
 drivers/net/ethernet/smsc/smsc911x.c                 |  61 +++++++++++++++++++++++++++------
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c    |  52 +++++++++++++++-------------
 drivers/net/ethernet/sun/sunhme.c                    |  62 ++++++++++++++++++++++++++++++---
 drivers/net/ethernet/ti/cpsw_ale.c                   |   1 -
 drivers/net/ethernet/ti/cpts.c                       |   2 +-
 drivers/net/macvtap.c                                |   2 ++
 drivers/net/phy/dp83640.c                            |   4 +--
 drivers/net/phy/phy.c                                |  36 +++++++++++++-------
 drivers/net/ppp/ppp_generic.c                        |  40 +++++++++++-----------
 drivers/net/tun.c                                    |  28 +++++++++------
 drivers/net/usb/asix_devices.c                       |  14 +-------
 drivers/net/vxlan.c                                  |  31 +++++++++++------
 drivers/net/wireless/iwlwifi/mvm/fw.c                |  10 +++++-
 drivers/net/wireless/iwlwifi/mvm/mac80211.c          |   1 +
 drivers/net/wireless/iwlwifi/mvm/mvm.h               |   1 +
 drivers/net/wireless/iwlwifi/mvm/ops.c               |  12 ++++++-
 drivers/net/wireless/iwlwifi/pcie/trans.c            |   4 +--
 drivers/net/wireless/mac80211_hwsim.c                |   4 ++-
 include/linux/socket.h                               |   2 +-
 include/net/9p/transport.h                           |   1 -
 include/net/udp_tunnel.h                             |   9 +++++
 include/uapi/linux/Kbuild                            |   4 +++
 include/uapi/linux/if_bridge.h                       |   1 +
 lib/rhashtable.c                                     |  10 +++---
 net/bridge/netfilter/nft_reject_bridge.c             |   1 +
 net/dsa/slave.c                                      |   7 ++--
 net/ipv4/fou.c                                       |   2 ++
 net/ipv4/geneve.c                                    |   3 ++
 net/ipv4/ip_sockglue.c                               |   2 +-
 net/ipv4/tcp_input.c                                 |  60 ++++++++++++++++----------------
 net/ipv6/ip6_gre.c                                   |   5 +--
 net/ipv6/ip6_tunnel.c                                |  10 +-----
 net/ipv6/ip6_vti.c                                   |  11 +-----
 net/ipv6/sit.c                                       |  15 ++++----
 net/mac80211/ibss.c                                  |   2 +-
 net/mac80211/ieee80211_i.h                           |   3 +-
 net/mac80211/iface.c                                 |  18 ++++++----
 net/mac80211/mesh.c                                  |   2 +-
 net/mac80211/mlme.c                                  |   5 +--
 net/mac80211/rx.c                                    |  14 ++++----
 net/mac80211/spectmgmt.c                             |  18 ++++------
 net/netlink/af_netlink.c                             |   5 +--
 net/sctp/auth.c                                      |   2 --
 net/sctp/sm_make_chunk.c                             |   3 ++
 tools/testing/selftests/net/psock_fanout.c           |   2 +-
 77 files changed, 784 insertions(+), 376 deletions(-)

^ permalink raw reply

* Re: [PATCH 16/16] rxrpc: Replace smp_read_barrier_depends() with lockless_dereference()
From: David Howells @ 2014-11-13 20:47 UTC (permalink / raw)
  To: Pranith Kumar
  Cc: dhowells, David S. Miller, Dan Carpenter,
	open list:NETWORKING [GENERAL], open list, paulmck
In-Reply-To: <1415906662-4576-17-git-send-email-bobby.prani@gmail.com>

Pranith Kumar <bobby.prani@gmail.com> wrote:

> Recently lockless_dereference() was added which can be used in place of
> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Actually, the use of smp_read_barrier_depends() is wrong in circular
buffering.  See Documentation/circular-buffers.txt
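
For reference, a minimal sketch of the acquire/release pairing that a
single-producer, single-consumer circular buffer needs, written with C11
atomics rather than the kernel's own primitives.  This is an illustration
under stated assumptions, not the rxrpc code: struct ring, ring_put(),
ring_get() and RING_SIZE are hypothetical names, and the free-running
index scheme is just one possible layout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u	/* hypothetical capacity; must be a power of two */
static_assert((RING_SIZE & (RING_SIZE - 1u)) == 0,
	      "RING_SIZE must be a power of two");

struct ring {
	int buf[RING_SIZE];
	atomic_size_t head;	/* next slot to write; stored only by the producer */
	atomic_size_t tail;	/* next slot to read; stored only by the consumer */
};

/* Producer side: returns false when the ring is full. */
static bool ring_put(struct ring *r, int item)
{
	size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	/* Acquire pairs with the consumer's release store of tail, so the
	 * slot about to be overwritten has already been read. */
	size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail == RING_SIZE)
		return false;
	r->buf[head & (RING_SIZE - 1u)] = item;
	/* Release: the item is published before the new head becomes visible. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_get(struct ring *r, int *item)
{
	size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	/* Acquire pairs with the producer's release store of head, so the
	 * item read below is the one the producer wrote. */
	size_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (head == tail)
		return false;
	*item = r->buf[tail & (RING_SIZE - 1u)];
	/* Release: the read of the slot completes before the slot is freed. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return true;
}
```

The release store of head pairs with the acquire load in ring_get(), and
likewise for tail in the other direction.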

David

^ permalink raw reply

* Re: linux-next: manual merge of the net-next tree with the net tree
From: David Miller @ 2014-11-13 21:14 UTC (permalink / raw)
  To: sfr; +Cc: netdev, linux-next, linux-kernel, hariprasad, alexander.h.duyck
In-Reply-To: <20141113113555.15117365@canb.auug.org.au>

From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Thu, 13 Nov 2014 11:35:55 +1100

> Today's linux-next merge of the net-next tree got a conflict in
> drivers/net/ethernet/chelsio/cxgb4vf/sge.c between commit 65f6ecc93e7c
> ("cxgb4vf: Move fl_starv_thres into adapter->sge data structure") from
> the net tree and commit aa9cd31c3f3e ("cxgb4/cxgb4vf: Replace
> __skb_alloc_page with __dev_alloc_page") from the net-next tree.
> 
> I fixed it up (see below) and can carry the fix as necessary (no action
> is required).

This merge resolution looks perfect.

^ permalink raw reply

* Re: [PATCH] net: skb_fclone_busy() needs to detect orphaned skb
From: Eric Dumazet @ 2014-11-13 21:20 UTC (permalink / raw)
  To: Luis Henriques; +Cc: David Miller, netdev, Neal Cardwell, Joseph Salisbury
In-Reply-To: <20141113191502.GC7095@hercules>

On Thu, 2014-11-13 at 19:15 +0000, Luis Henriques wrote:
> Hi Eric,
> 
> On Thu, Oct 30, 2014 at 10:32:34AM -0700, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> > 
> > Some drivers are unable to perform TX completions in a bound time.
> > They instead call skb_orphan()
> > 
> > Problem is skb_fclone_busy() has to detect this case, otherwise
> > we block TCP retransmits and can freeze unlucky tcp sessions on
> > mostly idle hosts.
> > 
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Fixes: 1f3279ae0c13 ("tcp: avoid retransmits of TCP packets hanging in host queues")
> > ---
> >  This is a stable candidate.
> >  This problem is known to hurt users of linux-3.16 kernels used by guests kernels.
> >  David, I can provide backports if you want.
> >  Thanks !
> > 
> 
> We got a bug report[0] where a backport for 3.16 was provided.  Since
> I couldn't find the original backport post, I'm not sure who's the
> actual author.  Could you please confirm if this backport is correct?
> (I'm copying the patch below).
> 
> [0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1390604
> 
> Cheers,
> --
> Luís
> 
> 

Sure! I did indeed provide this patch; I am 'The Google engineer'
mentioned in this bug report ;)

Signed-off-by: Eric Dumazet <edumazet@google.com>


Thanks !

> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 4e4932b5079b..a8794367cd20 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2082,7 +2082,8 @@ static bool skb_still_in_host_queue(const struct sock *sk,
>  	const struct sk_buff *fclone = skb + 1;
>  
>  	if (unlikely(skb->fclone == SKB_FCLONE_ORIG &&
> -		     fclone->fclone == SKB_FCLONE_CLONE)) {
> +		     fclone->fclone == SKB_FCLONE_CLONE &&
> +		     fclone->sk == sk)) {
>  		NET_INC_STATS_BH(sock_net(sk),
>  				 LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES);
>  		return true;
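
The one-line change in the patch above can be modeled in user space as
follows.  This is only a sketch: the struct definitions are stripped-down
stand-ins for the kernel types, and still_in_host_queue() is a
hypothetical name for the predicate inside skb_still_in_host_queue().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types: only the fields the check reads. */
enum fclone_state { SKB_FCLONE_UNAVAILABLE, SKB_FCLONE_ORIG, SKB_FCLONE_CLONE };

struct sock { int id; };

struct sk_buff {
	enum fclone_state fclone;
	const struct sock *sk;	/* owning socket; NULL once the skb is orphaned */
};

/*
 * skb must point at the first element of an {original, clone} pair,
 * mirroring the "skb + 1" layout used by the quoted diff.
 */
static bool still_in_host_queue(const struct sock *sk,
				const struct sk_buff *skb)
{
	const struct sk_buff *fclone = skb + 1;

	assert(sk != NULL && skb != NULL);
	return skb->fclone == SKB_FCLONE_ORIG &&
	       fclone->fclone == SKB_FCLONE_CLONE &&
	       fclone->sk == sk;	/* the comparison the backport adds */
}
```

With the extra comparison, a clone whose sk no longer matches (for
example because a driver called skb_orphan(), which clears skb->sk) is
no longer reported as still sitting in a host queue, so the retransmit
is not suppressed.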

^ permalink raw reply

* Fw: [Bug 88161] New: High traffic causes a lot of softirqs
From: Stephen Hemminger @ 2014-11-13 21:21 UTC (permalink / raw)
  To: netdev



Begin forwarded message:

Date: Thu, 13 Nov 2014 06:18:28 -0800
From: "bugzilla-daemon@bugzilla.kernel.org" <bugzilla-daemon@bugzilla.kernel.org>
To: "stephen@networkplumber.org" <stephen@networkplumber.org>
Subject: [Bug 88161] New: High traffic causes a lot of softirqs


https://bugzilla.kernel.org/show_bug.cgi?id=88161

            Bug ID: 88161
           Summary: High traffic causes a lot of softirqs
           Product: Networking
           Version: 2.5
    Kernel Version: 3.17.2
          Hardware: Intel
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: Other
          Assignee: shemminger@linux-foundation.org
          Reporter: mike@zcentric.com
        Regression: No

I'm using packaged rpms from centos and elrepo with the same results, and I
can replicate this on any server in our cluster.

I have tried installing

kernel-3.10.56-11.el6.centos.alt.x86_64

from the centosplus repo to solve a problem where 2.6 was locking up the
process tree under high CPU. That fixed the lockups, but it introduced
another issue: we see a lot of softirqs when under heavy traffic load.

I am also currently running

[root@web125-east.domain.com /var/www/html]# uname -a
Linux web125-east.domain.com 3.17.2-1.el6.elrepo.x86_64 #1 SMP Fri Oct 31 10:37:44 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux

Here is a powertop from a 2.6 series server

Summary: 42492.1 wakeups/second, 0.0 GPU ops/seconds, 0.0 VFS ops/sec and
2422.0% CPU use

       Usage   Events/s   Category    Description
  22613 ms/s    23637.4   Process     php-fpm: pool www
  716.9 ms/s    15783.2   Process     nginx: worker process
   21.3 ms/s     1096.1   Process     /usr/bin/java -Xms200m -Xmx2000m -Xss256k -XX:MaxDirectMemorySize=516m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Dage
    5.8 ms/s      674.4   Process     /usr/sbin/gmond
  130.0 ms/s      494.5   Process     /usr/bin/redis-server 127.0.0.1:6379
   73.2 ms/s      487.4   Process     python /usr/bin/statsd-relay.py
    3.8 ms/s       82.7   Process     java -Xmx6g -server -Dfile.encoding=utf-8 -XX:OnOutOfMemoryError=kill -9 %p -XX:+HeapDumpOnOutOfMemoryError -XX:HeapD
  212.4 ms/s       0.00   Interrupt   [3] net_rx(softirq)


Here it is from 3.10


       Usage   Events/s   Category    Description
   10.2 ms/s     1033.6   Timer       hrtimer_wakeup
    3.3 ms/s      932.7   Process     /usr/bin/java -Xms200m -Xmx2000m -Xss256k
  591.1 ms/s      624.3   Process     php-fpm: pool www
   41.5 ms/s      724.0   Interrupt   [3] net_rx(softirq)

Load pretty much just keeps climbing, up into the 500s.

There also is a lot of CPU usage from

  116 root 20 0 0 0 0 R 75.0 0.0 0:04.57 kworker/u66:0

Which, from my understanding, handles a lot of the ACPI calls that softirq is
doing.

I've tried many other 3.x kernels above 3.10 with the same results, so I'm
wondering if this is a known issue.
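
One way to put a number on "a lot of softirqs" independently of powertop
is to sample /proc/softirqs twice and subtract the NET_RX totals.  The
helper below is a hedged sketch of the per-row arithmetic only;
softirq_row_total() is a hypothetical name, and reading the file and
timing the two samples are left to the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sum the per-CPU counters on one /proc/softirqs row, for example
 * "      NET_RX:       1200       3400".
 * Returns -1 if the row does not belong to the softirq called `name`.
 */
static long long softirq_row_total(const char *row, const char *name)
{
	long long total = 0;
	size_t len;
	char *end;

	assert(row != NULL && name != NULL);
	len = strlen(name);

	/* Row labels are right-aligned, so skip the leading padding. */
	while (*row == ' ' || *row == '\t')
		row++;
	if (strncmp(row, name, len) != 0 || row[len] != ':')
		return -1;
	row += len + 1;

	/* strtoull() skips whitespace; stop once no more digits are found. */
	for (;;) {
		unsigned long long v = strtoull(row, &end, 10);

		if (end == row)
			break;
		total += (long long)v;
		row = end;
	}
	return total;
}
```

Calling this on the NET_RX row of two snapshots taken a known interval
apart and dividing the difference by that interval gives a NET_RX rate
that can be compared between the 2.6 and 3.x kernels.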

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply

