Netdev List
 help / color / mirror / Atom feed
* [PATCH] net: ethernet: ti: cpsw: adjust cpsw fifos depth for fullduplex flow control
From: Grygorii Strashko @ 2017-05-08 19:21 UTC (permalink / raw)
  To: David S. Miller, netdev, Sekhar Nori, Ivan Khoronzhuk
  Cc: linux-kernel, linux-omap, Grygorii Strashko, Schuyler Patton

When users set flow control using ethtool the bits are set properly in the
CPGMAC_SL MACCONTROL register, but the FIFO depth in the respective Port n
Maximum FIFO Blocks (Pn_MAX_BLKS) registers remains set to the minimum size
reset value. When receive flow control is enabled on a port, the port's
associated FIFO block allocation must be adjusted. The port RX allocation
must increase to accommodate the flow control runout. The TRM recommends
numbers of 5 or 6.

Hence, apply required Port FIFO configuration to
Pn_MAX_BLKS.Pn_TX_MAX_BLKS=0xF and Pn_MAX_BLKS.Pn_RX_MAX_BLKS=0x5 during
interface initialization.

Cc: Schuyler Patton <spatton@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
---
 drivers/net/ethernet/ti/cpsw.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 20fd14c..7cf9c79 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -287,6 +287,10 @@ struct cpsw_ss_regs {
 /* Bit definitions for the CPSW1_TS_SEQ_LTYPE register */
 #define CPSW_V1_SEQ_ID_OFS_SHIFT	16
 
+#define CPSW_MAX_BLKS_TX		15
+#define CPSW_MAX_BLKS_TX_SHIFT		4
+#define CPSW_MAX_BLKS_RX		5
+
 struct cpsw_host_regs {
 	u32	max_blks;
 	u32	blk_cnt;
@@ -1278,11 +1282,23 @@ static void cpsw_slave_open(struct cpsw_slave *slave, struct cpsw_priv *priv)
 	switch (cpsw->version) {
 	case CPSW_VERSION_1:
 		slave_write(slave, TX_PRIORITY_MAPPING, CPSW1_TX_PRI_MAP);
+		/* Increase RX FIFO size to 5 for supporting fullduplex
+		 * flow control mode
+		 */
+		slave_write(slave,
+			    (CPSW_MAX_BLKS_TX << CPSW_MAX_BLKS_TX_SHIFT) |
+			    CPSW_MAX_BLKS_RX, CPSW1_MAX_BLKS);
 		break;
 	case CPSW_VERSION_2:
 	case CPSW_VERSION_3:
 	case CPSW_VERSION_4:
 		slave_write(slave, TX_PRIORITY_MAPPING, CPSW2_TX_PRI_MAP);
+		/* Increase RX FIFO size to 5 for supporting fullduplex
+		 * flow control mode
+		 */
+		slave_write(slave,
+			    (CPSW_MAX_BLKS_TX << CPSW_MAX_BLKS_TX_SHIFT) |
+			    CPSW_MAX_BLKS_RX, CPSW2_MAX_BLKS);
 		break;
 	}
 
-- 
2.10.1

^ permalink raw reply related

* Re: [PATCH v1 0/4] stmmac: pci: Fix crash on Intel Galileo Gen2
From: David Miller @ 2017-05-08 19:15 UTC (permalink / raw)
  To: andriy.shevchenko; +Cc: Joao.Pinto, jan.kiszka, netdev, peppe.cavallaro
In-Reply-To: <20170508141422.39612-1-andriy.shevchenko@linux.intel.com>

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Mon,  8 May 2017 17:14:18 +0300

> Due to misconfiguration of PCI driver for Intel Quark the user will get
> a kernel crash:
> 
> # udhcpc -i eth0
> udhcpc: started, v1.26.2
> stmmaceth 0000:00:14.6 eth0: device MAC address 98:4f:ee:05:ac:47
> Generic PHY stmmac-a6:01: attached PHY driver [Generic PHY] (mii_bus:phy_addr=stmmac-a6:01, irq=-1)
> stmmaceth 0000:00:14.6 eth0: IEEE 1588-2008 Advanced Timestamp supported
> stmmaceth 0000:00:14.6 eth0: registered PTP clock
> IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
> 
> udhcpc: sending discover
> 
> stmmaceth 0000:00:14.6 eth0: Link is Up - 100Mbps/Full - flow control off
> IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: stmmac_xmit+0xf1/0x1080
> 
> Fix this by adding necessary settings.
> 
> P.S. I split fix to three patches according to what each of them adds.

Looks good, series applied, thanks.

^ permalink raw reply

* Re: [PATCH v2 net] vti: check nla_put_* return value
From: David Miller @ 2017-05-08 19:14 UTC (permalink / raw)
  To: liuhangbin; +Cc: netdev
In-Reply-To: <1494237433-18964-1-git-send-email-liuhangbin@gmail.com>

From: Hangbin Liu <liuhangbin@gmail.com>
Date: Mon,  8 May 2017 17:57:13 +0800

> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH RFC] net: Fix inconsistent teardown and release of private netdev state.
From: Stephen Hemminger @ 2017-05-08 19:13 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20170508.125243.1637135196896174683.davem@davemloft.net>

On Mon, 08 May 2017 12:52:43 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:

> Network devices can allocate reasources and private memory using
> netdev_ops->ndo_init().  However, the release of these resources
> can occur in one of two different places.
> 
> Either netdev_ops->ndo_uninit() or netdev->destructor().
> 
> The decision of which operation frees the resources depends upon
> whether it is necessary for all netdev refs to be released before it
> is safe to perform the freeing.
> 
> netdev_ops->ndo_uninit() presumably can occur right after the
> NETDEV_UNREGISTER notifier completes and the unicast and multicast
> address lists are flushed.
> 
> netdev->destructor(), on the other hand, does not run until the
> netdev references all go away.
> 
> Further complicating the situation is that netdev->destructor()
> almost universally does also a free_netdev().
> 
> This creates a problem for the logic in register_netdevice().
> Because all callers of register_netdevice() manage the freeing
> of the netdev, and invoke free_netdev(dev) if register_netdevice()
> fails.
> 
> If netdev_ops->ndo_init() succeeds, but something else fails inside
> of register_netdevice(), it does call ndo_ops->ndo_uninit().  But
> it is not able to invoke netdev->destructor().
> 
> This is because netdev->destructor() will do a free_netdev() and
> then the caller of register_netdevice() will do the same.
> 
> However, this means that the resources that would normally be released
> by netdev->destructor() will not be.
> 
> Over the years drivers have added local hacks to deal with this, by
> invoking their destructor parts by hand when register_netdevice()
> fails.
> 
> Many drivers do not try to deal with this, and instead we have leaks.
> 
> Let's close this hole by formalizing the distinction between what
> private things need to be freed up by netdev->destructor() and whether
> the driver needs unregister_netdevice() to perform the free_netdev().
> 
> netdev->priv_destructor() performs all actions to free up the private
> resources that used to be freed by netdev->destructor(), except for
> free_netdev().
> 
> netdev->needs_free_netdev is a boolean that indicates whether
> free_netdev() should be done at the end of unregister_netdevice().
> 
> Now, register_netdevice() can sanely release all resources after
> ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
> and netdev->priv_destructor().
> 
> And at the end of unregister_netdevice(), we invoke
> netdev->priv_destructor() and optionally call free_netdev().
> 
> Signed-off-by: David S. Miller <davem@davemloft.net>
> 

Thanks for addressing a confusing and messy semantic. If you merge
this then it would be good to update the documentation in Doucmentation/networking
to describe how drivers are supposed to behave.

The original semantics here evolved from a desire not to have to re-engineer
lots of device drivers. And your change does the same; ie. assume that there
is a reason the device needs special treatment on unregister.

An alternative and more difficult way of addressing the problem would
be to rework the drivers that define destructors to work with the simpler
model used by all the other drivers. What makes these drivers special,
and can they be fixed instead. Probably this won't work and would be too
much effort on lots of little used devices.

^ permalink raw reply

* Re: [PATCH] net: ethernet: stmmac: properly set PS bit in MII configurations during reset
From: Thomas Petazzoni @ 2017-05-08 19:12 UTC (permalink / raw)
  To: Giuseppe CAVALLARO; +Cc: Corentin Labbe, Alexandre Torgue, netdev, stable
In-Reply-To: <49297ed2-84dc-3e9b-8f40-af915476a091@st.com>

Hello,

On Mon, 8 May 2017 16:28:21 +0200, Giuseppe CAVALLARO wrote:

> > I just see that GMAC_CONTROL and MAC_CTRL_REG are the same, so why not create a custom adjust_link for each dwmac type ?
> > This will permit to call it instead of set_ps() and remove lots of if (has_gmac) and co in stmmac_adjust_link()
> > Basicly replace all between "ctrl = readl()... and writel(ctrl)" by a sot of priv->hw->mac->adjust_link()
> >
> > It will also help a lot for my dwmac-sun8i inclusion (since I add some if has_sun8i:))  
> 
> Corentin, I think this is a good idea and maybe necessary now that the 
> driver is supporting a lot of chips.
> In the past it was sufficient to have a adjust link function and a 
> stmmac_hw_fix_mac_speed
> to invoke dedicated hook shared between MAC10/100 and GMAC inside STM 
> platforms.
> 
> Thomas, I wonder if you could take a look at the 
> priv->plat->fix_mac_speed. This can be used
> for setting internal registers too.

Once again, this is not called at the right time to fix the issue I'm
seeing with a MII PHY. I need to adjust the PS bit between asserting the
reset and polling for the reset bit to clear.

->fix_mac_speed() is called in the adjust_link() call-back, which is
called way too late.

Please, read again my patch and the description of the problem that I
have sent. But basically, any solution that does not allow to set the
PS bit between asserting the DMA reset bit and polling for it to clear
will not work for MII PHYs.

Best regards,

Thomas Petazzoni
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply

* Re: [PATCH net] bpf: don't let ldimm64 leak map addresses on unprivileged
From: David Miller @ 2017-05-08 19:08 UTC (permalink / raw)
  To: daniel; +Cc: alexei.starovoitov, jannh, kafai, netdev
In-Reply-To: <793c517a7d163c613ab886eb02d32efea9f902fd.1494194233.git.daniel@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>
Date: Mon,  8 May 2017 00:04:09 +0200

> The patch fixes two things at once:
> 
> 1) It checks the env->allow_ptr_leaks and only prints the map address to
>    the log if we have the privileges to do so, otherwise it just dumps 0
>    as we would when kptr_restrict is enabled on %pK. Given the latter is
>    off by default and not every distro sets it, I don't want to rely on
>    this, hence the 0 by default for unprivileged.
> 
> 2) Printing of ldimm64 in the verifier log is currently broken in that
>    we don't print the full immediate, but only the 32 bit part of the
>    first insn part for ldimm64. Thus, fix this up as well; it's okay to
>    access, since we verified all ldimm64 earlier already (including just
>    constants) through replace_map_fd_with_map_ptr().
> 
> Fixes: cbd357008604 ("bpf: verifier (add ability to receive verification log)")
> Reported-by: Jann Horn <jannh@google.com>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Applied, with second Fixes: tag added, and queued up for -stable.

Thanks.

^ permalink raw reply

* Re: [PATCH 0/4] TI Bluetooth serdev support
From: Sebastian Reichel @ 2017-05-08 19:07 UTC (permalink / raw)
  To: Adam Ford
  Cc: Rob Herring, Marcel Holtmann, linux-bluetooth, Mark Rutland,
	devicetree, Johan Hedberg, Gustavo Padovan, Satish Patel, Wei Xu,
	Eyal Reizer, netdev, linux-arm-kernel
In-Reply-To: <CAHCN7x+Mq=b6RN-ezU0400W=H=D7DR+Es1FHtT4GyozpVd8ALA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3458 bytes --]

Hi,

On Fri, May 05, 2017 at 09:51:33AM -0500, Adam Ford wrote:
> On Sun, Apr 30, 2017 at 11:04 AM, Sebastian Reichel <sre@kernel.org> wrote:
> > On Sun, Apr 30, 2017 at 10:14:20AM -0500, Adam Ford wrote:
> >> On Wed, Apr 5, 2017 at 1:30 PM, Rob Herring <robh@kernel.org> wrote:
> >> > This series adds serdev support to the HCI LL protocol used on TI BT
> >> > modules and enables support on HiKey board with with the WL1835 module.
> >> > With this the custom TI UIM daemon and btattach are no longer needed.
> >>
> >> Without UIM daemon, what instruction do you use to load the BT firmware?
> >>
> >> I was thinking 'hciattach' but I was having trouble.  I was hoping you
> >> might have some insight.
> >>
> >>  hciattach -t 30 -s 115200 /dev/ttymxc1 texas 3000000 flow  Just
> >> returns a timeout.
> >>
> >> I modified my i.MX6 device tree per the binding documentation and
> >> setup the regulators and enable GPIO pins.
> >
> > If you configured everything correctly no userspace interaction is
> > required. The driver should request the firmware automatically once
> > you power up the bluetooth device.
> >
> > Apart from DT changes make sure, that the following options are
> > enabled and check dmesg for any hints.
> >
> > CONFIG_SERIAL_DEV_BUS
> > CONFIG_SERIAL_DEV_CTRL_TTYPORT
> > CONFIG_BT_HCIUART
> > CONFIG_BT_HCIUART_LL
> 
> I have enabled those flags, and I have updated my device tree.
> I am testing this on an OMAP3630 (DM3730) board with a WL1283.  I am
> getting a lot of timeout errors.  I tested this against the original
> implemention I had in pdata-quirks.c using the ti-st driver, uim & and
> the btwilink driver.
> 
> I pulled in some of the newer patches to enable the wl1283-st, but I
> am obviously missing something.
> 
> I   58.717651] Bluetooth: hci0: Reading TI version information failed
> (-110)
> [   58.724853] Bluetooth: hci0: download firmware failed, retrying...
> [   60.957641] Bluetooth: hci0 command 0x1001 tx timeout
> [   68.957641] Bluetooth: hci0: Reading TI version information failed
> (-110)
> [   68.964843] Bluetooth: hci0: download firmware failed, retrying...
> [   69.132171] Bluetooth: Unknown HCI packet type 06
> [   69.138244] Bluetooth: Unknown HCI packet type 0c
> [   69.143249] Bluetooth: Unknown HCI packet type 40
> [   69.148498] Bluetooth: Unknown HCI packet type 20
> [   69.153533] Bluetooth: Data length is too large
> [   69.158569] Bluetooth: Unknown HCI packet type a0
> [   69.163574] Bluetooth: Unknown HCI packet type 00
> [   69.168731] Bluetooth: Unknown HCI packet type 00
> [   69.173736] Bluetooth: Unknown HCI packet type 34
> [   69.178924] Bluetooth: Unknown HCI packet type 91
> [   71.197631] Bluetooth: hci0 command 0x1001 tx timeout
> [   79.197662] Bluetooth: hci0: Reading TI version information failed (-110)
>
> Since the pdata-quirks and original ti-st drivers work together, I
> know the hardware is fine.  The only change to the device tree is the
> addition of the Bluetooth container:
> 
> bluetooth {
>   compatible = "ti,wl1283-st";
>   enable-gpios = <&gpio6 2 GPIO_ACTIVE_HIGH>;
> };
> 
> Any thoughts or suggestions to try? I get similar behavior on an
> i.MX6 board with a wl1837-st module as well.

Looks like its loosing bytes. I suggest to have a look at
pinmuxing (esp. for cts & rts) and maybe add some debug
prints to see where it starts failing.

-- Sebastian

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH net v2] Check for skb_put_padto return value
From: David Miller @ 2017-05-08 19:03 UTC (permalink / raw)
  To: mail; +Cc: netdev
In-Reply-To: <0102015be3e8a656-6cd75d31-ac07-4255-80d9-3c481165dfaa-000000@eu-west-1.amazonses.com>

From: Peter Heise <mail@pheise.de>
Date: Sun, 7 May 2017 17:15:26 +0000

> Return value of skb_put_padto is now checked as
> reported by Dan Carpenter. skb might be freed in
> case of error in skb_put_padto.
> ---
>  net/hsr/hsr_device.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c
> index c73160fb11e7..a1545d09a3c9 100644
> --- a/net/hsr/hsr_device.c
> +++ b/net/hsr/hsr_device.c
> @@ -314,7 +314,8 @@ static void send_hsr_supervision_frame(struct hsr_port *master,
>  	hsr_sp = (typeof(hsr_sp)) skb_put(skb, sizeof(struct hsr_sup_payload));
>  	ether_addr_copy(hsr_sp->MacAddressA, master->dev->dev_addr);
>  
> -	skb_put_padto(skb, ETH_ZLEN + HSR_HLEN);
> +	if (skb_put_padto(skb, ETH_ZLEN + HSR_HLEN))
> +		return;
>  
>  	hsr_forward_skb(skb, master);
>  	return;

You're adding a more serious bug than the one you are fixing.

Now, if skb_put_padto() fails, we leak the SKB.

^ permalink raw reply

* Re: [PATCH] yam: use memdup_user
From: David Miller @ 2017-05-08 19:02 UTC (permalink / raw)
  To: geliangtang; +Cc: jpr, linux-hams, netdev, linux-kernel
In-Reply-To: <bd8ca4a87fc128e5260c13960423f025e1b42c5c.1493783298.git.geliangtang@gmail.com>

From: Geliang Tang <geliangtang@gmail.com>
Date: Sat,  6 May 2017 23:42:22 +0800

> Use memdup_user() helper instead of open-coding to simplify the code.
> 
> Signed-off-by: Geliang Tang <geliangtang@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH] net/hippi/rrunner: use memdup_user
From: David Miller @ 2017-05-08 19:02 UTC (permalink / raw)
  To: geliangtang; +Cc: jes, linux-hippi, netdev, linux-kernel
In-Reply-To: <58733b05cfe6248e30471a59ad1faee7f2f71280.1493780932.git.geliangtang@gmail.com>

From: Geliang Tang <geliangtang@gmail.com>
Date: Sat,  6 May 2017 23:42:16 +0800

> Use memdup_user() helper instead of open-coding to simplify the code.
> 
> Signed-off-by: Geliang Tang <geliangtang@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH net] cxgb4: avoid disabling FEC by default
From: David Miller @ 2017-05-08 19:01 UTC (permalink / raw)
  To: ganeshgr; +Cc: netdev, nirranjan, indranil
In-Reply-To: <1494060906-6453-1-git-send-email-ganeshgr@chelsio.com>

From: Ganesh Goudar <ganeshgr@chelsio.com>
Date: Sat,  6 May 2017 14:25:06 +0530

> Recent Chelsio firmware started using few port capablity bits to
> manage FEC and as driver was not aware of FEC changes those bits
> were zeroed, consequently disabling FEC.
> 
> Avoid zeroing those bits and default to whatever the firmware
> tells us the Link is currently advertising.
> 
> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>

Applied.

^ permalink raw reply

* Re: [PATCH] net: dsa: loop: Check for memory allocation failure
From: David Miller @ 2017-05-08 19:01 UTC (permalink / raw)
  To: christophe.jaillet
  Cc: andrew, vivien.didelot, f.fainelli, netdev, linux-kernel,
	kernel-janitors
In-Reply-To: <20170506052945.2639-1-christophe.jaillet@wanadoo.fr>

From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date: Sat,  6 May 2017 07:29:45 +0200

> If 'devm_kzalloc' fails, a NULL pointer will be dereferenced.
> Return -ENOMEM instead, as done for some other memory allocation just a
> few lines above.
> 
> Fixes: 98cd1552ea27 ("net: dsa: Mock-up driver")
> 
> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>

Please do not separate "Fixes: " tags from signoffs and acks with
empty lines in the future, thank you.

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH] bonding: make multi if nla_put_failure check into one
From: David Miller @ 2017-05-08 18:57 UTC (permalink / raw)
  To: liuhangbin; +Cc: netdev
In-Reply-To: <1494040672-13934-1-git-send-email-liuhangbin@gmail.com>

From: Hangbin Liu <liuhangbin@gmail.com>
Date: Sat,  6 May 2017 11:17:52 +0800

> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>

Changes like this neither make the code better nor worse, so please
avoid this kind of churn.

I'm not applying this.

Thank you.

^ permalink raw reply

* Re: [PATCH] bonding: check nla_put_be32 return value
From: David Miller @ 2017-05-08 18:57 UTC (permalink / raw)
  To: liuhangbin; +Cc: netdev
In-Reply-To: <1494040626-13862-1-git-send-email-liuhangbin@gmail.com>

From: Hangbin Liu <liuhangbin@gmail.com>
Date: Sat,  6 May 2017 11:17:06 +0800

> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH] qlge: Avoid reading past end of buffer
From: David Miller @ 2017-05-08 18:42 UTC (permalink / raw)
  To: keescook
  Cc: netdev, harish.patil, manish.chopra, Dept-GELinuxNICDev,
	linux-kernel, danielmicay
In-Reply-To: <20170505223434.GA19245@beast>

From: Kees Cook <keescook@chromium.org>
Date: Fri, 5 May 2017 15:34:34 -0700

> Using memcpy() from a string that is shorter than the length copied means
> the destination buffer is being filled with arbitrary data from the kernel
> rodata segment. Instead, use strncpy() which will fill the trailing bytes
> with zeros.
> 
> This was found with the future CONFIG_FORTIFY_SOURCE feature.
> 
> Cc: Daniel Micay <danielmicay@gmail.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>

Applied.

^ permalink raw reply

* Re: [PATCH] bna: ethtool: Avoid reading past end of buffer
From: David Miller @ 2017-05-08 18:42 UTC (permalink / raw)
  To: keescook
  Cc: netdev, rasesh.mody, sudarsana.kalluru, linux-kernel,
	Dept-GELinuxNICDev, danielmicay
In-Reply-To: <20170505223023.GA17972@beast>

From: Kees Cook <keescook@chromium.org>
Date: Fri, 5 May 2017 15:30:23 -0700

> Using memcpy() from a string that is shorter than the length copied means
> the destination buffer is being filled with arbitrary data from the kernel
> rodata segment. Instead, use strncpy() which will fill the trailing bytes
> with zeros.
> 
> This was found with the future CONFIG_FORTIFY_SOURCE feature.
> 
> Cc: Daniel Micay <danielmicay@gmail.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>

Applied.

^ permalink raw reply

* Re: [PATCH] bna: Avoid reading past end of buffer
From: David Miller @ 2017-05-08 18:42 UTC (permalink / raw)
  To: keescook
  Cc: netdev, rasesh.mody, sudarsana.kalluru, Dept-GELinuxNICDev,
	linux-kernel, danielmicay
In-Reply-To: <20170505222532.GA16005@beast>

From: Kees Cook <keescook@chromium.org>
Date: Fri, 5 May 2017 15:25:32 -0700

> Using memcpy() from a string that is shorter than the length copied means
> the destination buffer is being filled with arbitrary data from the kernel
> rodata segment. Instead, use strncpy() which will fill the trailing bytes
> with zeros.
> 
> This was found with the future CONFIG_FORTIFY_SOURCE feature.
> 
> Cc: Daniel Micay <danielmicay@gmail.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>

Applied.

^ permalink raw reply

* Re: [PATCH v2] vlan: Keep NETIF_F_HW_CSUM similar to other software devices
From: David Miller @ 2017-05-08 18:39 UTC (permalink / raw)
  To: vyasevich; +Cc: netdev, mkubecek, alexander.duyck, avagin, vyasevic
In-Reply-To: <1494015461-12192-1-git-send-email-vyasevic@redhat.com>

From: Vladislav Yasevich <vyasevich@gmail.com>
Date: Fri,  5 May 2017 16:17:41 -0400

> Vlan devices, like all other software devices, enable
> NETIF_F_HW_CSUM feature.  However, unlike all the othe other
> software devices, vlans will switch to using IP|IPV6_CSUM
> features, if the underlying devices uses them.  In these situations,
> checksum offload features on the vlan device can't be controlled
> via ethtool.
> 
> This patch makes vlans keep HW_CSUM feature if the underlying
> device supports checksum offloading.  This makes vlan devices
> behave like other software devices, and restores control to the
> user.
> 
> A side-effect is that some offload settings (typically UFO)
> may be enabled on the vlan device while being disabled on the HW.
> However, the GSO code will correctly process the packets. This
> actually results in slightly better raw throughput.
> 
> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
> ---
> V2:  posted the right patch.

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net] tcp: make congestion control optionally skip slow start after idle
From: David Miller @ 2017-05-08 18:37 UTC (permalink / raw)
  To: weiwan; +Cc: netdev, ycheng, ncardwell, edumazet
In-Reply-To: <20170505195323.124792-1-tracywwnj@gmail.com>

From: Wei Wang <weiwan@google.com>
Date: Fri,  5 May 2017 12:53:23 -0700

> From: Wei Wang <weiwan@google.com>
> 
> Congestion control modules that want full control over congestion
> control behavior do not want the cwnd modifications controlled by
> the sysctl_tcp_slow_start_after_idle code path.
> So skip those code paths for CC modules that use the cong_control()
> API.
> As an example, those cwnd effects are not desired for the BBR congestion
> control algorithm.
> 
> Fixes: c0402760f565 ("tcp: new CC hook to set sending rate with rate_sample in any CA state")
> Signed-off-by: Wei Wang <weiwan@google.com>
> Signed-off-by: Yuchung Cheng <ycheng@google.com>
> Signed-off-by: Neal Cardwell <ncardwell@google.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [Patch net] ipv4: restore rt->fi for reference counting
From: David Miller @ 2017-05-08 18:35 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, andreyknvl, edumazet
In-Reply-To: <1493934857-6693-1-git-send-email-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Thu,  4 May 2017 14:54:17 -0700

> IPv4 dst could use fi->fib_metrics to store metrics but fib_info
> itself is refcnt'ed, so without taking a refcnt fi and
> fi->fib_metrics could be freed while dst metrics still points to
> it. This triggers use-after-free as reported by Andrey twice.
> 
> This patch reverts commit 2860583fe840 ("ipv4: Kill rt->fi") to
> restore this reference counting. It is a quick fix for -net and
> -stable, for -net-next, as Eric suggested, we can consider doing
> reference counting for metrics itself instead of relying on fib_info.
> 
> IPv6 is very different, it copies or steals the metrics from mx6_config
> in fib6_commit_metrics() so probably doesn't need a refcnt.
> 
> Decnet has already done the refcnt'ing, see dn_fib_semantic_match().
> 
> Fixes: 2860583fe840 ("ipv4: Kill rt->fi")
> Reported-by: Andrey Konovalov <andreyknvl@google.com>
> Tested-by: Andrey Konovalov <andreyknvl@google.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: FEC on i.MX 7 transmit queue timeout
From: Stefan Agner @ 2017-05-08 18:22 UTC (permalink / raw)
  To: Andy Duan, Andrew Lunn; +Cc: festevam, netdev, netdev-owner
In-Reply-To: <AM4PR0401MB2260D9D293EE8A37B93554F6FFEE0@AM4PR0401MB2260.eurprd04.prod.outlook.com>

On 2017-05-07 19:13, Andy Duan wrote:
> From: Andrew Lunn <andrew@lunn.ch> Sent: Friday, May 05, 2017 8:24 PM
>>To: Andy Duan <fugang.duan@nxp.com>
>>Cc: Stefan Agner <stefan@agner.ch>; festevam@gmail.com;
>>netdev@vger.kernel.org; netdev-owner@vger.kernel.org
>>Subject: Re: FEC on i.MX 7 transmit queue timeout
>>
>>> No, it is not workaround. As i said, quque1 and queue2 are for AVB
>>> paths have higher priority in transmition.
>>
>>Does this higher priority result in the low priority queue being starved? Is that
>>why the timer goes off? What happens when somebody does use AVB. Are
>>we back to the same problem? This is what seems to make is sounds like a
>>work around, not a fix.
>>
>>     Andrew
> Yes, queue0 may be blocked by queue1 and queue2, then the queue0
> watchdog time maybe triggered.
> If somebody use AVB quque1 and queue2, the remaining bandwidth is for
> queue0, for example, in 100Mbps system, quque1 cost 50Mbps bandwidth
> and queue2 cost 50Mbps bandwidth for audio and video streaming, then
> queue0 (best effort) has 0 bandwidth that limit user case cannot have 
> asynchronous frames (IP(tcp/udp)) on networking. Of course these is
> extreme case.
> In essentially,  asynchronous frames (IP) go queue0 for the original
> design. To do these just implement .ndo_select_queue() callback in
> driver like fsl tree.

I guess you refer to this commit?

http://git.freescale.com/git/cgit.cgi/imx/linux-imx.git/commit/?h=imx_4.1.15_2.0.0_ga&id=b0d8fa989651baf847287f6888f4d7b723e2a207

It seems that by default there is a Credit-based scheme enabled
(ENETx_QOS[TX_SCHEME] = 000). The driver enables the queue1/2 and
assigns it each 50% of the bandwidth (IDLE_SLOPE_1/2 is set to 0x200,
which according to the register description of IDLE_SLOPE means BW
fraction of 0.5!). This actually violates even the note in register
ENETx_DMAnCFG:

"NOTE: For AVB applications, the BW fraction of class 1 and class 2
combined must not exceed .75."

Instead of using the credit based scheme we could switch to round robin,
but not sure if that is what we want.

What is the default criteria to select queues when .ndo_select_queue is
not provided? I guess it tries to balance individual streams/processes
for better SMP performance?

^ permalink raw reply

* Re: [PATCH net-next 9/9] ipvlan: introduce individual MAC addresses
From: Dan Williams @ 2017-05-08 18:13 UTC (permalink / raw)
  To: Chiappero, Marco, Jiri Benc
  Cc: netdev@vger.kernel.org, David S . Miller, Kirsher, Jeffrey T,
	Duyck, Alexander H, Grandhi, Sainath, Mahesh Bandewar
In-Reply-To: <695CDAEB5A6CE2498DE413F2A9AA829609161324@IRSMSX101.ger.corp.intel.com>

On Mon, 2017-05-08 at 15:29 +0000, Chiappero, Marco wrote:
> > -----Original Message-----
> > From: Jiri Benc [mailto:jbenc@redhat.com]
> > Sent: Thursday, May 4, 2017 5:44 PM
> > To: Chiappero, Marco <marco.chiappero@intel.com>
> > Cc: Dan Williams <dcbw@redhat.com>; netdev@vger.kernel.org; David S
> > .
> > Miller <davem@davemloft.net>; Kirsher, Jeffrey T
> > <jeffrey.t.kirsher@intel.com>; Duyck, Alexander H
> > <alexander.h.duyck@intel.com>; Grandhi, Sainath
> > <sainath.grandhi@intel.com>; Mahesh Bandewar <maheshb@google.com>
> > Subject: Re: [PATCH net-next 9/9] ipvlan: introduce individual MAC
> > addresses
> > 
> > On Thu, 4 May 2017 09:37:00 +0000, Chiappero, Marco wrote:
> > > This looks conceptually wrong. Yes, ipvlan works at L3 (which is
> > > an
> > > implementation detail anyway), but slaves are Ethernet interfaces
> > > and
> > > should behave as much as possible as such regardless, with an
> > > individual MAC address assigned.
> > 
> > Isn't the proper fix then converting ipvlan interfaces to be L3
> > only interfaces?
> > I.e., ARPHRD_NONE? There's not much ipvlan can do with arbitrary
> > Ethernet
> > frames anyway. Of course, a flag to switch to the new behavior
> > would be
> > needed in order to preserve backwards compatibility.
> 
> Yes, L3 only interfaces would be a valid solution and a major
> differentiation with respect to macvlan. In fact it's been considered
> but abandoned because:
> - it would be a break from the past
> - VMs and containers usually expect Ethernet devices
> 
> Having that said, I'm ok with this solution if deemed preferable.
>  
> > This patchset looks very wrong. For proper support of multiple MAC
> > addresses,
> > we have macvlan and it's pointless to add that to ipvlan.
> 
> Ipvlan will never have proper support for L2, it's a L3 thing and
> doesn't aim at replacing macvlan. So, the options are:
> 1) remove it and provide L3 only interfaces - as you are proposing
> 2) fully emulate it preserving the single unicast MAC rule - my
> proposal
> 
> I'm open to both solutions, to me the important thing is to address
> the problems with the current implementation, one way or another,
> making it comply with basic networking concepts and expectations.
> 
> > And doing some kind of weird MAC NAT in ipvlan just to satisfy
> > broken tools that
> > can't cope with multiple interfaces with the same MAC address is
> > wrong, too.
> 
> Ipvlan has always had the MAC issue, regardless, these tools simply
> make it more apparent. And as I said already, whether they are broken

If we're talking about the slaves being unable to change when the
parent MAC changes, everyone agrees that was a bug.

If we are talking about the "all slaves have the same MAC" then that's
not an "issue", that's the way it's designed.  Whether that design is
the best design possible is debatable, but that's the way it currently
is.

> is debatable (yet I have to read a reasonable motivation). At the 

Existing tools are already broken for bond slaves and a couple existing
drivers, see below.

Note that making the MACs unique would break DHCP functionality,
because now the MAC address encoded in the 'chaddr' field of the DHCP
packet would no longer correspond to the MAC address of the device that
DHCP replies should be received on.  The userspace process writes the
'chaddr' field, and the userspace process would see the unique (and
incorrect) MAC address.

Misrepresenting the MAC address of the device, as this would do, is
simply not going to work.  The tools that expect the unique MAC
addresses are already broken.

> very least their expectation to have unique addresses on the same
> broadcast domain is hardly arguable. Should ipvlan considered
> special? Again, questionable.

bond slaves
drivers/net/ethernet/toshiba/ps3_gelic_net.c
drivers/s390/net/qeth_l3_main.c

already all have the same MAC address for different netdevices.  Yeah,
not many people have PS3s anymore, but s390 qeth is fairly widely used.
 Just pointing out that ipvlan is hardly the first device to have
encountered this issue, or to have solved it this way.  qeth does
pretty much the same thing as ipvlan.

I'm not arguing that ipvlan is perfect.  I'm just arguing that in its
current form (eg, virtualized L2 device) making this change is
incorrect.

Dan

> > Those tools are already broken anyway, there's nothing preventing
> > anyone to
> > set the same MAC address to multiple interfaces.
> 
> What are you trying to demonstrate, that you can shoot yourself in
> the foot? That you can interfere with the controller? That you can
> intentionally break your network? Yes, of course you can, so what?
> As long as network components comply with expected behaviors,
> including the ability to change the MAC address, every orchestrator
> will do its job of managing the network and will do it correctly.
> 
> > I suppose those tools don't work with bonding and bridge, either?
> 
> NaaS software and SDN controllers are built around bridges, a bridge
> is a well-defined, coherent, fundamental network component. As soon
> as ipvlan will behave coherently as well, upper layers will handle it
> properly.
> By the way, none of them has the same limitations of ipvlan.
>  
> > > So, either we fix this by forcing slaves to stay in sync with
> > > master,
> > 
> > Yes, that's the correct behavior. Well, at least as correct as one
> > can get with the
> > ipvlan broken design of pretending that an interface is L2 when in
> > fact, it is not.
> 
> Eventually we got there:  "ipvlan broken design". Let's fix it then.
> Would moving to L3 only interfaces be accepted instead?
> --------------------------------------------------------------
> Intel Research and Development Ireland Limited
> Registered in Ireland
> Registered Office: Collinstown Industrial Park, Leixlip, County
> Kildare
> Registered Number: 308263
> 
> 
> This e-mail and any attachments may contain confidential material for
> the sole
> use of the intended recipient(s). Any review or distribution by
> others is
> strictly prohibited. If you are not the intended recipient, please
> contact the
> sender and delete all copies.
> 

^ permalink raw reply

* Re: [PATCH] net: dsa: loop: Check for memory allocation failure
From: Joe Perches @ 2017-05-08 17:44 UTC (permalink / raw)
  To: Julia Lawall, David Laight
  Cc: 'Christophe JAILLET', andrew@lunn.ch,
	vivien.didelot@savoirfairelinux.com, f.fainelli@gmail.com,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-janitors@vger.kernel.org
In-Reply-To: <alpine.DEB.2.20.1705082031280.3440@hadrien>

On Mon, 2017-05-08 at 20:32 +0800, Julia Lawall wrote:
> 
> On Mon, 8 May 2017, David Laight wrote:
> 
> > From: Christophe JAILLET
> > > Sent: 06 May 2017 06:30
> > > If 'devm_kzalloc' fails, a NULL pointer will be dereferenced.
> > > Return -ENOMEM instead, as done for some other memory allocation just a
> > > few lines above.
> > 
> > ...
> > > --- a/drivers/net/dsa/dsa_loop.c
> > > +++ b/drivers/net/dsa/dsa_loop.c
> > > @@ -256,6 +256,9 @@ static int dsa_loop_drv_probe(struct mdio_device *mdiodev)
> > >  		return -ENOMEM;
> > > 
> > >  	ps = devm_kzalloc(&mdiodev->dev, sizeof(*ps), GFP_KERNEL);
> > > +	if (!ps)
> > > +		return -ENOMEM;
> > > +
> > >  	ps->netdev = dev_get_by_name(&init_net, pdata->netdev);
> > >  	if (!ps->netdev)
> > >  		return -EPROBE_DEFER;
> > 
> > On the face if it this code leaks like a sieve.
> 
> I don't think so.  The allocations (dsa_switch_alloc and devm_kzalloc) use
> devm functions.

It's at least wasteful.

Each time -EPROBE_DEFER occurs, another set of calls to
dsa_switch_alloc and dev_kzalloc also occurs.

Perhaps it'd be better to do:

	if (ps->netdev) {
		devm_kfree(&devmdev->dev, ps);
		devm_kfree(&mdiodev->dev, ds);
		return -EPROBE_DEFER;
	}

^ permalink raw reply

* Re: sky2: Use seq_putc() in sky2_debug_show()
From: SF Markus Elfring @ 2017-05-08 17:42 UTC (permalink / raw)
  To: Lino Sanfilippo, netdev
  Cc: Mirko Lindner, Stephen Hemminger, LKML, kernel-janitors
In-Reply-To: <8594c152-68a4-c0c7-b4c6-49c753b0c748@gmx.de>

> Which issue do you mean? I dont see any issue you fix here.

Are the run time characteristics a bit nicer for the function “seq_putc”
in comparison to the function “seq_puts” for printing a single line break here?

Regards,
Markus

^ permalink raw reply

* Re: [PATCH RFC net-next 0/6] net: reducing memory footprint of network devices
From: Florian Fainelli @ 2017-05-08 17:35 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: roopa, nicolas.dichtel
In-Reply-To: <20170506160734.47084-1-dsahern@gmail.com>

On 05/06/2017 09:07 AM, David Ahern wrote:
> As I have mentioned many times[1], at ~43+kB per instance the use of
> net_devices does not scale for deployments needing 10,000+ devices. At
> netconf 1.2 there was a discussion about using a net_device_common for
> the minimal set of common attributes with other structs built on top of
> that one for "full" devices. It provided a means for the code to know
> "non-standard" net_devices. Conceptually, that approach has its merits
> but it is not practical given the sweeping changes required to the code
> base. More importantly though struct net_device is not the problem; it
> weighs in at less than 2kB so reorganizing the code base around a
> refactored net_device is not going to solve the problem. The primary
> issue is all of the initializations done *because* it is a struct
> net_device -- kobject and sysfs and the protocols (e.g., ipv4, ipv6,
> mpls, neighbors).
> 
> So, how do you keep the desired attributes of a net device -- network
> addresses, xmit function, qdisc, netfilter rules, tcpdump -- while
> lowering the overhead of a net_device instance and without sweeping
> changes across net/ and drivers/net/?
> 
> This patch set introduces the concept of labeling net_devices as
> "lightweight", first mentioned at netdev 1.1 [1]. Users have to opt
> in to lightweight devices by passing a new attribute, IFLA_LWT_NETDEV,
> in the new link request. This lightweight tag is meant for virtual
> devices such as vlan, vrf, vti, and dummy where the user expects to
> create a lot of them and does not want the duplication of resources.
> Each device type can always opt out of a lightweight label if necessary
> by failing device creates.
> 
> Labeling a virtual device as "lightweight" reduces the footprint for
> device creation from ~43kB to ~6kB. That reduction in memory is obtained
> by:
> 1. no entry in sysfs
>    - kobject in net_device.device is not initialized
> 
> 2. no entry in procfs
>    - no sysctl option for these devices
> 
> 3. deferred ipv4, ipv6, mpls initialization
>    - network layer must be enabled before an address can be assigned
>      or mpls labels can be processed
>    - enables what Florian called L2 only devices [2]
> 
> Once the core premise of a lightweight device is accepted, follow on
> patches can reduce the overhead of network initializations. e.g.,
> 
> 1. remove devconf per device (ipv4 and ipv6)
>    - lightweight devices use the default settings rather than replicate
>      the same data for each device
> 
> 2. reduce / remove / opt out of snmp mibs
>    - snmp6_alloc_dev and icmpv6msg_mib_device specifically is a heavy
>      hitter
> 
> Patches can also be found here:
>     https://github.com/dsahern/linux lwt-dev-rfc
> 
> And iproute2 here:
>     https://github.com/dsahern/iproute2 lwt-dev
> 
> Example:
>     ip li add foo lwd type vrf table 123
> 
> - creates VRF device 'foo' as a lightweight netdevice.

This is really looking nice, thanks for posting this patch series! The
only submission wide comment I have is that the flag is named
IFF_LWT_NETDEV whereas the helper that checks for it is named
netif_is_lwd() so we should reconcile the two. Since there is an
existing lightweight tunnel infrastructure already, maybe using
IFF_LWD_NETDEV (or just IFF_LWD) would be good enough here?

> 
> 
> [1] http://www.netdevconf.org/1.1/proceedings/slides/ahern-aleksandrov-prabhu-scaling-network-cumulus.pdf
> [2] https://www.spinics.net/lists/netdev/msg340808.html
> David Ahern (6):
>   net: Add accessor for kboject in a net_device
>   net: Add flags argument to alloc_netdev_mqs
>   net: Introduce IFF_LWT_NETDEV flag
>   net: Do not intialize kobject for lightweight netdevs
>   net: Delay initializations for lightweight devices
>   net: add uapi for creating lightweight devices
> 
>  drivers/net/ethernet/mellanox/mlx5/core/ipoib.c |  2 +-
>  drivers/net/ethernet/tile/tilegx.c              |  2 +-
>  drivers/net/tun.c                               |  2 +-
>  drivers/net/wireless/marvell/mwifiex/cfg80211.c |  2 +-
>  include/linux/netdevice.h                       | 27 ++++++++--
>  include/uapi/linux/if_link.h                    |  1 +
>  net/batman-adv/sysfs.c                          | 13 ++++-
>  net/bridge/br_if.c                              | 12 +++--
>  net/bridge/br_sysfs_br.c                        | 17 +++---
>  net/bridge/br_sysfs_if.c                        |  8 ++-
>  net/core/dev.c                                  | 71 ++++++++++++++++++-------
>  net/core/neighbour.c                            |  3 ++
>  net/core/net-sysfs.c                            | 25 ++++++---
>  net/core/rtnetlink.c                            | 10 +++-
>  net/ethernet/eth.c                              |  2 +-
>  net/ipv4/devinet.c                              | 18 ++++++-
>  net/ipv6/addrconf.c                             |  9 ++++
>  net/mac80211/iface.c                            |  2 +-
>  net/mpls/af_mpls.c                              |  6 +++
>  net/wireless/core.c                             | 15 ++++--
>  20 files changed, 190 insertions(+), 57 deletions(-)
> 


-- 
Florian

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox