Netdev List


* Re: [PATCH] selftests: net: add config fragments
From: Shannon Nelson @ 2018-06-20  1:42 UTC (permalink / raw)
  To: Anders Roxell, davem, shuah, fw; +Cc: netdev, linux-kselftest, linux-kernel
In-Reply-To: <20180619164111.30785-1-anders.roxell@linaro.org>

On 6/19/2018 9:41 AM, Anders Roxell wrote:
> Add fragments to pass bridge and vlan tests.
> 
> Fixes: 33b01b7b4f19 ("selftests: add rtnetlink test script")
> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
> ---
> 
> Hi,
> 
> net/rtnetlink.sh still fails on tc htb hierarchy, addrlabel and ipsec:
> Error: Specified qdisc not found.
> RTNETLINK answers: No such file or directory
> Error: Parent Qdisc doesn't exists.
> We have an error talking to the kernel, -1
> Error: Parent Qdisc doesn't exists.
> We have an error talking to the kernel, -1
> Error: Parent Qdisc doesn't exists.
> We have an error talking to the kernel, -1
> Error: Parent Qdisc doesn't exists.
> We have an error talking to the kernel, -1
> Error: Parent Qdisc doesn't exists.
> We have an error talking to the kernel, -1
> Error: Parent Qdisc doesn't exists.
> We have an error talking to the kernel, -1
> Error: Invalid handle.
> FAIL: tc htb hierarchy
> 
> FAIL: ipv6 addrlabel
> 
> FAIL: can't add fou port 7777, skipping test
> RTNETLINK answers: Operation not supported
> FAIL: can't add macsec interface, skipping test
> RTNETLINK answers: Protocol not supported
> RTNETLINK answers: No such process
> RTNETLINK answers: No such process
> ./rtnetlink.sh: line 527:  5356 Terminated              ip x m > $tmpfile
> FAIL: ipsec
> 
> 
> I'm using iproute2 tag: 4.17 and tried the qdisc command from the
> function kci_test_tc in net/rtnetlink.sh:
> $ tc qdisc add dev lo root handle 1: htb
> Error: Specified qdisc not found.
> 
> For kci_test_addrlabel it fails on this row:
> ip addrlabel list |grep -q "prefix dead::/64 dev lo label 1"
> 
> Any idea why these three fail?


The "Terminated" line is there because "ip x m" had been put into the 
background, and at the end of the ipsec test it is killed.  I can try to 
play some games with exec and redirection to make that go away.

The "FAIL: ipsec" is partly because the test isn't smart enough to look 
to see if there is any offload actually available to test.  I'm working 
on a patch to netdevsim to add the ipsec-offload in order to have a 
better test.  And yes, this should say "ipsec-offload", not "ipsec".

I don't know about the qdisc or addrlabel issues.

Cheers,
sln


> 
> Cheers,
> Anders
> 
>   tools/testing/selftests/net/config | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/tools/testing/selftests/net/config b/tools/testing/selftests/net/config
> index 7ba089b33e8b..cd3a2f1545b5 100644
> --- a/tools/testing/selftests/net/config
> +++ b/tools/testing/selftests/net/config
> @@ -12,3 +12,5 @@ CONFIG_NET_IPVTI=y
>   CONFIG_INET6_XFRM_MODE_TUNNEL=y
>   CONFIG_IPV6_VTI=y
>   CONFIG_DUMMY=y
> +CONFIG_BRIDGE=y
> +CONFIG_VLAN_8021Q=y
> 

^ permalink raw reply

* [PATCH net] net: sungem: fix rx checksum support
From: Eric Dumazet @ 2018-06-20  2:18 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Meelis Roos, Mathieu Malaterre, Andreas Schwab,
	Eric Dumazet, Eric Dumazet

After commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
are friends"), sungem owners reported the infamous "eth0: hw csum failure"
message.

CHECKSUM_COMPLETE has in fact never worked for this driver, but this
was masked by the fact that upper stacks had to strip the FCS, and
therefore skb->ip_summed was set back to CHECKSUM_NONE before
my recent change.

Driver configures a number of bytes to skip when the chip computes
the checksum, and for some reason only half of the Ethernet header
was skipped.

Then a second problem is that we should strip the FCS by default,
unless the driver is updated to eventually support NETIF_F_RXFCS in
the future.

Finally, a driver should check if NETIF_F_RXCSUM feature is enabled
or not, so that the admin can turn off rx checksum if wanted.

Many thanks to Andreas Schwab and Mathieu Malaterre for their
help in debugging this issue.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Mathieu Malaterre <malat@debian.org>
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Tested-by: Andreas Schwab <schwab@linux-m68k.org>
---
 drivers/net/ethernet/sun/sungem.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..b9221fc1674dfa0ef17a43f8ff86d700a1ae514f 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -60,8 +60,7 @@
 #include <linux/sungem_phy.h>
 #include "sungem.h"
 
-/* Stripping FCS is causing problems, disabled for now */
-#undef STRIP_FCS
+#define STRIP_FCS
 
 #define DEFAULT_MSG	(NETIF_MSG_DRV		| \
 			 NETIF_MSG_PROBE	| \
@@ -435,7 +434,7 @@ static int gem_rxmac_reset(struct gem *gp)
 	writel(desc_dma & 0xffffffff, gp->regs + RXDMA_DBLOW);
 	writel(RX_RING_SIZE - 4, gp->regs + RXDMA_KICK);
 	val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
-	       ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
+	       (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
 	writel(val, gp->regs + RXDMA_CFG);
 	if (readl(gp->regs + GREG_BIFCFG) & GREG_BIFCFG_M66EN)
 		writel(((5 & RXDMA_BLANK_IPKTS) |
@@ -760,7 +759,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
 	struct net_device *dev = gp->dev;
 	int entry, drops, work_done = 0;
 	u32 done;
-	__sum16 csum;
 
 	if (netif_msg_rx_status(gp))
 		printk(KERN_DEBUG "%s: rx interrupt, done: %d, rx_new: %d\n",
@@ -855,9 +853,13 @@ static int gem_rx(struct gem *gp, int work_to_do)
 			skb = copy_skb;
 		}
 
-		csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
-		skb->csum = csum_unfold(csum);
-		skb->ip_summed = CHECKSUM_COMPLETE;
+		if (likely(dev->features & NETIF_F_RXCSUM)) {
+			__sum16 csum;
+
+			csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
+			skb->csum = csum_unfold(csum);
+			skb->ip_summed = CHECKSUM_COMPLETE;
+		}
 		skb->protocol = eth_type_trans(skb, gp->dev);
 
 		napi_gro_receive(&gp->napi, skb);
@@ -1761,7 +1763,7 @@ static void gem_init_dma(struct gem *gp)
 	writel(0, gp->regs + TXDMA_KICK);
 
 	val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
-	       ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
+	       (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
 	writel(val, gp->regs + RXDMA_CFG);
 
 	writel(desc_dma >> 32, gp->regs + RXDMA_DBHI);
@@ -2985,8 +2987,8 @@ static int gem_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	pci_set_drvdata(pdev, dev);
 
 	/* We can do scatter/gather and HW checksum */
-	dev->hw_features = NETIF_F_SG | NETIF_F_HW_CSUM;
-	dev->features |= dev->hw_features | NETIF_F_RXCSUM;
+	dev->hw_features = NETIF_F_SG | NETIF_F_HW_CSUM | NETIF_F_RXCSUM;
+	dev->features = dev->hw_features;
 	if (pci_using_dac)
 		dev->features |= NETIF_F_HIGHDMA;
 
-- 
2.18.0.rc1.244.gcf134e6275-goog
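
For readers following along, the effect of a wrong checksum start offset can be illustrated in userspace. The sketch below is not driver code: csum_from() is a hypothetical helper that computes a 16-bit one's-complement sum starting at a given byte offset, modelling the commit message's description of the chip skipping a configured number of bytes. Starting at 14 / 2 = 7 instead of ETH_HLEN = 14 folds half of the Ethernet header into the sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN 14	/* Ethernet header length in bytes */

/*
 * Hypothetical helper (not part of sungem): 16-bit one's-complement
 * sum over buf[start..len), pairing bytes big-endian from 'start'.
 */
static uint16_t csum_from(const uint8_t *buf, size_t len, size_t start)
{
	uint32_t sum = 0;
	size_t i;

	assert(buf != NULL && start <= len);

	for (i = start; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)			/* odd trailing byte, zero padded */
		sum += (uint32_t)buf[i] << 8;

	while (sum >> 16)		/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}
```

For a frame whose header bytes are non-zero, csum_from(frame, len, 7) and csum_from(frame, len, ETH_HLEN) will generally differ, which matches the kind of mismatch that would surface as "hw csum failure" once the stack trusts CHECKSUM_COMPLETE.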

^ permalink raw reply related

* Re: [PATCH v2] net: ethernet: stmmac: dwmac-rk: Add GMAC support for PX30
From: David Wu @ 2018-06-20  2:40 UTC (permalink / raw)
  To: Heiko Stübner
  Cc: davem, robh+dt, mark.rutland, huangtao, netdev, linux-arm-kernel,
	linux-rockchip, linux-kernel, 张晴
In-Reply-To: <2582999.2hZx6CH9S6@diego>

Hi Heiko,

On 2018-06-14 16:30, Heiko Stübner wrote:
> On Thursday, 14 June 2018, 10:14:31 CEST, David Wu wrote:
>> Hi Heiko,
>>
>> On 2018-06-14 15:54, Heiko Stübner wrote:
>>> I don't see that new clock documented in the dt-binding.
>>> Also, which clock from the clock-controller does this connect to?
>>
>> The clock is "SCLK_GMAC_RMII" in the clock controller; its rate can be
>> set according to the link speed.
> 
> Hmm, while these huge number of clocks are somewhat strange,
> shouldn't it be named something with _rmii instead of _speed then?

Okay, it is better to be named _speed.

> 
> Also, I don't see any clk_enable action for that new clock, so you could
> end up with it being off?

The new speed clock is the parent of clk_tx_rx, so enabling or disabling
clk_tx_rx also enables or disables the new clock.

> 
> And someone could convert the driver to use the new clk-bulk APIs [0],
> so the large number of clk_prepare_enable calls would be a bit
> trimmed down.
> 
> 
> Heiko
> 
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/clk/clk-bulk.c
> 
> 
> 
> 
> 

^ permalink raw reply

* Re: [PATCH] PCI: allow drivers to limit the number of VFs to 0
From: Jakub Kicinski @ 2018-06-20  2:56 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, netdev, Sathya Perla, Felix Manlunas,
	alexander.duyck, john.fastabend, Jacob Keller, Donald Dutile,
	oss-drivers, Christoph Hellwig
In-Reply-To: <20180619213715.GC33049@bhelgaas-glaptop.roam.corp.google.com>

On Tue, 19 Jun 2018 16:37:15 -0500, Bjorn Helgaas wrote:
> On Fri, May 25, 2018 at 09:02:23AM -0500, Bjorn Helgaas wrote:
> > On Thu, May 24, 2018 at 06:20:15PM -0700, Jakub Kicinski wrote:  
> > > Hi Bjorn!
> > > 
> > > On Thu, 24 May 2018 18:57:48 -0500, Bjorn Helgaas wrote:  
> > > > On Mon, Apr 02, 2018 at 03:46:52PM -0700, Jakub Kicinski wrote:  
> > > > > Some user space depends on enabling sriov_totalvfs number of VFs
> > > > > to not fail, e.g.:
> > > > > 
> > > > > $ cat .../sriov_totalvfs > .../sriov_numvfs
> > > > > 
> > > > > For devices which VF support depends on loaded FW we have the
> > > > > pci_sriov_{g,s}et_totalvfs() API.  However, this API uses 0 as
> > > > > a special "unset" value, meaning drivers can't limit sriov_totalvfs
> > > > > to 0.  Remove the special values completely and simply initialize
> > > > > driver_max_VFs to total_VFs.  Then always use driver_max_VFs.
> > > > > Add a helper for drivers to reset the VF limit back to total.    
> > > > 
> > > > I still can't really make sense out of the changelog.
> > > >
> > > > I think part of the reason it's confusing is because there are two
> > > > things going on:
> > > > 
> > > >   1) You want this:
> > > >   
> > > >        pci_sriov_set_totalvfs(dev, 0);
> > > >        x = pci_sriov_get_totalvfs(dev) 
> > > > 
> > > >      to return 0 instead of total_VFs.  That seems to connect with
> > > >      your subject line.  It means "sriov_totalvfs" in sysfs could be
> > > >      0, but I don't know how that is useful (I'm sure it is; just
> > > >      educate me :))  
> > > 
> > > Let me just quote the bug report that got filed on our internal bug
> > > tracker :)
> > > 
> > >   When testing Juju Openstack with Ubuntu 18.04, enabling SR-IOV causes
> > >   errors because Juju gets the sriov_totalvfs for SR-IOV-capable device
> > >   then tries to set that as the sriov_numvfs parameter.
> > > 
> > >   For SR-IOV incapable FW, the sriov_totalvfs parameter should be 0, 
> > >   but it's set to max.  When FW is switched to flower*, the correct 
> > >   sriov_totalvfs value is presented.
> > > 
> > > * flower is a project name  
> > 
> > From the point of view of the PCI core (which knows nothing about
> > device firmware and relies on the architected config space described
> > by the PCIe spec), this sounds like an erratum: with some firmware
> > installed, the device is not capable of SR-IOV, but still advertises
> > an SR-IOV capability with "TotalVFs > 0".
> > 
> > Regardless of whether that's an erratum, we do allow PF drivers to use
> > pci_sriov_set_totalvfs() to limit the number of VFs that may be
> > enabled by writing to the PF's "sriov_numvfs" sysfs file.
> > 
> > But the current implementation does not allow a PF driver to limit VFs
> > to 0, and that does seem nonsensical.
> >   
> > > My understanding is OpenStack uses sriov_totalvfs to determine how many
> > > VFs can be enabled, looks like this is the code:
> > > 
> > > http://git.openstack.org/cgit/openstack/charm-neutron-openvswitch/tree/hooks/neutron_ovs_utils.py#n464
> > >   
> > > >   2) You're adding the pci_sriov_reset_totalvfs() interface.  I'm not
> > > >      sure what you intend for this.  Is *every* driver supposed to
> > > >      call it in .remove()?  Could/should this be done in the core
> > > >      somehow instead of depending on every driver?  
> > > 
> > > Good question, I was just thinking yesterday we may want to call it
> > > from the core, but I don't think it's strictly necessary nor always
> > > sufficient (we may reload FW without re-probing).
> > > 
> > > We have a device which supports different number of VFs based on the FW
> > > loaded.  Some legacy FWs do not inform the driver how many VFs they can
> > > support, because they support max.  So the flow in our driver is this:
> > > 
> > > load_fw(dev);
> > > ...
> > > max_vfs = ask_fw_for_max_vfs(dev);
> > > if (max_vfs >= 0)
> > > 	return pci_sriov_set_totalvfs(dev, max_vfs);
> > > else /* FW didn't tell us, assume max */
> > > 	return pci_sriov_reset_totalvfs(dev); 
> > > 
> > > We also reset the max on device remove, but that's not strictly
> > > necessary.
> > > 
> > > Other users of pci_sriov_set_totalvfs() always know the value to set
> > > the total to (either always get it from FW or it's a constant).
> > > 
> > > If you prefer we can work out the correct max for those legacy cases in
> > > the driver as well, although it seemed cleaner to just ask the core,
> > > since it already has total_VFs value handy :)
> > >   
> > > > I'm also having a hard time connecting your user-space command example
> > > > with the rest of this.  Maybe it will make more sense to me tomorrow
> > > > after some coffee.  
> > > 
> > > OpenStack assumes it will always be able to set sriov_numvfs to
> > > sriov_totalvfs, see this 'if':
> > > 
> > > http://git.openstack.org/cgit/openstack/charm-neutron-openvswitch/tree/hooks/neutron_ovs_utils.py#n512  
> > 
> > Thanks for educating me.  I think there are two issues here that we
> > can separate.  I extracted the patch below for the first.
> > 
> > The second is the question of resetting driver_max_VFs.  I think we
> > currently have a general issue in the core:
> > 
> >   - load PF driver 1
> >   - driver calls pci_sriov_set_totalvfs() to reduce driver_max_VFs
> >   - unload PF driver 1
> >   - load PF driver 2
> > 
> > Now driver_max_VFs is still stuck at the lower value set by driver 1.
> > I don't think that's the way this should work.
> > 
> > I guess this is partly a consequence of setting driver_max_VFs in
> > sriov_init(), which is called before driver attach and should only
> > depend on hardware characteristics, so it is related to the patch
> > below.  But I think we should fix it in general, not just for
> > netronome.  
> 
> Hi Jakub, the patch below is in v4.18-rc1 as 8d85a7a4f2c9 ("PCI/IOV:
> Allow PF drivers to limit total_VFs to 0").  If there's more we need
> to do here, would you mind rebasing what's left to v4.18-rc1 and
> reposting it?

Hi Bjorn!

Thanks a lot for looking into this!  My understanding is that we have
two ways forward:
 - add a pci_sriov_reset_totalvfs() helper for drivers to call;
 - make the core reset the totalVFs after driver is detached.

IMHO the second option is better.  I went ahead and posted:

https://patchwork.ozlabs.org/patch/931210/

This works very well for nfp driver (modulo minor clean ups but I'd
rather route those via networking trees to avoid conflicts).
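
To make the semantics under discussion concrete, here is a small userspace model. It is only a sketch: struct sriov_model and the model_*() functions are hypothetical stand-ins, not the PCI core API, and the real pci_sriov_set_totalvfs() has additional checks that are omitted here. It contrasts the old getter (0 treated as "unset") with the new one, and shows the second option above, where the limit is reset when the driver detaches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the two fields discussed in this thread. */
struct sriov_model {
	int total_VFs;		/* TotalVFs from the SR-IOV capability */
	int driver_max_VFs;	/* limit imposed by the PF driver */
};

static void model_init(struct sriov_model *s, int total)
{
	assert(s != NULL && total >= 0);
	s->total_VFs = total;
	s->driver_max_VFs = total;	/* start at the hardware limit */
}

/* Driver lowers the limit; returns -1 if out of range (simplified). */
static int model_set_totalvfs(struct sriov_model *s, int n)
{
	if (n < 0 || n > s->total_VFs)
		return -1;
	s->driver_max_VFs = n;
	return 0;
}

/* Old behaviour: 0 means "unset", so a limit of 0 is impossible. */
static int model_get_totalvfs_old(const struct sriov_model *s)
{
	if (s->driver_max_VFs)
		return s->driver_max_VFs;
	return s->total_VFs;
}

/* New behaviour: always respect the driver's limit, even 0. */
static int model_get_totalvfs(const struct sriov_model *s)
{
	return s->driver_max_VFs;
}

/* Option two: the core restores the limit after driver detach. */
static void model_driver_detach(struct sriov_model *s)
{
	s->driver_max_VFs = s->total_VFs;
}
```

In this model, after model_set_totalvfs(&s, 0) the old getter still reports total_VFs while the new one reports 0, and model_driver_detach() means the next driver to bind does not inherit a stale lower limit.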

> > commit 4a338bc6f94b9ad824ac944f5dfc249d6838719c
> > Author: Jakub Kicinski <jakub.kicinski@netronome.com>
> > Date:   Fri May 25 08:18:34 2018 -0500
> > 
> >     PCI/IOV: Allow PF drivers to limit total_VFs to 0
> >     
> >     Some SR-IOV PF drivers implement .sriov_configure(), which allows
> >     user-space to enable VFs by writing the desired number of VFs to the sysfs
> >     "sriov_numvfs" file (see sriov_numvfs_store()).
> >     
> >     The PCI core limits the number of VFs to the TotalVFs advertised by the
> >     device in its SR-IOV capability.  The PF driver can limit the number of VFs
> >     to even fewer (it may have pre-allocated data structures or knowledge of
> >     device limitations) by calling pci_sriov_set_totalvfs(), but previously it
> >     could not limit the VFs to 0.
> >     
> >     Change pci_sriov_get_totalvfs() so it always respects the VF limit imposed
> >     by the PF driver, even if the limit is 0.
> >     
> >     This sequence:
> >     
> >       pci_sriov_set_totalvfs(dev, 0);
> >       x = pci_sriov_get_totalvfs(dev);
> >     
> >     previously set "x" to TotalVFs from the SR-IOV capability.  Now it will set
> >     "x" to 0.
> >     
> >     Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> >     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > 
> > diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> > index 192b82898a38..d0d73dbbd5ca 100644
> > --- a/drivers/pci/iov.c
> > +++ b/drivers/pci/iov.c
> > @@ -469,6 +469,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
> >  	iov->nres = nres;
> >  	iov->ctrl = ctrl;
> >  	iov->total_VFs = total;
> > +	iov->driver_max_VFs = total;
> >  	pci_read_config_word(dev, pos + PCI_SRIOV_VF_DID, &iov->vf_device);
> >  	iov->pgsz = pgsz;
> >  	iov->self = dev;
> > @@ -827,10 +828,7 @@ int pci_sriov_get_totalvfs(struct pci_dev *dev)
> >  	if (!dev->is_physfn)
> >  		return 0;
> >  
> > -	if (dev->sriov->driver_max_VFs)
> > -		return dev->sriov->driver_max_VFs;
> > -
> > -	return dev->sriov->total_VFs;
> > +	return dev->sriov->driver_max_VFs;
> >  }
> >  EXPORT_SYMBOL_GPL(pci_sriov_get_totalvfs);
> >    

^ permalink raw reply

* Re: [PATCH net] ipvlan: call dev_change_flags when reset ipvlan mode
From: Hangbin Liu @ 2018-06-20  3:22 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, Stefano Brivio, Paolo Abeni,
	David Miller, Mahesh Bandewar
In-Reply-To: <CAM_iQpXeuU=Cxons5=centGNJzm0OaNU3Jj5hE91hvJH0o2-Eg@mail.gmail.com>

On Tue, Jun 19, 2018 at 02:10:18PM -0700, Cong Wang wrote:
> On Mon, Jun 18, 2018 at 7:04 AM, Hangbin Liu <liuhangbin@gmail.com> wrote:
> > @@ -94,10 +95,13 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval)
> >                         mdev->l3mdev_ops = NULL;
> >                 }
> >                 list_for_each_entry(ipvlan, &port->ipvlans, pnode) {
> > +                       flags = ipvlan->dev->flags;
> >                         if (nval == IPVLAN_MODE_L3 || nval == IPVLAN_MODE_L3S)
> > -                               ipvlan->dev->flags |= IFF_NOARP;
> > +                               dev_change_flags(ipvlan->dev,
> > +                                                flags | IFF_NOARP);
> >                         else
> > -                               ipvlan->dev->flags &= ~IFF_NOARP;
> > +                               dev_change_flags(ipvlan->dev,
> > +                                                flags & ~IFF_NOARP);
> 
> You need to check the return value of dev_change_flags().

Hi Wang Cong,

The only case where dev_change_flags() returns an error is when we change
the IFF_UP flag. Since we only set/reset IFF_NOARP, do you think we still
need to check the return value?

Thanks
Hangbin

^ permalink raw reply

* [PATCH net-next] tcp: ignore rcv_rtt sample with old ts ecr value
From: Wei Wang @ 2018-06-20  4:42 UTC (permalink / raw)
  To: David Miller, netdev; +Cc: Eric Dumazet, Neal Cardwell, Wei Wang

From: Wei Wang <weiwan@google.com>

When receiving multiple packets with the same ts ecr value, only try
to compute rcv_rtt sample with the earliest received packet.
This is because the rcv_rtt calculated by later received packets
could possibly include long idle time or other types of delay.
For example:
(1) server sends last packet of reply with TS val V1
(2) client ACKs last packet of reply with TS ecr V1
(3) long idle time passes
(4) client sends next request data packet with TS ecr V1 (again!)
At this time, the rcv_rtt computed on server with TS ecr V1 will be
inflated with the idle time and should get ignored.

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/tcp.h  |  1 +
 net/ipv4/tcp.c       |  1 +
 net/ipv4/tcp_input.c | 14 +++++++++++---
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 72705eaf4b84..3dbea6610304 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -350,6 +350,7 @@ struct tcp_sock {
 #endif
 
 /* Receiver side RTT estimation */
+	u32 rcv_rtt_last_tsecr;
 	struct {
 		u32	rtt_us;
 		u32	seq;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 141acd92e58a..47c45d5be9f9 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2563,6 +2563,7 @@ int tcp_disconnect(struct sock *sk, int flags)
 	sk->sk_shutdown = 0;
 	sock_reset_flag(sk, SOCK_DONE);
 	tp->srtt_us = 0;
+	tp->rcv_rtt_last_tsecr = 0;
 	tp->write_seq += tp->max_window + 2;
 	if (tp->write_seq == 0)
 		tp->write_seq = 1;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 355d3dffd021..76ca88f63b70 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -582,9 +582,12 @@ static inline void tcp_rcv_rtt_measure_ts(struct sock *sk,
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 
-	if (tp->rx_opt.rcv_tsecr &&
-	    (TCP_SKB_CB(skb)->end_seq -
-	     TCP_SKB_CB(skb)->seq >= inet_csk(sk)->icsk_ack.rcv_mss)) {
+	if (tp->rx_opt.rcv_tsecr == tp->rcv_rtt_last_tsecr)
+		return;
+	tp->rcv_rtt_last_tsecr = tp->rx_opt.rcv_tsecr;
+
+	if (TCP_SKB_CB(skb)->end_seq -
+	    TCP_SKB_CB(skb)->seq >= inet_csk(sk)->icsk_ack.rcv_mss) {
 		u32 delta = tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr;
 		u32 delta_us;
 
@@ -5475,6 +5478,11 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb)
 				tcp_ack(sk, skb, 0);
 				__kfree_skb(skb);
 				tcp_data_snd_check(sk);
+				/* When receiving pure ack in fast path, update
+				 * last ts ecr directly instead of calling
+				 * tcp_rcv_rtt_measure_ts()
+				 */
+				tp->rcv_rtt_last_tsecr = tp->rx_opt.rcv_tsecr;
 				return;
 			} else { /* Header too small */
 				TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
-- 
2.18.0.rc1.244.gcf134e6275-goog

^ permalink raw reply related

* Re: [PATCH v2] net: dsa: drop some VLAs in switch.c
From: Kees Cook @ 2018-06-20  4:43 UTC (permalink / raw)
  To: Salvatore Mesoraca
  Cc: Andrew Lunn, LKML, Kernel Hardening, Network Development,
	David S. Miller, Florian Fainelli, Vivien Didelot, David Laight
In-Reply-To: <1525706596-13601-1-git-send-email-s.mesoraca16@gmail.com>

On Mon, May 7, 2018 at 8:23 AM, Salvatore Mesoraca
<s.mesoraca16@gmail.com> wrote:
> We avoid 2 VLAs by using a pre-allocated field in dsa_switch.
> We also try to avoid dynamic allocation whenever possible.
>
> Link: http://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
> Link: http://lkml.kernel.org/r/20180505185145.GB32630@lunn.ch
>
> Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>

Friendly ping. What's needed to take this into the tree? It looks like
all the issues in v1 were addressed here.

Thanks!

-Kees

> ---
>  include/net/dsa.h |  3 +++
>  net/dsa/dsa2.c    | 14 ++++++++++++++
>  net/dsa/switch.c  | 22 ++++++++++------------
>  3 files changed, 27 insertions(+), 12 deletions(-)
>
> diff --git a/include/net/dsa.h b/include/net/dsa.h
> index 60fb4ec..576791d 100644
> --- a/include/net/dsa.h
> +++ b/include/net/dsa.h
> @@ -256,6 +256,9 @@ struct dsa_switch {
>         /* Number of switch port queues */
>         unsigned int            num_tx_queues;
>
> +       unsigned long           *bitmap;
> +       unsigned long           _bitmap;
> +
>         /* Dynamically allocated ports, keep last */
>         size_t num_ports;
>         struct dsa_port ports[];
> diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
> index adf50fb..cebf35f0 100644
> --- a/net/dsa/dsa2.c
> +++ b/net/dsa/dsa2.c
> @@ -748,6 +748,20 @@ struct dsa_switch *dsa_switch_alloc(struct device *dev, size_t n)
>         if (!ds)
>                 return NULL;
>
> +       /* We avoid allocating memory outside dsa_switch
> +        * if it is not needed.
> +        */
> +       if (n <= sizeof(ds->_bitmap) * 8) {
> +               ds->bitmap = &ds->_bitmap;
> +       } else {
> +               ds->bitmap = devm_kzalloc(dev,
> +                                         BITS_TO_LONGS(n) *
> +                                               sizeof(unsigned long),
> +                                         GFP_KERNEL);
> +               if (unlikely(!ds->bitmap))
> +                       return NULL;
> +       }
> +
>         ds->dev = dev;
>         ds->num_ports = n;
>
> diff --git a/net/dsa/switch.c b/net/dsa/switch.c
> index b935117..142b294 100644
> --- a/net/dsa/switch.c
> +++ b/net/dsa/switch.c
> @@ -136,21 +136,20 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
>  {
>         const struct switchdev_obj_port_mdb *mdb = info->mdb;
>         struct switchdev_trans *trans = info->trans;
> -       DECLARE_BITMAP(group, ds->num_ports);
>         int port;
>
>         /* Build a mask of Multicast group members */
> -       bitmap_zero(group, ds->num_ports);
> +       bitmap_zero(ds->bitmap, ds->num_ports);
>         if (ds->index == info->sw_index)
> -               set_bit(info->port, group);
> +               set_bit(info->port, ds->bitmap);
>         for (port = 0; port < ds->num_ports; port++)
>                 if (dsa_is_dsa_port(ds, port))
> -                       set_bit(port, group);
> +                       set_bit(port, ds->bitmap);
>
>         if (switchdev_trans_ph_prepare(trans))
> -               return dsa_switch_mdb_prepare_bitmap(ds, mdb, group);
> +               return dsa_switch_mdb_prepare_bitmap(ds, mdb, ds->bitmap);
>
> -       dsa_switch_mdb_add_bitmap(ds, mdb, group);
> +       dsa_switch_mdb_add_bitmap(ds, mdb, ds->bitmap);
>
>         return 0;
>  }
> @@ -204,21 +203,20 @@ static int dsa_switch_vlan_add(struct dsa_switch *ds,
>  {
>         const struct switchdev_obj_port_vlan *vlan = info->vlan;
>         struct switchdev_trans *trans = info->trans;
> -       DECLARE_BITMAP(members, ds->num_ports);
>         int port;
>
>         /* Build a mask of VLAN members */
> -       bitmap_zero(members, ds->num_ports);
> +       bitmap_zero(ds->bitmap, ds->num_ports);
>         if (ds->index == info->sw_index)
> -               set_bit(info->port, members);
> +               set_bit(info->port, ds->bitmap);
>         for (port = 0; port < ds->num_ports; port++)
>                 if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
> -                       set_bit(port, members);
> +                       set_bit(port, ds->bitmap);
>
>         if (switchdev_trans_ph_prepare(trans))
> -               return dsa_switch_vlan_prepare_bitmap(ds, vlan, members);
> +               return dsa_switch_vlan_prepare_bitmap(ds, vlan, ds->bitmap);
>
> -       dsa_switch_vlan_add_bitmap(ds, vlan, members);
> +       dsa_switch_vlan_add_bitmap(ds, vlan, ds->bitmap);
>
>         return 0;
>  }
> --
> 1.9.1
>



-- 
Kees Cook
Pixel Security

^ permalink raw reply

* Re: [PATCH net v3 2/2] ipv4: igmp: use alarmtimer to prevent delayed reports
From: kbuild test robot @ 2018-06-20  5:09 UTC (permalink / raw)
  To: Tejaswi Tanikella; +Cc: kbuild-all, netdev, f.fainelli, andrew, davem
In-Reply-To: <20180611134619.GA28666@tejaswit-linux.qualcomm.com>

[-- Attachment #1: Type: text/plain, Size: 2779 bytes --]

Hi Tejaswi,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net/master]

url:    https://github.com/0day-ci/linux/commits/Tejaswi-Tanikella/ktime-helpers-to-convert-between-ktime-and-jiffies/20180611-214916
config: x86_64-randconfig-s4-06200944 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   In file included from include/linux/timer.h:6:0,
                    from include/linux/workqueue.h:9,
                    from include/linux/srcu.h:34,
                    from include/linux/notifier.h:16,
                    from include/linux/memory_hotplug.h:7,
                    from include/linux/mmzone.h:777,
                    from include/linux/gfp.h:6,
                    from include/linux/umh.h:4,
                    from include/linux/kmod.h:22,
                    from include/linux/module.h:13,
                    from net/ipv4/igmp.c:73:
   net/ipv4/igmp.c: In function 'igmp_mc_seq_show':
>> net/ipv4/igmp.c:2819:28: error: implicit declaration of function 'alarm_expires_remaining'; did you mean 'hrtimer_expires_remaining'? [-Werror=implicit-function-declaration]
      delta = ktime_to_jiffies(alarm_expires_remaining(&im->alarm));
                               ^
   include/linux/ktime.h:100:48: note: in definition of macro 'ktime_to_jiffies'
    #define ktime_to_jiffies(kt)  nsecs_to_jiffies(kt)
                                                   ^~
>> net/ipv4/igmp.c:2819:55: error: 'struct ip_mc_list' has no member named 'alarm'
      delta = ktime_to_jiffies(alarm_expires_remaining(&im->alarm));
                                                          ^
   include/linux/ktime.h:100:48: note: in definition of macro 'ktime_to_jiffies'
    #define ktime_to_jiffies(kt)  nsecs_to_jiffies(kt)
                                                   ^~
   cc1: some warnings being treated as errors

vim +2819 net/ipv4/igmp.c

  2813	
  2814			if (rcu_access_pointer(state->in_dev->mc_list) == im) {
  2815				seq_printf(seq, "%d\t%-10s: %5d %7s\n",
  2816					   state->dev->ifindex, state->dev->name, state->in_dev->mc_count, querier);
  2817			}
  2818	
> 2819			delta = ktime_to_jiffies(alarm_expires_remaining(&im->alarm));
  2820			seq_printf(seq,
  2821				   "\t\t\t\t%08X %5d %d:%08lX\t\t%d\n",
  2822				   im->multiaddr, im->users,
  2823				   im->tm_running,
  2824				   im->tm_running ? jiffies_delta_to_clock_t(delta) : 0,
  2825				   im->reporter);
  2826		}
  2827		return 0;
  2828	}
  2829	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25732 bytes --]

^ permalink raw reply

* Re: [PATCH net] net: sungem: fix rx checksum support
From: David Miller @ 2018-06-20  5:30 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, mroos, malat, schwab, eric.dumazet
In-Reply-To: <20180620021850.211683-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Tue, 19 Jun 2018 19:18:50 -0700

> After commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
> are friends"), sungem owners reported the infamous "eth0: hw csum failure"
> message.
> 
> CHECKSUM_COMPLETE has in fact never worked for this driver, but this
> was masked by the fact that upper stacks had to strip the FCS, and
> therefore skb->ip_summed was set back to CHECKSUM_NONE before
> my recent change.
> 
> Driver configures a number of bytes to skip when the chip computes
> the checksum, and for some reason only half of the Ethernet header
> was skipped.
> 
> Then a second problem is that we should strip the FCS by default,
> unless the driver is updated to eventually support NETIF_F_RXFCS in
> the future.
> 
> Finally, a driver should check if NETIF_F_RXCSUM feature is enabled
> or not, so that the admin can turn off rx checksum if wanted.
> 
> Many thanks to Andreas Schwab and Mathieu Malaterre for their
> help in debugging this issue.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Meelis Roos <mroos@linux.ee>
> Reported-by: Mathieu Malaterre <malat@debian.org>
> Reported-by: Andreas Schwab <schwab@linux-m68k.org>
> Tested-by: Andreas Schwab <schwab@linux-m68k.org>

Applied and queued up for -stable, thanks Eric.

^ permalink raw reply

* Re: [PATCH net] ipvlan: call dev_change_flags when reset ipvlan mode
From: David Miller @ 2018-06-20  5:31 UTC (permalink / raw)
  To: liuhangbin; +Cc: xiyou.wangcong, netdev, sbrivio, pabeni, maheshb
In-Reply-To: <20180620032254.GW8958@leo.usersys.redhat.com>

From: Hangbin Liu <liuhangbin@gmail.com>
Date: Wed, 20 Jun 2018 11:22:54 +0800

> The only case where dev_change_flags() returns an error is when we change
> the IFF_UP flag. Since we only set/reset IFF_NOARP, do you think we still
> need to check the return value?

It is bad to try and take shortcuts on error handling using assumptions
like that.

If dev_change_flags() is adjusted to return error codes in more
situations, nobody is going to remember to undo your "optimization"
here.

Please check for errors, thank you.

^ permalink raw reply

* general protection fault in tls_push_sg
From: syzbot @ 2018-06-20  5:34 UTC (permalink / raw)
  To: aviadye, borisp, davejwatson, davem, linux-kernel, netdev,
	syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    ba4dbdedd3ed Merge tag 'jfs-4.18' of git://github.com/klei..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=112e9ce4400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=f390986c4f7cd566
dashboard link: https://syzkaller.appspot.com/bug?extid=54bcc120da8da091d609
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+54bcc120da8da091d609@syzkaller.appspotmail.com

netlink: 8 bytes leftover after parsing attributes in process `syz-executor0'.
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
CPU: 1 PID: 27979 Comm: syz-executor6 Not tainted 4.18.0-rc1+ #109
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__read_once_size include/linux/compiler.h:188 [inline]
RIP: 0010:compound_head include/linux/page-flags.h:142 [inline]
RIP: 0010:put_page include/linux/mm.h:911 [inline]
RIP: 0010:tls_push_sg+0x2a3/0x880 net/tls/tls_main.c:142
Code: fa 4d 39 e5 75 a2 e8 bc 50 f1 fa 48 8b 85 08 ff ff ff 49 8d 7f 08 48 b9 00 00 00 00 00 fc ff df c6 00 00 48 89 f8 48 c1 e8 03 <80> 3c 08 00 0f 85 50 05 00 00 48 8b 85 08 ff ff ff 49 8b 5f 08 80
RSP: 0018:ffff8801c5776d90 EFLAGS: 00010202
RAX: 0000000000000001 RBX: 0000000000000000 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: ffffffff868a59e4 RDI: 0000000000000008
RBP: ffff8801c5776eb0 R08: ffff88018e4fc6c0 R09: ffff8801c5776668
R10: 0000000000000003 R11: 0000000000000002 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f2d08c17700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000001ffffcc0 CR3: 0000000188ce8000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  tls_push_record+0xaec/0x1400 net/tls/tls_sw.c:264
  tls_sw_push_pending_record+0x22/0x30 net/tls/tls_sw.c:276
  tls_handle_open_record net/tls/tls_main.c:164 [inline]
  tls_sk_proto_close+0x74c/0xae0 net/tls/tls_main.c:264
  inet_release+0x104/0x1f0 net/ipv4/af_inet.c:427
  inet6_release+0x50/0x70 net/ipv6/af_inet6.c:459
  __sock_release+0xd7/0x260 net/socket.c:603
  sock_close+0x19/0x20 net/socket.c:1186
  __fput+0x35b/0x8b0 fs/file_table.c:209
  ____fput+0x15/0x20 fs/file_table.c:243
  task_work_run+0x1ec/0x2a0 kernel/task_work.c:113
  exit_task_work include/linux/task_work.h:22 [inline]
  do_exit+0x1b08/0x2750 kernel/exit.c:865
  do_group_exit+0x177/0x440 kernel/exit.c:968
  get_signal+0x88e/0x1970 kernel/signal.c:2468
  do_signal+0x9c/0x21c0 arch/x86/kernel/signal.c:816
  exit_to_usermode_loop+0x2de/0x370 arch/x86/entry/common.c:162
  prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
  do_syscall_64+0x6be/0x820 arch/x86/entry/common.c:293
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x455b29
Code: 1d ba fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb b9 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f2d08c16ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 000000000072bec8 RCX: 0000000000455b29
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000072bec8
RBP: 000000000072bec8 R08: 0000000000000033 R09: 000000000072bea0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000a3e81f R14: 00007f2d08c179c0 R15: 0000000000000000
Modules linked in:
Dumping ftrace buffer:
    (ftrace buffer empty)
---[ end trace d9dfd7279b1a9c99 ]---
RIP: 0010:__read_once_size include/linux/compiler.h:188 [inline]
RIP: 0010:compound_head include/linux/page-flags.h:142 [inline]
RIP: 0010:put_page include/linux/mm.h:911 [inline]
RIP: 0010:tls_push_sg+0x2a3/0x880 net/tls/tls_main.c:142


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.

^ permalink raw reply

* Re: [PATCH] net: stmmac: socfpga: add additional ocp reset line for Stratix10
From: David Miller @ 2018-06-20  5:34 UTC (permalink / raw)
  To: dinguyen; +Cc: netdev, joabreu, alexandre.torgue, peppe.cavallaro, linux-kernel
In-Reply-To: <1529422538-8447-1-git-send-email-dinguyen@kernel.org>

From: Dinh Nguyen <dinguyen@kernel.org>
Date: Tue, 19 Jun 2018 10:35:38 -0500

> The Stratix10 platform has an additional reset line, OCP(Open Core Protocol),
> that also needs to get deasserted for the stmmac ethernet controller to work.
> Thus we need to update the Kconfig to include ARCH_STRATIX10 in order to build
> dwmac-socfpga.
> 
> Also, remove the redundant check for the reset controller pointer. The
> reset driver already checks for the pointer and returns 0 if the pointer
> is NULL.
> 
> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>

Applied.

^ permalink raw reply

* Re: [PATCH v1 net] stmmac: fix DMA channel hang in half-duplex mode
From: David Miller @ 2018-06-20  5:36 UTC (permalink / raw)
  To: vbhadram; +Cc: peppe.cavallaro, alexandre.torgue, joabreu, netdev
In-Reply-To: <1529245925-764-1-git-send-email-vbhadram@nvidia.com>

From: Bhadram Varka <vbhadram@nvidia.com>
Date: Sun, 17 Jun 2018 20:02:05 +0530

> HW does not support Half-duplex mode in multi-queue
> scenario. Fix it by not advertising the Half-Duplex
> mode if multi-queue enabled.
> 
> Signed-off-by: Bhadram Varka <vbhadram@nvidia.com>

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH] ucc_geth: Add BQL support
From: David Miller @ 2018-06-20  5:39 UTC (permalink / raw)
  To: joakim.tjernlund; +Cc: leoyang.li, netdev
In-Reply-To: <20180619163036.20578-1-joakim.tjernlund@infinera.com>

From: Joakim Tjernlund <joakim.tjernlund@infinera.com>
Date: Tue, 19 Jun 2018 18:30:36 +0200

> @@ -3242,6 +3243,8 @@ static int ucc_geth_tx(struct net_device *dev, u8 txQ)
>  	struct ucc_geth_private *ugeth = netdev_priv(dev);
>  	u8 __iomem *bd;		/* BD pointer */
>  	u32 bd_status;
> +	int howmany = 0;
> +	unsigned int bytes_sent = 0;

Please keep the function local variable declarations ordered from
longest to shortest line.

Thank you.

^ permalink raw reply

* Re: [PATCH net] ip: limit use of gso_size to udp
From: David Miller @ 2018-06-20  5:41 UTC (permalink / raw)
  To: willemdebruijn.kernel; +Cc: netdev, willemb
In-Reply-To: <20180619104026.77432-1-willemdebruijn.kernel@gmail.com>

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Tue, 19 Jun 2018 06:40:26 -0400

> From: Willem de Bruijn <willemb@google.com>
> 
> The ipcm(6)_cookie field gso_size is set only in the udp path. The ip
> layer copies this to cork only if sk_type is SOCK_DGRAM. This check
> proved too permissive. Ping and l2tp sockets have the same type.
> 
> Limit to sockets of type SOCK_DGRAM and protocol IPPROTO_UDP to
> exclude ping sockets.
> 
> Fixes: bec1f6f69736 ("udp: generate gso with UDP_SEGMENT")
> Reported-by: Maciej Żenczykowski <maze@google.com>
> Signed-off-by: Willem de Bruijn <willemb@google.com>

Applied, thanks Willem.

> For net-next, I'll take a look whether ipcm(6)_cookie fields like
> these can be initialized uniformly, and then this branch removed
> completely.

Sounds good.

^ permalink raw reply

* [PATCH net-next 0/2] fixes for ipsec selftests
From: Shannon Nelson @ 2018-06-20  5:42 UTC (permalink / raw)
  To: netdev, davem, anders.roxell

A couple of bad behaviors in the ipsec selftest were pointed out
by Anders Roxell <anders.roxell@linaro.org> and are addressed here.

Shannon Nelson (2):
  selftests: rtnetlink: hide complaint from terminated monitor
  selftests: rtnetlink: use a local IP address for IPsec tests

 tools/testing/selftests/net/rtnetlink.sh | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

-- 
2.7.4

^ permalink raw reply

* [PATCH net-next 1/2] selftests: rtnetlink: hide complaint from terminated monitor
From: Shannon Nelson @ 2018-06-20  5:42 UTC (permalink / raw)
  To: netdev, davem, anders.roxell
In-Reply-To: <1529473363-4036-1-git-send-email-shannon.nelson@oracle.com>

Set up the "ip xfrm monitor" subprogram so as to not see
a "Terminated" message when the subprogram is killed.

Fixes: 5e596ee171ba ("selftests: add xfrm state-policy-monitor to rtnetlink.sh")
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 tools/testing/selftests/net/rtnetlink.sh | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/rtnetlink.sh b/tools/testing/selftests/net/rtnetlink.sh
index 760faef..0a2bc6e 100755
--- a/tools/testing/selftests/net/rtnetlink.sh
+++ b/tools/testing/selftests/net/rtnetlink.sh
@@ -532,8 +532,7 @@ kci_test_ipsec()
 
 	# start the monitor in the background
 	tmpfile=`mktemp ipsectestXXX`
-	ip x m > $tmpfile &
-	mpid=$!
+	mpid=`(ip x m > $tmpfile & echo $!) 2>/dev/null`
 	sleep 0.2
 
 	ipsecid="proto esp src $srcip dst $dstip spi 0x07"
-- 
2.7.4
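
The effect of the change can be tried outside the selftest. The sketch
below assumes a POSIX shell; start_quiet is a hypothetical helper and
"sleep 30" stands in for "ip x m". Because the background job is started
from a subshell that exits right after printing the PID, the calling shell
never owns the job and has no "Terminated" line to report when it is killed.

```shell
#!/bin/sh
# start_quiet OUTFILE CMD...: hypothetical helper, not part of rtnetlink.sh.
# Runs CMD in the background with stdout sent to OUTFILE and prints its PID.
start_quiet() {
	out=$1
	shift
	# Same shape as the patched line: the job belongs to a short-lived
	# subshell, so the caller has no job to complain about later.
	("$@" > "$out" 2>/dev/null & echo $!) 2>/dev/null
}

tmpfile=$(mktemp)
mpid=$(start_quiet "$tmpfile" sleep 30)	# stand-in for "ip x m"
kill "$mpid"
rm -f "$tmpfile"
```

The command's stdout has to be redirected away from the command
substitution, as it is here and in the patch, otherwise the substitution
would block until the background command exits.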

^ permalink raw reply related

* [PATCH net-next 2/2] selftests: rtnetlink: use a local IP address for IPsec tests
From: Shannon Nelson @ 2018-06-20  5:42 UTC (permalink / raw)
  To: netdev, davem, anders.roxell
In-Reply-To: <1529473363-4036-1-git-send-email-shannon.nelson@oracle.com>

Find an IP address on this machine to use as a source IP, and
make up a destination IP address based on the source IP.  No
actual messages will be sent, just a couple of IPsec rules are
created and deleted.

Fixes: 5e596ee171ba ("selftests: add xfrm state-policy-monitor to rtnetlink.sh")
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 tools/testing/selftests/net/rtnetlink.sh | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/rtnetlink.sh b/tools/testing/selftests/net/rtnetlink.sh
index 0a2bc6e..b33a371 100755
--- a/tools/testing/selftests/net/rtnetlink.sh
+++ b/tools/testing/selftests/net/rtnetlink.sh
@@ -522,8 +522,12 @@ kci_test_macsec()
 #-------------------------------------------------------------------
 kci_test_ipsec()
 {
-	srcip="14.0.0.52"
-	dstip="14.0.0.70"
+	# find an ip address on this machine and make up a destination
+	srcip=`ip -o addr | awk '/inet / { print $4; }' | grep -v "^127" | head -1 | cut -f1 -d/`
+	net=`echo $srcip | cut -f1-3 -d.`
+	base=`echo $srcip | cut -f4 -d.`
+	dstip="$net."`expr $base + 1`
+
 	algo="aead rfc4106(gcm(aes)) 0x3132333435363738393031323334353664636261 128"
 
 	# flush to be sure there's nothing configured
-- 
2.7.4
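
As a rough illustration of the address derivation above, the same cut/expr
steps can be wrapped in a function and fed a fixed address. next_ip is a
hypothetical name used only for this sketch, and 192.168.1.10 is an
arbitrary example input.

```shell
#!/bin/sh
# next_ip ADDR: hypothetical helper mirroring the patch's logic.
# Keeps the first three octets and adds one to the last octet.
next_ip() {
	net=$(echo "$1" | cut -f1-3 -d.)
	base=$(echo "$1" | cut -f4 -d.)
	echo "$net.$((base + 1))"
}

next_ip 192.168.1.10	# prints 192.168.1.11
```

Like the patch, this does no range checking, so a last octet of 255 would
produce an out-of-range value.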

^ permalink raw reply related

* Re: [PATCH] net: nixge: Add __packed attribute to DMA descriptor struct
From: David Miller @ 2018-06-20  5:44 UTC (permalink / raw)
  To: mdf; +Cc: keescook, netdev, linux-kernel
In-Reply-To: <20180619165453.31894-1-mdf@kernel.org>

From: Moritz Fischer <mdf@kernel.org>
Date: Tue, 19 Jun 2018 09:54:53 -0700

> @@ -122,7 +122,7 @@ struct nixge_hw_dma_bd {
>  	u32 sw_id_offset;
>  	u32 reserved5;
>  	u32 reserved6;
> -};
> +} __packed;

As I understand it, based upon your replies to Florian, this bug doesn't
even show up with the current code.  The problem only happens with some
64-bit changes you are working on.

So, the change is not valid right now.

And for the 64-bit changes, I agree with Florian that you should adjust
your implementation so that this __packed dance isn't necessary and
that you can avoid some MMIOs as well.

Thanks.

^ permalink raw reply

* Re: [PATCH v3 net-next 4/6] net: ethernet: ti: cpsw: add CBS Qdisc offload
From: Ilias Apalodimas @ 2018-06-20  6:31 UTC (permalink / raw)
  To: Ivan Khoronzhuk
  Cc: grygorii.strashko, davem, corbet, akpm, netdev, linux-doc,
	linux-kernel, linux-omap, vinicius.gomes, henrik,
	jesus.sanchez-palencia, p-varis, spatton, francois.ozog, yogeshs,
	nsekhar, andrew
In-Reply-To: <20180615181310.10437-5-ivan.khoronzhuk@linaro.org>

On Fri, Jun 15, 2018 at 09:13:08PM +0300, Ivan Khoronzhuk wrote:
> The cpsw has up to 4 FIFOs per port and upper 3 FIFOs can feed rate
> limited queue with shaping. In order to set and enable shaping for
> those 3 FIFOs queues the network device with CBS qdisc attached is
> needed. The CBS configuration is added for dual-emac/single port mode
> only, but potentially can be used in switch mode also, based on
> switchdev for instance.
> 
> Although the FIFO shapers can work w/o cpdma level shapers, the base
> usage must be in combination with cpdma level shapers as described in
> the TRM, which are set as maximum rates for interface queues with sysfs.
> 
> One of the possible configurations with txq shapers and CBS shapers:
> 
>                       Configured with echo RATE >
>                   /sys/class/net/eth0/queues/tx-0/tx_maxrate
>              /---------------------------------------------------
>             /
>            /            cpdma level shapers
>         +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+
>         | c7 | | c6 | | c5 | | c4 | | c3 | | c2 | | c1 | | c0 |
>         \    / \    / \    / \    / \    / \    / \    / \    /
>          \  /   \  /   \  /   \  /   \  /   \  /   \  /   \  /
>           \/     \/     \/     \/     \/     \/     \/     \/
> +---------|------|------|------|-------------------------------------+
> |    +----+      |      |  +---+                                     |
> |    |      +----+      |  |                                         |
> |    v      v           v  v                                         |
> | +----+ +----+ +----+ +----+ p        p+----+ +----+ +----+ +----+  |
> | |    | |    | |    | |    | o        o|    | |    | |    | |    |  |
> | | f3 | | f2 | | f1 | | f0 | r  CPSW  r| f3 | | f2 | | f1 | | f0 |  |
> | |    | |    | |    | |    | t        t|    | |    | |    | |    |  |
> | \    / \    / \    / \    / 0        1\    / \    / \    / \    /  |
> |  \  X   \  /   \  /   \  /             \  /   \  /   \  /   \  /   |
> |   \/ \   \/     \/     \/               \/     \/     \/     \/    |
> +-------\------------------------------------------------------------+
>          \
>           \ FIFO shaper, set with CBS offload added in this patch,
>            \ FIFO0 cannot be rate limited
>             ------------------------------------------------------
> 
> CBS shaper configuration is supposed to be used with root MQPRIO Qdisc
> offload allowing to add sk_prio->tc->txq maps that direct traffic to
> appropriate tx queue and maps L2 priority to FIFO shaper.
> 
> The CBS shaper is intended to be used for AVB where L2 priority
> (pcp field) is used to differentiate class of traffic. So additionally
> vlan needs to be created with appropriate egress sk_prio->l2 prio map.
> 
> If CBS has several tx queues assigned to it, the sum of their
> bandwidth must not exceed the bandwidth set for CBS. It's recommended
> that the CBS bandwidth be a little bit more.
> 
> The CBS shaper is configured with CBS qdisc offload interface using tc
> tool from iproute2 package.
> 
> For instance:
> 
> $ tc qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
> map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1
> 
> $ tc -g class show dev eth0
> +---(100:ffe2) mqprio
> |    +---(100:3) mqprio
> |    +---(100:4) mqprio
> |    
> +---(100:ffe1) mqprio
> |    +---(100:2) mqprio
> |    
> +---(100:ffe0) mqprio
>      +---(100:1) mqprio
> 
> $ tc qdisc add dev eth0 parent 100:1 cbs locredit -1440 \
> hicredit 60 sendslope -960000 idleslope 40000 offload 1
> 
> $ tc qdisc add dev eth0 parent 100:2 cbs locredit -1470 \
> hicredit 62 sendslope -980000 idleslope 20000 offload 1
> 
> The above code sets CBS shapers for tc0 and tc1, for which txq0 and
> txq1 are used. Note that the bandwidth actually set can differ a bit
> due to the discreteness of the configuration parameters.
> 
> Here parameters like locredit, hicredit and sendslope are ignored
> internally and are supposed to be set with the assumption that the
> maximum frame size is 1500.
> 
> It's assumed that the interface speed does not change on reconnection,
> which is not always true, so inform the user in case the speed of the
> interface has changed, as it can impact dependent shapers' configuration.
> 
> For more examples see Documentation.
> 
> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> ---
>  drivers/net/ethernet/ti/cpsw.c | 221 +++++++++++++++++++++++++++++++++
>  1 file changed, 221 insertions(+)
> 
> diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
> index edd14def98df..19573627a9bb 100644
> --- a/drivers/net/ethernet/ti/cpsw.c
> +++ b/drivers/net/ethernet/ti/cpsw.c
> @@ -46,6 +46,8 @@
>  #include "cpts.h"
>  #include "davinci_cpdma.h"
>  
> +#include <net/pkt_sched.h>
> +
>  #define CPSW_DEBUG	(NETIF_MSG_HW		| NETIF_MSG_WOL		| \
>  			 NETIF_MSG_DRV		| NETIF_MSG_LINK	| \
>  			 NETIF_MSG_IFUP		| NETIF_MSG_INTR	| \
> @@ -154,8 +156,12 @@ do {								\
>  #define IRQ_NUM			2
>  #define CPSW_MAX_QUEUES		8
>  #define CPSW_CPDMA_DESCS_POOL_SIZE_DEFAULT 256
> +#define CPSW_FIFO_QUEUE_TYPE_SHIFT	16
> +#define CPSW_FIFO_SHAPE_EN_SHIFT	16
> +#define CPSW_FIFO_RATE_EN_SHIFT		20
>  #define CPSW_TC_NUM			4
>  #define CPSW_FIFO_SHAPERS_NUM		(CPSW_TC_NUM - 1)
> +#define CPSW_PCT_MASK			0x7f
>  
>  #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_SHIFT	29
>  #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_MSK		GENMASK(2, 0)
> @@ -457,6 +463,8 @@ struct cpsw_priv {
>  	bool				rx_pause;
>  	bool				tx_pause;
>  	bool				mqprio_hw;
> +	int				fifo_bw[CPSW_TC_NUM];
> +	int				shp_cfg_speed;
>  	u32 emac_port;
>  	struct cpsw_common *cpsw;
>  };
> @@ -1081,6 +1089,38 @@ static void cpsw_set_slave_mac(struct cpsw_slave *slave,
>  	slave_write(slave, mac_lo(priv->mac_addr), SA_LO);
>  }
>  
> +static bool cpsw_shp_is_off(struct cpsw_priv *priv)
> +{
> +	struct cpsw_common *cpsw = priv->cpsw;
> +	struct cpsw_slave *slave;
> +	u32 shift, mask, val;
> +
> +	val = readl_relaxed(&cpsw->regs->ptype);
> +
> +	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
> +	shift = CPSW_FIFO_SHAPE_EN_SHIFT + 3 * slave->slave_num;
> +	mask = 7 << shift;
> +	val = val & mask;
> +
> +	return !val;
> +}
> +
> +static void cpsw_fifo_shp_on(struct cpsw_priv *priv, int fifo, int on)
> +{
> +	struct cpsw_common *cpsw = priv->cpsw;
> +	struct cpsw_slave *slave;
> +	u32 shift, mask, val;
> +
> +	val = readl_relaxed(&cpsw->regs->ptype);
> +
> +	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
> +	shift = CPSW_FIFO_SHAPE_EN_SHIFT + 3 * slave->slave_num;
> +	mask = (1 << --fifo) << shift;
> +	val = on ? val | mask : val & ~mask;
> +
> +	writel_relaxed(val, &cpsw->regs->ptype);
> +}
> +
>  static void _cpsw_adjust_link(struct cpsw_slave *slave,
>  			      struct cpsw_priv *priv, bool *link)
>  {
> @@ -1120,6 +1160,12 @@ static void _cpsw_adjust_link(struct cpsw_slave *slave,
>  			mac_control |= BIT(4);
>  
>  		*link = true;
> +
> +		if (priv->shp_cfg_speed &&
> +		    priv->shp_cfg_speed != slave->phy->speed &&
> +		    !cpsw_shp_is_off(priv))
> +			dev_warn(priv->dev,
> +				 "Speed was changed, CBS shaper speeds are changed!");
>  	} else {
>  		mac_control = 0;
>  		/* disable forwarding */
> @@ -1589,6 +1635,178 @@ static int cpsw_tc_to_fifo(int tc, int num_tc)
>  	return CPSW_FIFO_SHAPERS_NUM - tc;
>  }
>  
> +static int cpsw_set_fifo_bw(struct cpsw_priv *priv, int fifo, int bw)
> +{
> +	struct cpsw_common *cpsw = priv->cpsw;
> +	u32 val = 0, send_pct, shift;
> +	struct cpsw_slave *slave;
> +	int pct = 0, i;
> +
> +	if (bw > priv->shp_cfg_speed * 1000)
> +		goto err;
> +
> +	/* shaping has to stay enabled for highest fifos linearly
> +	 * and fifo bw no more than interface can allow
> +	 */
> +	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
> +	send_pct = slave_read(slave, SEND_PERCENT);
> +	for (i = CPSW_FIFO_SHAPERS_NUM; i > 0; i--) {
> +		if (!bw) {
> +			if (i >= fifo || !priv->fifo_bw[i])
> +				continue;
> +
> +			dev_warn(priv->dev, "Prev FIFO%d is shaped", i);
> +			continue;
> +		}
> +
> +		if (!priv->fifo_bw[i] && i > fifo) {
> +			dev_err(priv->dev, "Upper FIFO%d is not shaped", i);
> +			return -EINVAL;
> +		}
> +
> +		shift = (i - 1) * 8;
> +		if (i == fifo) {
> +			send_pct &= ~(CPSW_PCT_MASK << shift);
> +			val = DIV_ROUND_UP(bw, priv->shp_cfg_speed * 10);
> +			if (!val)
> +				val = 1;
> +
> +			send_pct |= val << shift;
> +			pct += val;
> +			continue;
> +		}
> +
> +		if (priv->fifo_bw[i])
> +			pct += (send_pct >> shift) & CPSW_PCT_MASK;
> +	}
> +
> +	if (pct >= 100)
> +		goto err;
> +
> +	slave_write(slave, send_pct, SEND_PERCENT);
> +	priv->fifo_bw[fifo] = bw;
> +
> +	dev_warn(priv->dev, "set FIFO%d bw = %d\n", fifo,
> +		 DIV_ROUND_CLOSEST(val * priv->shp_cfg_speed, 100));
> +
> +	return 0;
> +err:
> +	dev_err(priv->dev, "Bandwidth doesn't fit in tc configuration");
> +	return -EINVAL;
> +}
> +
> +static int cpsw_set_fifo_rlimit(struct cpsw_priv *priv, int fifo, int bw)
> +{
> +	struct cpsw_common *cpsw = priv->cpsw;
> +	struct cpsw_slave *slave;
> +	u32 tx_in_ctl_rg, val;
> +	int ret;
> +
> +	ret = cpsw_set_fifo_bw(priv, fifo, bw);
> +	if (ret)
> +		return ret;
> +
> +	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
> +	tx_in_ctl_rg = cpsw->version == CPSW_VERSION_1 ?
> +		       CPSW1_TX_IN_CTL : CPSW2_TX_IN_CTL;
> +
> +	if (!bw)
> +		cpsw_fifo_shp_on(priv, fifo, bw);
> +
> +	val = slave_read(slave, tx_in_ctl_rg);
> +	if (cpsw_shp_is_off(priv)) {
> +		/* disable FIFOs rate limited queues */
> +		val &= ~(0xf << CPSW_FIFO_RATE_EN_SHIFT);
> +
> +		/* set type of FIFO queues to normal priority mode */
> +		val &= ~(3 << CPSW_FIFO_QUEUE_TYPE_SHIFT);
> +
> +		/* set type of FIFO queues to be rate limited */
> +		if (bw)
> +			val |= 2 << CPSW_FIFO_QUEUE_TYPE_SHIFT;
> +		else
> +			priv->shp_cfg_speed = 0;
> +	}
> +
> +	/* toggle a FIFO rate limited queue */
> +	if (bw)
> +		val |= BIT(fifo + CPSW_FIFO_RATE_EN_SHIFT);
> +	else
> +		val &= ~BIT(fifo + CPSW_FIFO_RATE_EN_SHIFT);
> +	slave_write(slave, val, tx_in_ctl_rg);
> +
> +	/* FIFO transmit shape enable */
> +	cpsw_fifo_shp_on(priv, fifo, bw);
> +	return 0;
> +}
> +
> +/* Defaults:
> + * class A - prio 3
> + * class B - prio 2
> + * shaping for class A should be set first
> + */
> +static int cpsw_set_cbs(struct net_device *ndev,
> +			struct tc_cbs_qopt_offload *qopt)
> +{
> +	struct cpsw_priv *priv = netdev_priv(ndev);
> +	struct cpsw_common *cpsw = priv->cpsw;
> +	struct cpsw_slave *slave;
> +	int prev_speed = 0;
> +	int tc, ret, fifo;
> +	u32 bw = 0;
> +
> +	tc = netdev_txq_to_tc(priv->ndev, qopt->queue);
> +
> +	/* enable channels in backward order, as highest FIFOs must be rate
> +	 * limited first and for compliance with CPDMA rate limited channels
> +	 * that are also used in backward order. FIFO0 cannot be rate limited.
> +	 */
> +	fifo = cpsw_tc_to_fifo(tc, ndev->num_tc);
> +	if (!fifo) {
> +		dev_err(priv->dev, "Last tc%d can't be rate limited", tc);
> +		return -EINVAL;
> +	}
> +
> +	/* do nothing, it's disabled anyway */
> +	if (!qopt->enable && !priv->fifo_bw[fifo])
> +		return 0;
> +
> +	/* shapers can be set if link speed is known */
> +	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
> +	if (slave->phy && slave->phy->link) {
> +		if (priv->shp_cfg_speed &&
> +		    priv->shp_cfg_speed != slave->phy->speed)
> +			prev_speed = priv->shp_cfg_speed;
> +
> +		priv->shp_cfg_speed = slave->phy->speed;
> +	}
> +
> +	if (!priv->shp_cfg_speed) {
> +		dev_err(priv->dev, "Link speed is not known");
> +		return -1;
> +	}
> +
> +	ret = pm_runtime_get_sync(cpsw->dev);
> +	if (ret < 0) {
> +		pm_runtime_put_noidle(cpsw->dev);
> +		return ret;
> +	}
> +
> +	bw = qopt->enable ? qopt->idleslope : 0;
> +	ret = cpsw_set_fifo_rlimit(priv, fifo, bw);
> +	if (ret) {
> +		priv->shp_cfg_speed = prev_speed;
> +		prev_speed = 0;
> +	}
> +
> +	if (bw && prev_speed)
> +		dev_warn(priv->dev,
> +			 "Speed was changed, CBS shaper speeds are changed!");
> +
> +	pm_runtime_put_sync(cpsw->dev);
> +	return ret;
> +}
> +
>  static int cpsw_ndo_open(struct net_device *ndev)
>  {
>  	struct cpsw_priv *priv = netdev_priv(ndev);
> @@ -2263,6 +2481,9 @@ static int cpsw_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
>  			     void *type_data)
>  {
>  	switch (type) {
> +	case TC_SETUP_QDISC_CBS:
> +		return cpsw_set_cbs(ndev, type_data);
> +
>  	case TC_SETUP_QDISC_MQPRIO:
>  		return cpsw_set_mqprio(ndev, type_data);
>  
> -- 
> 2.17.1
> 
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
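
For anyone checking the numbers: as far as I can tell from the quoted
cpsw_set_fifo_bw(), the value written into SEND_PERCENT for the FIFO being
configured is DIV_ROUND_UP(bw, speed * 10), raised to 1 if it comes out as
0. The sketch below redoes that arithmetic in shell. send_pct is a
hypothetical name, and it assumes bw is the idleslope in kbit/s (non-zero)
and speed is the link speed in Mbit/s.

```shell
#!/bin/sh
# send_pct BW_KBPS SPEED_MBPS: hypothetical helper, not driver code.
# Mirrors DIV_ROUND_UP(bw, speed * 10) with a minimum of 1 percent.
send_pct() {
	div=$(( $2 * 10 ))
	pct=$(( ($1 + div - 1) / div ))	# integer division, rounded up
	if [ "$pct" -eq 0 ]; then
		pct=1
	fi
	echo "$pct"
}

send_pct 40000 1000	# idleslope 40000 on a 1000 Mbit/s link: prints 4
send_pct 20000 1000	# idleslope 20000 on a 1000 Mbit/s link: prints 2
send_pct 40000 100	# the same idleslope on a 100 Mbit/s link: prints 40
```

The third call shows why the patch warns when the link speed changes: the
same idleslope maps to a different percentage, and the function rejects a
configuration once the accumulated percentage reaches 100.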

^ permalink raw reply

* Re: [PATCH 1/3] net: ethernet: fix suspend/resume in davinci_emac
From: Bartosz Golaszewski @ 2018-06-20  7:56 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Grygorii Strashko, David S . Miller, Dan Carpenter,
	Ivan Khoronzhuk, Rob Herring, Lukas Wunner, Kevin Hilman,
	David Lechner, Sekhar Nori, Andrew Lunn, linux-omap, netdev,
	Linux Kernel Mailing List, Bartosz Golaszewski, stable
In-Reply-To: <db1183b9-6ed0-ed74-c461-cdac5bfc7a60@gmail.com>

2018-06-19 18:55 GMT+02:00 Florian Fainelli <f.fainelli@gmail.com>:
> On 06/19/2018 09:09 AM, Bartosz Golaszewski wrote:
>> From: Bartosz Golaszewski <bgolaszewski@baylibre.com>
>>
>> This patch reverts commit 3243ff2a05ec ("net: ethernet: davinci_emac:
>> Deduplicate bus_find_device() by name matching") and adds a comment
>> which should stop anyone from reintroducing the same "fix" in the future.
>>
>> We can't use bus_find_device_by_name() here because the device name is
>> not guaranteed to be 'davinci_mdio'. On some systems it can be
>> 'davinci_mdio.0' so we need to use strncmp() against the first part of
>> the string to correctly match it.
>>
>> Fixes: 3243ff2a05ec ("net: ethernet: davinci_emac: Deduplicate bus_find_device() by name matching")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
>> ---
>>  drivers/net/ethernet/ti/davinci_emac.c | 15 +++++++++++++--
>>  1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
>> index 06d7c9e4dcda..a1a6445b5a7e 100644
>> --- a/drivers/net/ethernet/ti/davinci_emac.c
>> +++ b/drivers/net/ethernet/ti/davinci_emac.c
>> @@ -1385,6 +1385,11 @@ static int emac_devioctl(struct net_device *ndev, struct ifreq *ifrq, int cmd)
>>               return -EOPNOTSUPP;
>>  }
>>
>> +static int match_first_device(struct device *dev, void *data)
>> +{
>> +     return !strncmp(dev_name(dev), "davinci_mdio", 12);
>
>         const char *bus_name = "davinci_mdio";
>
>         return !strncmp(dev_name(dev), bus_name, strlen(bus_name));
>
> Or even better yet, if you want to make sure this really is a PHY device
> that you are trying to match, you could try to use sscanf() with PHY_ID_FMT.
> --
> Florian

I don't think this is necessary. This simple function would get too
complicated with the additional buffer for the sscanf'ed phy name etc.

Thanks,
Bart

^ permalink raw reply

* [PATCH v2 1/2] net: ethernet: fix suspend/resume in davinci_emac
From: Bartosz Golaszewski @ 2018-06-20  8:03 UTC (permalink / raw)
  To: Grygorii Strashko, David S . Miller, Florian Fainelli,
	Dan Carpenter, Ivan Khoronzhuk, Rob Herring, Lukas Wunner,
	Kevin Hilman, David Lechner, Sekhar Nori, Andrew Lunn
  Cc: linux-omap, netdev, linux-kernel, Bartosz Golaszewski, stable
In-Reply-To: <20180620080356.11900-1-brgl@bgdev.pl>

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

This patch reverts commit 3243ff2a05ec ("net: ethernet: davinci_emac:
Deduplicate bus_find_device() by name matching") and adds a comment
which should stop anyone from reintroducing the same "fix" in the future.

We can't use bus_find_device_by_name() here because the device name is
not guaranteed to be 'davinci_mdio'. On some systems it can be
'davinci_mdio.0' so we need to use strncmp() against the first part of
the string to correctly match it.

Fixes: 3243ff2a05ec ("net: ethernet: davinci_emac: Deduplicate bus_find_device() by name matching")
Cc: stable@vger.kernel.org
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/net/ethernet/ti/davinci_emac.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
index 06d7c9e4dcda..a1a6445b5a7e 100644
--- a/drivers/net/ethernet/ti/davinci_emac.c
+++ b/drivers/net/ethernet/ti/davinci_emac.c
@@ -1385,6 +1385,11 @@ static int emac_devioctl(struct net_device *ndev, struct ifreq *ifrq, int cmd)
 		return -EOPNOTSUPP;
 }
 
+static int match_first_device(struct device *dev, void *data)
+{
+	return !strncmp(dev_name(dev), "davinci_mdio", 12);
+}
+
 /**
  * emac_dev_open - EMAC device open
  * @ndev: The DaVinci EMAC network adapter
@@ -1484,8 +1489,14 @@ static int emac_dev_open(struct net_device *ndev)
 
 	/* use the first phy on the bus if pdata did not give us a phy id */
 	if (!phydev && !priv->phy_id) {
-		phy = bus_find_device_by_name(&mdio_bus_type, NULL,
-					      "davinci_mdio");
+		/* NOTE: we can't use bus_find_device_by_name() here because
+		 * the device name is not guaranteed to be 'davinci_mdio'. On
+		 * some systems it can be 'davinci_mdio.0' so we need to use
+		 * strncmp() against the first part of the string to correctly
+		 * match it.
+		 */
+		phy = bus_find_device(&mdio_bus_type, NULL, NULL,
+				      match_first_device);
 		if (phy) {
 			priv->phy_id = dev_name(phy);
 			if (!priv->phy_id || !*priv->phy_id)
-- 
2.17.1
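
The difference between the two matching strategies can be sketched in
shell (not kernel code). match_exact and match_prefix are hypothetical
helpers that stand in for bus_find_device_by_name() and the strncmp()
based match_first_device() respectively.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.
# match_exact: succeeds only on the bare name.
match_exact() {
	[ "$1" = "davinci_mdio" ]
}

# match_prefix: succeeds on any name starting with "davinci_mdio",
# comparable to strncmp(name, "davinci_mdio", 12) == 0.
match_prefix() {
	case "$1" in
	davinci_mdio*) return 0 ;;
	*) return 1 ;;
	esac
}

for name in davinci_mdio davinci_mdio.0 other_mdio; do
	if match_exact "$name"; then e=yes; else e=no; fi
	if match_prefix "$name"; then p=yes; else p=no; fi
	echo "$name exact=$e prefix=$p"
done
```

Only the prefix match accepts 'davinci_mdio.0', which is the case the
revert is meant to restore.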

^ permalink raw reply related

* [PATCH v2 0/2] net: davinci_emac: fix suspend/resume (both a regression and a common clk problem)
From: Bartosz Golaszewski @ 2018-06-20  8:03 UTC (permalink / raw)
  To: Grygorii Strashko, David S . Miller, Florian Fainelli,
	Dan Carpenter, Ivan Khoronzhuk, Rob Herring, Lukas Wunner,
	Kevin Hilman, David Lechner, Sekhar Nori, Andrew Lunn
  Cc: linux-omap, netdev, linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

Earlier I sent the first patch as a solution to a regression introduced
during the v4.16 merge window, but after testing David's common clock
series on top of 4.18-rc1 + this patch it turned out that the problem
persisted.

This is a follow-up containing the regression fix and an additional
patch that makes suspend/resume work with David's changes.

v1 -> v2:
- dropped patch 2/3
- in patch 2/2: check the device's parent's compatible

Bartosz Golaszewski (2):
  net: ethernet: fix suspend/resume in davinci_emac
  net: davinci_emac: match the mdio device against its compatible if
    possible

 drivers/net/ethernet/ti/davinci_emac.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

-- 
2.17.1

^ permalink raw reply

* Re: Route fallback issue
From: Akshat Kakkar @ 2018-06-20  8:26 UTC (permalink / raw)
  To: netdev; +Cc: cronolog+lartc, lartc, Erik Auerswald
In-Reply-To: <20180620081916.GA30608@unix-ag.uni-kl.de>

Hi netdev community,

I have 2 interfaces
eno1 : 192.168.1.10/24
eno2 : 192.168.2.10/24

I added routes as
172.16.0.0/12 via 192.168.1.254 metric 1
172.16.0.0/12 via 192.168.2.254 metric 2

My intention : All traffic to 172.16.0.0/12 should go thru eno1. If
192.168.1.254 is not reachable (no arp entry or link down), then it
should fall back to eno2.

But this is not working. My box keeps on looking for 192.168.1.254
(i.e. sending arp requests) and never falls back.

I posted this on lartc, but it looks like a solution, if any, has to
come from netdev.

What are your views on this?

Is there a plan or roadmap to resolve this in the Linux kernel?



On Wed, Jun 20, 2018 at 1:49 PM, Erik Auerswald
<auerswal@unix-ag.uni-kl.de> wrote:
> Hi,
>
> I have usually used the "replace" keyword of iproute2 for similar
> purposes. I would suggest a script as well, run via cron unless 1 minute
> failover times are not acceptable. The logic could be as follows:
>
> if ping -c1 $PRIMARY_NH >/dev/null 2>&1; then
>   ip route replace $PREFIX via $PRIMARY_NH
> elif ping -c1 $SECONDARY_NH >/dev/null 2>&1; then
>   ip route replace $PREFIX via $SECONDARY_NH
> else
>   ip route del $PREFIX
> fi
>
> Alternatively, one could look into a routing daemon that supports static
> routing (Zebra/Quagga/FRRouting, BIRD, ...) and check if that supports
> some form of next-hop tracking or at least removes static routes with
> unreachable next-hops as one would expect from experience with dedicated
> networking devices.
>
> IMHO static route handling as done by the Linux kernel does not seem
> useful for networking devices. I have even had bad experiences with
> Arista switches and static routing because they relied too much on the
> Linux kernel (probably still do).
>
> Thanks,
> Erik
> --
> Bufferbloat just waits in hiding to get you when you try to use the network.
>                         -- Jim Gettys
>
> On Wed, Jun 20, 2018 at 04:20:11AM +0100, cronolog+lartc wrote:
>> Hi,
>>
>> I /think/ Linux continues sending ARP requests and doesn't fall back
>> to the other route because the route to the failed next hop still
>> exists in the routing table with highest metric, so it continues
>> looking for this next hop.  I get the same behaviour as you when
>> labbing this up, I could not see a straightforward option to mark a
>> route as invalid under changes in reachability, I'd also like to
>> know if this feature is built in and exists.
>>
>>
>> However in the enterprise Cisco world, we can do what you are trying
>> to do very easily using "route tracking" and "IP SLA" features.
>> Basically we define tests e.g. reachability via ping with
>> appropriate frequency and thresholds, then attach these tests to
>> one or more preferred routes.  If the test fails, the associated
>> route is automatically uninstalled from the forwarding table, so any
>> existing lower metric routes get exposed and are used instead.  When
>> the test passes again, the preferred routes are reapplied.
>>
>> The underlying logic of this can certainly be scripted under Linux
>> to get very similar functionality, then put into a cron job or a
>> while loop or similar.  Something along the lines of (pseudocode):
>>    if [the test such as ping fails] ; then
>>       if [preferred route exists] ; then ip route delete ... ; fi
>>    else  ## ping is successful
>>       if [preferred route doesn't exist] ; then ip route add ... ; fi
>>    fi
>>
>>
>> Hope that helps.  I'm also interested in any other solutions to do
>> this under Linux.
>>
>>
>> On 2018-06-19 13:18, Akshat Kakkar wrote:
>> >I have 2 interfaces
>> >eno1 : 192.168.1.10/24
>> >eno2 : 192.168.2.10/24
>> >
>> >I added routes as
>> >172.16.0.0/12 via 192.168.1.254 metric 1
>> >172.16.0.0/12 via 192.168.2.254 metric 2
>> >
>> >My intention : All traffic to 172.16.0.0/12 should go thru eno1. If
>> >192.168.1.254 is not reachable (no arp entry or link down), then it
>> >should fall back to eno2.
>> >
>> >But this is not working. My box keeps on looking for 192.168.1.254
>> >(i.e. sending arp requests) and never falls back.
>> >
>> >Can anyone help?
>> >
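
The failover logic suggested in the replies quoted above can be sketched
as a small POSIX shell script. This is only a sketch under assumptions,
not a tested solution: the function names (probe, pick_nexthop,
update_route), the RUN dry-run variable and the "apply" argument are
hypothetical, the addresses are the ones from the example in this
thread, and "ping -W1" assumes the iputils ping found on most Linux
systems.

```shell
#!/bin/sh
# Defaults taken from the example in this thread; override via environment.
PREFIX="${PREFIX:-172.16.0.0/12}"
PRIMARY_NH="${PRIMARY_NH:-192.168.1.254}"
SECONDARY_NH="${SECONDARY_NH:-192.168.2.254}"
# Set RUN=echo to print the ip commands instead of executing them (dry run).
RUN="${RUN:-}"

# probe NEXTHOP: succeed if the next hop answers a single ping.
# -W1 (one second timeout) is an iputils option and an assumption here.
probe() {
    ping -c1 -W1 "$1" >/dev/null 2>&1
}

# pick_nexthop: print the first reachable next hop, primary first.
# Returns non-zero if neither next hop answers.
pick_nexthop() {
    for nh in "$PRIMARY_NH" "$SECONDARY_NH"; do
        if probe "$nh"; then
            echo "$nh"
            return 0
        fi
    done
    return 1
}

# update_route: point PREFIX at the reachable next hop, or remove the
# route when no next hop is reachable.
update_route() {
    if nh=$(pick_nexthop); then
        $RUN ip route replace "$PREFIX" via "$nh"
    else
        $RUN ip route del "$PREFIX"
    fi
}

# Only touch the routing table when explicitly asked to.
if [ "${1:-}" = "apply" ]; then
    update_route
fi
```

Run from cron with the "apply" argument, this gives roughly the
one-minute failover time mentioned above; running it with RUN=echo first
shows which command would be issued without changing any routes.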

^ permalink raw reply

* Re: [PATCH net 0/5] net sched actions: code style cleanup and fixes
From: Simon Horman @ 2018-06-20  8:20 UTC (permalink / raw)
  To: Roman Mashak; +Cc: davem, netdev, kernel, jhs, xiyou.wangcong, jiri
In-Reply-To: <1529427368-17129-1-git-send-email-mrv@mojatatu.com>

On Tue, Jun 19, 2018 at 12:56:03PM -0400, Roman Mashak wrote:
> The patchset fixes a few code stylistic issues and typos, as well as one
> detected by sparse semantic checker tool.
> 
> No functional changes introduced.
> 
> Patch 1 & 2 fix coding style bits caught by the checkpatch.pl script
> Patch 3 fixes an issue with a shadowed variable
> Patch 4 adds sizeof() operator instead of magic number for buffer length
> Patch 5 fixes typos in diagnostics messages
> 
> Roman Mashak (5):
>   net sched actions: fix coding style in pedit action
>   net sched actions: fix coding style in pedit headers
>   net sched actions: fix sparse warning
>   net sched actions: use sizeof operator for buffer length
>   net sched actions: fix misleading text strings in pedit action

All patches:

Reviewed-by: Simon Horman <simon.horman@netronome.com>

^ permalink raw reply
