Netdev List

* [PATCH net] esp: skip GRO for fragmented packets
From: Sabrina Dubroca @ 2017-04-27 10:31 UTC (permalink / raw)
  To: netdev; +Cc: Sabrina Dubroca, Steffen Klassert, Herbert Xu

Currently, ESP4 GRO doesn't work for fragmented packets, so let's send
these through the normal path.

Fixes: 7785bba299a8 ("esp: Add a software GRO codepath")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
Steffen, if you prefer to drop this patch and fix this properly,
that's okay for me. I can't look much deeper into this right now and
it's broken on current net/master.

It seems like the first fragment gets dropped, at least I don't see it
on tcpdump on the RX machine.
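
For readers unfamiliar with the check the patch adds, here is a minimal user-space sketch of what a fragment test on an IPv4 header amounts to: a packet is a fragment if either the "more fragments" flag or a non-zero fragment offset is present. The `sketch_` names are made up for illustration; the kernel's own helper is ip_is_fragment().

```c
#include <arpa/inet.h>   /* htons() */
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_IP_MF     0x2000  /* "more fragments" flag */
#define SKETCH_IP_OFFSET 0x1fff  /* fragment offset mask (8-byte units) */

/*
 * frag_off_be is the 16-bit flags/offset field of the IPv4 header,
 * still in network byte order, as it sits in the packet.
 */
bool sketch_ip_is_fragment(uint16_t frag_off_be)
{
	return (frag_off_be & htons(SKETCH_IP_MF | SKETCH_IP_OFFSET)) != 0;
}
```

A first fragment has MF set and offset 0, a last fragment has MF clear and a non-zero offset; both are caught by the mask, while a packet with only DF (0x4000) set is not.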

 net/ipv4/esp4_offload.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/ipv4/esp4_offload.c b/net/ipv4/esp4_offload.c
index 1de442632406..ab5faca28e19 100644
--- a/net/ipv4/esp4_offload.c
+++ b/net/ipv4/esp4_offload.c
@@ -38,6 +38,9 @@ static struct sk_buff **esp4_gro_receive(struct sk_buff **head,
 	__be32 spi;
 	int err;
 
+	if (ip_is_fragment(ip_hdr(skb)))
+		goto flush;
+
 	skb_pull(skb, offset);
 
 	if ((err = xfrm_parse_spi(skb, IPPROTO_ESP, &spi, &seq)) != 0)
@@ -78,6 +81,7 @@ static struct sk_buff **esp4_gro_receive(struct sk_buff **head,
 	return ERR_PTR(-EINPROGRESS);
 out:
 	skb_push(skb, offset);
+flush:
 	NAPI_GRO_CB(skb)->same_flow = 0;
 	NAPI_GRO_CB(skb)->flush = 1;
 
-- 
2.12.2

* Re: [PATCH v1 net-next 3/6] net: add new control message for incoming HW-timestamped packets
From: Miroslav Lichvar @ 2017-04-27 10:15 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, Richard Cochran, Willem de Bruijn,
	Soheil Hassas Yeganeh, Keller, Jacob E, Denny Page, Jiri Benc
In-Reply-To: <CAF=yD-+XKGnGASVN4+W2u2A7CBqS9CRN2Jkvx6UoPQQ6jLpMkQ@mail.gmail.com>

Thanks for the comments.

On Wed, Apr 26, 2017 at 07:34:49PM -0400, Willem de Bruijn wrote:
> > +struct net_device *dev_get_by_napi_id(unsigned int napi_id)
> > +{
> > +       struct net_device *dev = NULL;
> > +       struct napi_struct *napi;
> > +
> > +       rcu_read_lock();
> > +
> > +       napi = napi_by_id(napi_id);
> > +       if (napi)
> > +               dev = napi->dev;
> > +
> > +       rcu_read_unlock();
> > +
> > +       return dev;
> > +}
> > +EXPORT_SYMBOL(dev_get_by_napi_id);
> 
> Returning dev without holding a reference is not safe. You'll probably
> have to call this with rcu_read_lock held instead.

How about changing the function to simply return the index instead of
the device (e.g. dev_get_ifindex_by_napi_id())? Would that be too
specific?
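
A rough user-space sketch of that idea, with hypothetical stand-ins for the kernel structures (none of the `sketch_` names exist in the kernel): the index is copied out while the lookup result is still protected, so no device pointer escapes the critical section.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct net_device and struct napi_struct. */
struct sketch_net_device {
	int ifindex;
};

struct sketch_napi {
	unsigned int napi_id;
	struct sketch_net_device *dev;
};

static struct sketch_net_device sketch_eth0 = { .ifindex = 2 };
static struct sketch_napi sketch_napi_table[] = {
	{ .napi_id = 8193, .dev = &sketch_eth0 },
};

/* Stand-in for napi_by_id(): linear search over a tiny table. */
static struct sketch_napi *sketch_napi_by_id(unsigned int napi_id)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_napi_table) / sizeof(sketch_napi_table[0]); i++)
		if (sketch_napi_table[i].napi_id == napi_id)
			return &sketch_napi_table[i];
	return NULL;
}

/* Returns the interface index, or 0 if the NAPI ID is unknown. */
int sketch_ifindex_by_napi_id(unsigned int napi_id)
{
	struct sketch_napi *napi;
	int ifindex = 0;

	/* In the kernel, rcu_read_lock() would be taken here. */
	napi = sketch_napi_by_id(napi_id);
	if (napi && napi->dev)
		ifindex = napi->dev->ifindex;	/* copy the value, not the pointer */
	/* ...and rcu_read_unlock() here. */

	return ifindex;
}
```

Only a plain integer leaves the function, so the caller never dereferences a device that might have gone away in the meantime.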

> >  /*
> >   * called from sock_recv_timestamp() if sock_flag(sk, SOCK_RCVTSTAMP)
> >   */
> > @@ -699,8 +719,12 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
> >                 empty = 0;
> >         if (shhwtstamps &&
> >             (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
> 
> This information is also informative with software timestamps.

But is it useful and worth the cost? If I have two interfaces and only
one has HW timestamping, or just one interface which can timestamp
incoming packets at a limited rate, I would prefer to not waste CPU
cycles preparing and processing useless data.

> And getting the real iif is definitely useful outside timestamps.

Do you have an example? We have asked that in the original thread,
but no one suggested anything. For AF_PACKET there is PACKET_ORIGDEV.
When I was searching the Internet on how to get the index with INET
sockets, it looked like I was the only one who had this problem :).

> An
> alternative approach is to add versioning to IP_PKTINFO with a new
> setsockopt IP_PKTINFO_VERSION plus a new struct in_pktinfo_v2
> that extends in_pktinfo. Just a thought.

The struct would contain both the original and last interface index,
and the length as well? And similarly with in6_pktinfo?

If there is agreement that the information would also be useful for
things other than timestamping, I can try that. If not, I think it
would be better to keep it tied to HW timestamping.

-- 
Miroslav Lichvar

* Re: rhashtable - Cap total number of entries to 2^31
From: Herbert Xu @ 2017-04-27 10:13 UTC (permalink / raw)
  To: Florian Westphal; +Cc: David Miller, netdev, Thomas Graf
In-Reply-To: <20170427101134.GB448@breakpoint.cc>

On Thu, Apr 27, 2017 at 12:11:34PM +0200, Florian Westphal wrote:
>
> > +	/* Cap total entries at 2^31 to avoid nelems overflow. */
> > +	ht->max_elems = 1u << 31;
> > +	if (ht->p.max_size < ht->max_elems / 2)
> > +		ht->max_elems = ht->p.max_size * 2;
> > +
> 
> Looks like, instead of adding max_elems, you could have fixed this via
> 
> if (!ht->p.max_size)
> 	ht->p.max_size = INT_MAX / 2;
> 
> if (ht->p.max_size > INT_MAX / 2)
> 	return -EINVAL;

No, that doesn't do the same thing.  Setting max_size caps the table
size, and I don't want to do that at all.
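
To make the difference concrete, here is a small user-space sketch of the two approaches. The `sketch_` names are hypothetical, and the explicit check for a non-zero max_size in the first function is an assumption about the intent stated in the changelog ("When max_size is set max_elems will be lowered"), not a copy of the hunk.

```c
#include <limits.h>

/*
 * Element cap as described in the changelog: 2^31 by default, lowered
 * to twice max_size when max_size is set. max_size itself is untouched.
 */
unsigned int sketch_max_elems(unsigned int max_size)
{
	unsigned int max_elems = 1u << 31;

	/* Assumption: only lower the cap when max_size is actually set. */
	if (max_size && max_size < max_elems / 2)
		max_elems = max_size * 2;

	return max_elems;
}

/*
 * The suggested alternative: an unset max_size is replaced by a hard
 * table-size limit. Returns 0 on success, -1 where the suggestion
 * would return -EINVAL.
 */
int sketch_alt_init(unsigned int *max_size)
{
	if (!*max_size)
		*max_size = INT_MAX / 2;

	if (*max_size > INT_MAX / 2)
		return -1;

	return 0;
}
```

With the first function an unset max_size stays unset and only the element count is bounded; with the second, a table that previously had no size limit now has one.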

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: rhashtable - Cap total number of entries to 2^31
From: Florian Westphal @ 2017-04-27 10:11 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David Miller, fw, netdev, Thomas Graf
In-Reply-To: <20170427054451.GA529@gondor.apana.org.au>

Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Tue, Apr 25, 2017 at 10:48:22AM -0400, David Miller wrote:
> > From: Florian Westphal <fw@strlen.de>
> > Date: Tue, 25 Apr 2017 16:17:49 +0200
> > 
> > > I'd have less of an issue with this if we'd be talking about
> > > something computationally expensive, but this is about storing
> > > an extra value inside a struct just to avoid one "shr" in insert path...
> > 
> > Agreed, this shift is probably filling an available cpu cycle :-)
> 
> OK, but we need to have an extra field for another reason anyway.
> The problem is that we're not capping the total number of elements
> in the hashtable when max_size is not set, this means that nelems
> can overflow which will cause havoc with the automatic shrinking
> when it tries to fit 2^32 entries into a minimum-sized table.

Right, good catch.

I guess eventually we should get rid of min_size and max_size
completely as parameters and keep actual sizing/bucket count internal to
rhashtable.

In fact, I would not be surprised if some existing users set
max_size under the assumption that it is a 'max element count'.

> ---8<---
> When max_size is not set or if it is set to a sufficiently large
> value, the nelems counter can overflow.  This would cause havoc
> with the automatic shrinking as it would then attempt to fit a
> huge number of entries into a tiny hash table.
> 
> This patch fixes this by adding max_elems to struct rhashtable
> to cap the number of elements.  This is set to 2^31 as nelems is
> not a precise count.  This is sufficiently smaller than UINT_MAX
> that it should be safe.
> 
> When max_size is set max_elems will be lowered to at most twice
> max_size as is the status quo.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

[..]

> diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
> @@ -165,6 +166,7 @@ struct rhashtable {
>  	atomic_t			nelems;
>  	unsigned int			key_len;
>  	struct rhashtable_params	p;
> +	unsigned int			max_elems;
>  	bool				rhlist;
>  	struct work_struct		run_work;
>  	struct mutex                    mutex;
> @@ -327,8 +329,7 @@ static inline bool rht_grow_above_100(const struct rhashtable *ht,
>  static inline bool rht_grow_above_max(const struct rhashtable *ht,
>  				      const struct bucket_table *tbl)
>  {
> -	return ht->p.max_size &&
> -	       (atomic_read(&ht->nelems) / 2u) >= ht->p.max_size;
> +	return atomic_read(&ht->nelems) >= ht->max_elems;
>  }
>  
>  /* The bucket lock is selected based on the hash and protects mutations
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index f3b82e0..751630b 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -961,6 +961,11 @@ int rhashtable_init(struct rhashtable *ht,
>  	if (params->max_size)
>  		ht->p.max_size = rounddown_pow_of_two(params->max_size);
>  
> +	/* Cap total entries at 2^31 to avoid nelems overflow. */
> +	ht->max_elems = 1u << 31;
> +	if (ht->p.max_size < ht->max_elems / 2)
> +		ht->max_elems = ht->p.max_size * 2;
> +

Looks like, instead of adding max_elems, you could have fixed this via

if (!ht->p.max_size)
	ht->p.max_size = INT_MAX / 2;

if (ht->p.max_size > INT_MAX / 2)
	return -EINVAL;

* Re: [PATCH net-next 4/6] bpf: track if the bpf program was loaded with SYS_ADMIN capabilities
From: kbuild test robot @ 2017-04-27 10:09 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: kbuild-all, netdev, ast, daniel, jbenc, aconole
In-Reply-To: <20170426182419.14574-5-hannes@stressinduktion.org>

Hi Hannes,

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Hannes-Frederic-Sowa/bpf-list-all-loaded-ebpf-programs-in-proc-bpf-programs/20170427-090839
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

   include/linux/compiler.h:264:8: sparse: attribute 'no_sanitize_address': unknown attribute
   lib/test_bpf.c:407:29: sparse: subtraction of functions? Share your drugs
   lib/test_bpf.c:418:29: sparse: subtraction of functions? Share your drugs
>> lib/test_bpf.c:5613:36: sparse: not enough arguments for function bpf_prog_alloc
   lib/test_bpf.c: In function 'generate_filter':
   lib/test_bpf.c:5613:8: error: too few arguments to function 'bpf_prog_alloc'
      fp = bpf_prog_alloc(bpf_prog_size(flen), 0);
           ^~~~~~~~~~~~~~
   In file included from lib/test_bpf.c:20:0:
   include/linux/filter.h:619:18: note: declared here
    struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags,
                     ^~~~~~~~~~~~~~

vim +5613 lib/test_bpf.c

10f18e0ba Daniel Borkmann    2014-05-23  5597  				/* Verifier didn't reject the test that's
10f18e0ba Daniel Borkmann    2014-05-23  5598  				 * bad enough, just return!
10f18e0ba Daniel Borkmann    2014-05-23  5599  				 */
10f18e0ba Daniel Borkmann    2014-05-23  5600  				*err = -EINVAL;
10f18e0ba Daniel Borkmann    2014-05-23  5601  				return NULL;
10f18e0ba Daniel Borkmann    2014-05-23  5602  			}
10f18e0ba Daniel Borkmann    2014-05-23  5603  		}
10f18e0ba Daniel Borkmann    2014-05-23  5604  		/* We don't expect to fail. */
10f18e0ba Daniel Borkmann    2014-05-23  5605  		if (*err) {
10f18e0ba Daniel Borkmann    2014-05-23  5606  			pr_cont("FAIL to attach err=%d len=%d\n",
10f18e0ba Daniel Borkmann    2014-05-23  5607  				*err, fprog.len);
10f18e0ba Daniel Borkmann    2014-05-23  5608  			return NULL;
64a8946b4 Alexei Starovoitov 2014-05-08  5609  		}
10f18e0ba Daniel Borkmann    2014-05-23  5610  		break;
10f18e0ba Daniel Borkmann    2014-05-23  5611  
10f18e0ba Daniel Borkmann    2014-05-23  5612  	case INTERNAL:
60a3b2253 Daniel Borkmann    2014-09-02 @5613  		fp = bpf_prog_alloc(bpf_prog_size(flen), 0);
10f18e0ba Daniel Borkmann    2014-05-23  5614  		if (fp == NULL) {
10f18e0ba Daniel Borkmann    2014-05-23  5615  			pr_cont("UNEXPECTED_FAIL no memory left\n");
10f18e0ba Daniel Borkmann    2014-05-23  5616  			*err = -ENOMEM;
10f18e0ba Daniel Borkmann    2014-05-23  5617  			return NULL;
10f18e0ba Daniel Borkmann    2014-05-23  5618  		}
10f18e0ba Daniel Borkmann    2014-05-23  5619  
10f18e0ba Daniel Borkmann    2014-05-23  5620  		fp->len = flen;
4962fa10f Daniel Borkmann    2015-07-30  5621  		/* Type doesn't really matter here as long as it's not unspec. */

:::::: The code at line 5613 was first introduced by commit
:::::: 60a3b2253c413cf601783b070507d7dd6620c954 net: bpf: make eBPF interpreter images read-only

:::::: TO: Daniel Borkmann <dborkman@redhat.com>
:::::: CC: David S. Miller <davem@davemloft.net>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

* [PATCH net] xfrm: fix GRO for !CONFIG_NETFILTER
From: Sabrina Dubroca @ 2017-04-27 10:03 UTC (permalink / raw)
  To: netdev; +Cc: Sabrina Dubroca, Steffen Klassert, Herbert Xu

When xfrm_input() is called from GRO, async == 0, and we end up
skipping the processing in xfrm4_transport_finish(). The GRO path
always skips the NF_HOOK, so we don't need the special case for
!NETFILTER during GRO processing.
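
The control flow can be modelled in user space as follows. This is only a sketch: it assumes the !CONFIG_NETFILTER special case is an early return taken when the async argument is zero, as the description above implies, and all `sketch_` names are hypothetical.

```c
#include <stdbool.h>

/* Outcome of the modelled transport_finish step. */
enum sketch_finish {
	SKETCH_EARLY_RETURN,	/* header fix-ups skipped */
	SKETCH_FULL_PROCESSING,	/* header fix-ups performed */
};

/*
 * Models only the branch of interest: with netfilter compiled out,
 * a zero async argument takes the early return.
 */
enum sketch_finish sketch_transport_finish(bool have_netfilter, bool async)
{
	if (!have_netfilter && !async)
		return SKETCH_EARLY_RETURN;

	return SKETCH_FULL_PROCESSING;
}

/* Second argument passed by xfrm_input() before the patch. */
bool sketch_arg_before(bool xfrm_gro, bool async)
{
	(void)xfrm_gro;		/* GRO state was not taken into account */
	return async;
}

/* Second argument passed by xfrm_input() after the patch. */
bool sketch_arg_after(bool xfrm_gro, bool async)
{
	return xfrm_gro || async;
}
```

For a GRO packet without netfilter, the old argument leads to the early return and the new one to full processing; non-GRO synchronous packets behave the same either way.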

Fixes: 7785bba299a8 ("esp: Add a software GRO codepath")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 net/xfrm/xfrm_input.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 46bdb4fbed0b..e23570b647ae 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -395,7 +395,7 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
 		if (xo)
 			xfrm_gro = xo->flags & XFRM_GRO;
 
-		err = x->inner_mode->afinfo->transport_finish(skb, async);
+		err = x->inner_mode->afinfo->transport_finish(skb, xfrm_gro || async);
 		if (xfrm_gro) {
 			skb_dst_drop(skb);
 			gro_cells_receive(&gro_cells, skb);
-- 
2.12.2

* Re: stmmac still supporting spear600 ?
From: Thomas Petazzoni @ 2017-04-27  9:49 UTC (permalink / raw)
  To: Giuseppe CAVALLARO
  Cc: Alexandre Torgue, netdev, Viresh Kumar, Shiraz Hashim,
	linux-arm-kernel
In-Reply-To: <0468e2fb-5a7d-97ec-c51c-2436a13dda69@st.com>

Hello Giuseppe,

On Mon, 3 Apr 2017 08:16:50 +0200, Giuseppe CAVALLARO wrote:

> I tested the SMSC on other platform (+ stmmac), not on SPEAr.
> 
> ok for reset, keep the radar on clock. Hmm, can you attach a piece of 
> log file to see the failure?

We finally identified the issue: in a MII configuration, the PS bit
needs to be set for the DMA reset procedure to work, but setting the DMA
reset bit clears the PS bit. So you have to set the PS bit after
asserting the DMA reset, and before polling for the DMA reset bit to
clear.
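
The ordering can be illustrated with a toy register model in plain C. This is a sketch under assumptions: the `sketch_` names, the bit positions, and the modelled hardware behaviour (reset completes only when PS matches the PHY type) are illustrative, not taken from the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SFT_RESET (1u << 0)	/* illustrative bit positions */
#define SKETCH_PS        (1u << 15)

struct sketch_mac {
	uint32_t dma_bus_mode;
	uint32_t gmac_control;
	bool mii_phy;		/* true: MII PHY, false: GMII PHY */
};

/* Asserting the reset clears the MAC registers, including PS. */
static void sketch_assert_reset(struct sketch_mac *m)
{
	m->gmac_control = 0;
	m->dma_bus_mode |= SKETCH_SFT_RESET;
}

/* One poll: the model completes the reset only if PS matches the PHY. */
static bool sketch_poll_reset_done(struct sketch_mac *m)
{
	bool ps = (m->gmac_control & SKETCH_PS) != 0;

	if (ps == m->mii_phy)
		m->dma_bus_mode &= ~SKETCH_SFT_RESET;

	return !(m->dma_bus_mode & SKETCH_SFT_RESET);
}

/* Returns 0 when the reset bit clears, -1 on timeout. */
int sketch_dma_reset(bool mii_phy, bool set_ps_after_assert)
{
	struct sketch_mac m = { .mii_phy = mii_phy };
	int limit = 10;

	sketch_assert_reset(&m);

	/* The fix: program PS after asserting reset, before polling. */
	if (set_ps_after_assert && m.mii_phy)
		m.gmac_control |= SKETCH_PS;

	while (limit--)
		if (sketch_poll_reset_done(&m))
			return 0;

	return -1;
}
```

In this model, sketch_dma_reset(true, false) times out like the MII board did, while the other three combinations succeed, which matches what we observed on the two platforms.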

I have sent a fix that works for us (tested on GMII and MII platforms),
but I'm not sure whether the implementation is the most appropriate.
Let me know if you have better suggestions.

See: http://marc.info/?l=linux-netdev&m=149328635210461&w=2

Thanks!

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

* [PATCH] net: ethernet: stmmac: properly set PS bit in MII configurations during reset
From: Thomas Petazzoni @ 2017-04-27  9:45 UTC (permalink / raw)
  To: Giuseppe Cavallaro, Alexandre Torgue; +Cc: netdev, Thomas Petazzoni, stable

On the SPEAr600 SoC, which has the dwmac1000 variant of the IP block,
the DMA reset never succeeds when a MII PHY is used (no problem with a
GMII PHY). The dwmac_dma_reset() function sets the
DMA_BUS_MODE_SFT_RESET bit in the DMA_BUS_MODE register, and then
polls until this bit clears. When a MII PHY is used, with the current
driver, this bit never clears and the driver therefore doesn't work.

The reason is that the PS bit of the GMAC_CONTROL register must be
correctly configured for the DMA reset to work. When the PS bit is 0,
it tells the MAC we have a GMII PHY; when the PS bit is 1, it tells
the MAC we have a MII PHY.

Doing a DMA reset clears all registers, so the PS bit is cleared as
well. This makes the DMA reset work fine with a GMII PHY. However,
with a MII PHY, the PS bit must be set.

We have identified this issue thanks to two SPEAr600 platforms:

 - One equipped with a GMII PHY, with which the existing driver was
   working fine.

 - One equipped with a MII PHY, where the current driver fails because
   the DMA reset times out.

This patch fixes the problem for the MII PHY configuration, and has
been tested with a GMII PHY configuration as well.

In terms of implementation, since the ->reset() hook is implemented in
the DMA-related code, we do not want to touch the MAC registers
directly from this function. Therefore, a ->set_ps() hook has been
added to stmmac_ops, which gets called between the moment the reset is
asserted and the polling loop waiting for the reset bit to clear.

In order for this ->set_ps() hook to decide what to do, we pass it the
"struct mac_device_info" so it can access the MAC registers, and the
PHY interface type so it knows if we're using a MII PHY or not.

The ->set_ps() hook is only implemented for the dwmac1000 case.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: <stable@vger.kernel.org>
---
Do not hesitate to suggest ideas for alternative implementations, I'm
not sure if the current proposal is the one that fits best with the
current design of the driver.
---
 drivers/net/ethernet/stmicro/stmmac/common.h         | 12 +++++++++---
 drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c | 16 ++++++++++++++++
 drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h     |  3 ++-
 drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c     |  7 ++++++-
 drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h      |  3 ++-
 drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c      |  6 +++++-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c    |  3 ++-
 7 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 04d9245..d576f95 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -407,10 +407,13 @@ struct stmmac_desc_ops {
 extern const struct stmmac_desc_ops enh_desc_ops;
 extern const struct stmmac_desc_ops ndesc_ops;
 
+struct mac_device_info;
+
 /* Specific DMA helpers */
 struct stmmac_dma_ops {
 	/* DMA core initialization */
-	int (*reset)(void __iomem *ioaddr);
+	int (*reset)(void __iomem *ioaddr, struct mac_device_info *hw,
+		     phy_interface_t interface);
 	void (*init)(void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg,
 		     u32 dma_tx, u32 dma_rx, int atds);
 	/* Configure the AXI Bus Mode Register */
@@ -445,12 +448,15 @@ struct stmmac_dma_ops {
 	void (*enable_tso)(void __iomem *ioaddr, bool en, u32 chan);
 };
 
-struct mac_device_info;
-
 /* Helpers to program the MAC core */
 struct stmmac_ops {
 	/* MAC core initialization */
 	void (*core_init)(struct mac_device_info *hw, int mtu);
+	/* Set port select. Called between asserting DMA reset and
+	 * waiting for the reset bit to clear.
+	 */
+	void (*set_ps)(struct mac_device_info *hw,
+		       phy_interface_t interface);
 	/* Enable and verify that the IPC module is supported */
 	int (*rx_ipc)(struct mac_device_info *hw);
 	/* Enable RX Queues */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
index 19b9b308..dfcbb5b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
@@ -75,6 +75,21 @@ static void dwmac1000_core_init(struct mac_device_info *hw, int mtu)
 #endif
 }
 
+static void dwmac1000_set_ps(struct mac_device_info *hw,
+			     phy_interface_t interface)
+{
+	void __iomem *ioaddr = hw->pcsr;
+	u32 value = readl(ioaddr + GMAC_CONTROL);
+
+	/* When a MII PHY is used, we must set the PS bit for the DMA
+	 * reset to succeed.
+	 */
+	if (interface == PHY_INTERFACE_MODE_MII)
+		value |= GMAC_CONTROL_PS;
+
+	writel(value, ioaddr + GMAC_CONTROL);
+}
+
 static int dwmac1000_rx_ipc_enable(struct mac_device_info *hw)
 {
 	void __iomem *ioaddr = hw->pcsr;
@@ -488,6 +503,7 @@ static void dwmac1000_debug(void __iomem *ioaddr, struct stmmac_extra_stats *x)
 
 static const struct stmmac_ops dwmac1000_ops = {
 	.core_init = dwmac1000_core_init,
+	.set_ps = dwmac1000_set_ps,
 	.rx_ipc = dwmac1000_rx_ipc_enable,
 	.dump_regs = dwmac1000_dump_regs,
 	.host_irq_status = dwmac1000_irq_status,
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
index 1b06df7..e9c6c49 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
@@ -183,7 +183,8 @@
 #define DMA_CHAN0_DBG_STAT_RPS		GENMASK(11, 8)
 #define DMA_CHAN0_DBG_STAT_RPS_SHIFT	8
 
-int dwmac4_dma_reset(void __iomem *ioaddr);
+int dwmac4_dma_reset(void __iomem *ioaddr, struct mac_device_info *hw,
+		     phy_interface_t interface);
 void dwmac4_enable_dma_transmission(void __iomem *ioaddr, u32 tail_ptr);
 void dwmac4_enable_dma_irq(void __iomem *ioaddr);
 void dwmac410_enable_dma_irq(void __iomem *ioaddr);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
index c7326d5..485eecb 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
@@ -14,7 +14,8 @@
 #include "dwmac4_dma.h"
 #include "dwmac4.h"
 
-int dwmac4_dma_reset(void __iomem *ioaddr)
+int dwmac4_dma_reset(void __iomem *ioaddr, struct mac_device_info *hw,
+		     phy_interface_t interface)
 {
 	u32 value = readl(ioaddr + DMA_BUS_MODE);
 	int limit;
@@ -22,6 +23,10 @@ int dwmac4_dma_reset(void __iomem *ioaddr)
 	/* DMA SW reset */
 	value |= DMA_BUS_MODE_SFT_RESET;
 	writel(value, ioaddr + DMA_BUS_MODE);
+
+	if (hw->mac->set_ps)
+		hw->mac->set_ps(hw, interface);
+
 	limit = 10;
 	while (limit--) {
 		if (!(readl(ioaddr + DMA_BUS_MODE) & DMA_BUS_MODE_SFT_RESET))
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
index 56e485f..25ae028 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
@@ -144,6 +144,7 @@ void dwmac_dma_stop_tx(void __iomem *ioaddr);
 void dwmac_dma_start_rx(void __iomem *ioaddr);
 void dwmac_dma_stop_rx(void __iomem *ioaddr);
 int dwmac_dma_interrupt(void __iomem *ioaddr, struct stmmac_extra_stats *x);
-int dwmac_dma_reset(void __iomem *ioaddr);
+int dwmac_dma_reset(void __iomem *ioaddr, struct mac_device_info *hw,
+		    phy_interface_t interface);
 
 #endif /* __DWMAC_DMA_H__ */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
index e60bfca..1a17df5 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
@@ -23,7 +23,8 @@
 
 #define GMAC_HI_REG_AE		0x80000000
 
-int dwmac_dma_reset(void __iomem *ioaddr)
+int dwmac_dma_reset(void __iomem *ioaddr, struct mac_device_info *hw,
+		    phy_interface_t interface)
 {
 	u32 value = readl(ioaddr + DMA_BUS_MODE);
 	int err;
@@ -32,6 +33,9 @@ int dwmac_dma_reset(void __iomem *ioaddr)
 	value |= DMA_BUS_MODE_SFT_RESET;
 	writel(value, ioaddr + DMA_BUS_MODE);
 
+	if (hw->mac->set_ps)
+		hw->mac->set_ps(hw, interface);
+
 	err = readl_poll_timeout(ioaddr + DMA_BUS_MODE, value,
 				 !(value & DMA_BUS_MODE_SFT_RESET),
 				 100000, 10000);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 4498a38..66bc218 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1585,7 +1585,8 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
 	if (priv->extend_desc && (priv->mode == STMMAC_RING_MODE))
 		atds = 1;
 
-	ret = priv->hw->dma->reset(priv->ioaddr);
+	ret = priv->hw->dma->reset(priv->ioaddr, priv->hw,
+				   priv->plat->interface);
 	if (ret) {
 		dev_err(priv->device, "Failed to reset the dma\n");
 		return ret;
-- 
2.7.4

* Re: cpsw regression in mainline with "cpsw/netcp: cpts depends on posix_timers"
From: Arnd Bergmann @ 2017-04-27  9:44 UTC (permalink / raw)
  To: Tony Lindgren; +Cc: Grygorii Strashko, Networking, linux-omap
In-Reply-To: <20170424182358.GE3780@atomide.com>

On Mon, Apr 24, 2017 at 8:23 PM, Tony Lindgren <tony@atomide.com> wrote:
> * Arnd Bergmann <arnd@arndb.de> [170424 11:14]:
>> On Mon, Apr 24, 2017 at 7:44 PM, Tony Lindgren <tony@atomide.com> wrote:
>> > * Arnd Bergmann <arnd@arndb.de> [170424 10:38]:
>> >> On Mon, Apr 24, 2017 at 6:51 PM, Tony Lindgren <tony@atomide.com> wrote:
>> >> > Hi,
>> >> >
>> >> > Looks like commit 07fef3623407 ("cpsw/netcp: cpts depends on posix_timers")
>> >> > in mainline started triggering the following oops at least on j5eco-evm.
>> >> >
>> >> > Adding CONFIG_PTP_1588_CLOCK to .config solves it, but the oops hints
>> >> > something is wrong with the dependencies.. CONFIG_TI_CPTS defaults to N
>> >> > and not selecting it causes the oops.
>> >> >
>> >> > Any ideas what's needed to properly fix this?
>> >>
>> >> Does your configuration have POSIX_TIMERS enabled? If not, then CPTS
>> >> is now also disabled. There are two issues here that we need to solve:
>> >>
>> >> a) find out why POSIX_TIMERS got disabled, and make sure it's always
>> >>     turned on for normal users
>> >
>> > $ make omap2plus_defconfig
>> > ...
>> > CONFIG_POSIX_TIMERS=y
>> > # CONFIG_PTP_1588_CLOCK is not set
>> >
>> > So CONFIG_TI_CPTS is unselectable.
>>
>> Ok, so at least this one is easy to understand, it's a direct consequence
>> of my change, as nothing else will select CONFIG_PTP_1588_CLOCK
>> now.
>>
>> My first try would be this one:
>>
>> diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
>> index 9e631952b86f..8042a7626056 100644
>> --- a/drivers/net/ethernet/ti/Kconfig
>> +++ b/drivers/net/ethernet/ti/Kconfig
>> @@ -76,7 +76,7 @@ config TI_CPSW
>>  config TI_CPTS
>>   bool "TI Common Platform Time Sync (CPTS) Support"
>>   depends on TI_CPSW || TI_KEYSTONE_NETCP
>> - depends on PTP_1588_CLOCK
>> + depends on POSIX_TIMERS
>>   ---help---
>>    This driver supports the Common Platform Time Sync unit of
>>    the CPSW Ethernet Switch and Keystone 2 1g/10g Switch Subsystem.
>> @@ -87,6 +87,7 @@ config TI_CPTS_MOD
>>   tristate
>>   depends on TI_CPTS
>>   default y if TI_CPSW=y || TI_KEYSTONE_NETCP=y
>> + imply PTP_1588_CLOCK
>>   default m
>>
>>  config TI_KEYSTONE_NETCP
>
> Yup this produces a booting .config for me:

Unfortunately it seems it's not complete yet, I've run into
the old build error again after a few days of randconfig builds
with my patch applied:

drivers/net/ethernet/ti/cpts.o: In function `cpts_find_ts':
cpts.c:(.text.cpts_find_ts+0x20): undefined reference to `ptp_classify_raw'

I'll keep looking for a better fix.

      Arnd

* [PATCH iproute2] routel: fix infinite loop in line parser
From: Michal Kubecek @ 2017-04-27  9:43 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

As noticed by one of the few users of the routel script, it ends up in an
infinite loop when they pull the cable out of the NIC used for some
route. This is caused by its parser expecting each line of "ip route show"
output to consist of "key value" pairs (except for the initial target
range), together with an old trap of Bourne-style shells: "shift 2" does
nothing if there is only one argument left. Some keywords, e.g. "linkdown",
are not followed by a value.

Improve the parser to

  (1) only set variables for keywords we care about
  (2) recognize (currently) known keywords without value

This is still far from perfect (and certainly not future proof) but to
fully fix the script, one would probably have to rewrite the logic
completely (and I'm not sure it's worth the effort).
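
For illustration only, the same three-way classification can be modelled in C (the script itself stays a shell script; `sketch_route_field` and its helper are hypothetical names). The point of the sketch is that every branch advances the position, so a trailing keyword without a value cannot stall the loop.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if word appears in the NULL-terminated list. */
static int sketch_in_list(const char *word, const char *const *list)
{
	for (; *list; list++)
		if (strcmp(word, *list) == 0)
			return 1;
	return 0;
}

/*
 * Walks the tokens after the target range and returns the value that
 * follows the keyword `want`, or NULL if it is not present.
 */
const char *sketch_route_field(const char *const *tok, size_t n,
			       const char *want)
{
	static const char *const with_value[] = {
		"proto", "via", "dev", "scope", "src", "table", NULL
	};
	static const char *const flags[] = {
		"dead", "onlink", "pervasive", "offload", "notify",
		"linkdown", "unresolved", NULL
	};
	const char *found = NULL;
	size_t i = 0;

	while (i < n) {
		if (sketch_in_list(tok[i], with_value)) {
			if (i + 1 < n && strcmp(tok[i], want) == 0)
				found = tok[i + 1];
			i += 2;		/* keyword plus its value */
		} else if (sketch_in_list(tok[i], flags)) {
			i += 1;		/* keyword without a value */
		} else {
			i += 2;		/* unknown: assume it has a value */
		}
	}

	return found;
}
```

With the tokens "via 10.0.0.1 dev eth0 linkdown", asking for "dev" yields "eth0" and the trailing "linkdown" is simply skipped.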

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
---
 ip/routel | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/ip/routel b/ip/routel
index 8d1d352abdff..9a30462aa6b2 100644
--- a/ip/routel
+++ b/ip/routel
@@ -32,10 +32,22 @@ ip route list table "$@" |
     esac
     while test $# != 0
     do
-       key=$1
-       val=$2
-       eval "$key=$val"
-       shift 2
+       case "$1" in
+          proto|via|dev|scope|src|table)
+             key=$1
+             val=$2
+             eval "$key='$val'"
+             shift 2
+             ;;
+          dead|onlink|pervasive|offload|notify|linkdown|unresolved)
+             shift
+             ;;
+          *)
+             # avoid infinite loop on unknown keyword without value at line end
+             shift
+             shift
+             ;;
+       esac
     done
     echo "$network	$via	$src	$proto	$scope	$dev	$table"
  done | awk -F '	' '
-- 
2.12.2

* pull-request: wireless-drivers-next 2017-04-27
From: Kalle Valo @ 2017-04-27  9:41 UTC (permalink / raw)
  To: David Miller; +Cc: linux-wireless, netdev, linux-kernel

Hi Dave,

here's a pull request for net-next, more info in the tag below. This
should be the last pull request to net-next for 4.12. Please let me know
if there are any problems.

Kalle

The following changes since commit 7acedaf5c4355f812cfef883ac28bf15f7d9205e:

  net: move xdp_prog field in RX cache lines (2017-04-25 16:25:36 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git tags/wireless-drivers-next-for-davem-2017-04-27

for you to fetch changes up to 47d272f0f9887343f4e4d31bb22910b141b96654:

  Merge tag 'iwlwifi-next-for-kalle-2017-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next (2017-04-26 14:21:00 +0300)

----------------------------------------------------------------
wireless-drivers-next patches for 4.12

A few remaining patches for 4.12 submitted during the last week.

Major changes:

iwlwifi

* the firmware for 7265D and 3168 NICs is frozen at version 29

* more support for the upcoming A000 series

----------------------------------------------------------------
Colin Ian King (1):
      orinoco: fix spelling mistake: "Registerred" -> "Registered"

Dor Shaish (1):
      iwlwifi: mvm: freeze 7265D and 3168 on API version 29

Haim Dreyfuss (1):
      iwlwifi: mvm: Ignore wifi mcc update in the driver while associated

James Hughes (2):
      brcmfmac: Ensure pointer correctly set if skb data location changes
      brcmfmac: Make skb header writable before use

Johannes Berg (6):
      iwlwifi: mvm: make iwl_run_unified_mvm_ucode() static
      iwlwifi: mvm: avoid variable shadowing
      iwlwifi: pcie: remove superfluous trans->dev assignment
      iwlwifi: don't leak memory on allocation failure
      iwlwifi: remove module loading failure message
      iwlwifi: pcie: apply no-reclaim logic only to group 0

Kalle Valo (1):
      Merge tag 'iwlwifi-next-for-kalle-2017-04-26' of git://git.kernel.org/.../iwlwifi/iwlwifi-next

Larry Finger (1):
      rtlwifi: rtl8821ae: setup 8812ae RFE according to device type

Liad Kaufman (2):
      iwlwifi: pcie: support debug applying on a000 hw
      iwlwifi: gen2: support nmi triggering from host

Maksim Salau (1):
      orinoco_usb: Fix buffer on stack

Mordechai Goodstein (1):
      iwlwifi: mvm: scan: avoid "big" prints

Pan Bian (3):
      mt7601u: check return value of alloc_skb
      libertas: check return value of alloc_workqueue
      rndis_wlan: add return value validation

Sara Sharon (12):
      iwlwifi: mvm: support new rate flags
      iwlwifi: mvm: don't reserve queue in TVQM mode
      iwlwifi: mvm: map cab_queue to different txq_id
      iwlwifi: mvm: move internally to use bigger INVALID_TXQ
      iwlwifi: mvm: remove color definition
      iwlwifi: mvm: use defines instead of variables for shared dwell times
      iwlwifi: mvm: remove references to queue_info in new TX path
      iwlwifi: mvm: support station type API
      iwlwifi: move to 512 queues
      iwlwifi: rename wait_for_tx_queues_empty
      iwlwifi: mvm: memset binding before setting values
      iwlwifi: adjust NVM parsing APIs for new a000 method

Sharon Dvir (2):
      iwlwifi: mvm: check if returned value is NULL
      iwlwifi: mvm: handle possible BIOS bug

 .../wireless/broadcom/brcm80211/brcmfmac/core.c    |  23 +--
 drivers/net/wireless/intel/iwlwifi/dvm/lib.c       |   2 +-
 drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c  |   2 +-
 drivers/net/wireless/intel/iwlwifi/iwl-7000.c      |   4 +-
 drivers/net/wireless/intel/iwlwifi/iwl-a000.c      |   2 +-
 drivers/net/wireless/intel/iwlwifi/iwl-config.h    |   2 +-
 drivers/net/wireless/intel/iwlwifi/iwl-drv.c       |  15 +-
 drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h   |   2 +
 drivers/net/wireless/intel/iwlwifi/iwl-io.c        |   3 +
 drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c |  32 ++-
 drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.h |  16 +-
 drivers/net/wireless/intel/iwlwifi/iwl-prph.h      |   1 +
 drivers/net/wireless/intel/iwlwifi/iwl-trans.h     |  10 +-
 drivers/net/wireless/intel/iwlwifi/mvm/binding.c   |   4 +-
 .../net/wireless/intel/iwlwifi/mvm/fw-api-mac.h    |   5 +-
 drivers/net/wireless/intel/iwlwifi/mvm/fw-api-rs.h |  28 ++-
 .../net/wireless/intel/iwlwifi/mvm/fw-api-sta.h    |  38 ++--
 drivers/net/wireless/intel/iwlwifi/mvm/fw.c        | 161 ++++++++-------
 drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c  |  11 +-
 drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c  |  37 ++--
 drivers/net/wireless/intel/iwlwifi/mvm/mvm.h       |  10 +-
 drivers/net/wireless/intel/iwlwifi/mvm/nvm.c       |   9 +-
 drivers/net/wireless/intel/iwlwifi/mvm/ops.c       |   9 +-
 drivers/net/wireless/intel/iwlwifi/mvm/rs.c        |  11 +-
 drivers/net/wireless/intel/iwlwifi/mvm/rx.c        |   4 +-
 drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c      |   4 +-
 drivers/net/wireless/intel/iwlwifi/mvm/scan.c      |  86 +++-----
 drivers/net/wireless/intel/iwlwifi/mvm/sta.c       | 229 ++++++++++++++-------
 drivers/net/wireless/intel/iwlwifi/mvm/sta.h       |   7 +-
 drivers/net/wireless/intel/iwlwifi/mvm/tx.c        |  13 +-
 drivers/net/wireless/intel/iwlwifi/mvm/utils.c     |  96 ++++++---
 .../net/wireless/intel/iwlwifi/pcie/ctxt-info.c    |   4 +
 drivers/net/wireless/intel/iwlwifi/pcie/internal.h |   7 +-
 drivers/net/wireless/intel/iwlwifi/pcie/rx.c       |   2 +-
 drivers/net/wireless/intel/iwlwifi/pcie/trans.c    |   5 +-
 drivers/net/wireless/intersil/orinoco/main.c       |   2 +-
 .../net/wireless/intersil/orinoco/orinoco_usb.c    |  21 +-
 drivers/net/wireless/marvell/libertas/if_spi.c     |   5 +
 drivers/net/wireless/mediatek/mt7601u/mcu.c        |  10 +-
 .../net/wireless/realtek/rtlwifi/rtl8821ae/phy.c   | 122 +++++++++--
 .../net/wireless/realtek/rtlwifi/rtl8821ae/reg.h   |   1 +
 drivers/net/wireless/rndis_wlan.c                  |   4 +
 42 files changed, 671 insertions(+), 388 deletions(-)

^ permalink raw reply

* Re: [PATCH] net: macb: fix phy interrupt parsing
From: Nicolas Ferre @ 2017-04-27  9:36 UTC (permalink / raw)
  To: Alexandre Belloni, David S . Miller; +Cc: Bartosz Folta, netdev, linux-kernel
In-Reply-To: <20170426100628.14493-1-alexandre.belloni@free-electrons.com>

On 26/04/2017 at 12:06, Alexandre Belloni wrote:
> Since 83a77e9ec415, the phydev irq is explicitly set to PHY_POLL when
> there is no pdata. It doesn't work on DT enabled platforms because the
> phydev irq is already set by libphy before.
> 
> Fixes: 83a77e9ec415 ("net: macb: Added PCI wrapper for Platform Driver.")

Means 4.10+

> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>

Seems a good candidate for net stable.

Bye,

> ---
>  drivers/net/ethernet/cadence/macb.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> index e131ecd17ab9..004223a866fe 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -432,15 +432,17 @@ static int macb_mii_probe(struct net_device *dev)
>  	}
>  
>  	pdata = dev_get_platdata(&bp->pdev->dev);
> -	if (pdata && gpio_is_valid(pdata->phy_irq_pin)) {
> -		ret = devm_gpio_request(&bp->pdev->dev, pdata->phy_irq_pin,
> -					"phy int");
> -		if (!ret) {
> -			phy_irq = gpio_to_irq(pdata->phy_irq_pin);
> -			phydev->irq = (phy_irq < 0) ? PHY_POLL : phy_irq;
> +	if (pdata) {
> +		if (gpio_is_valid(pdata->phy_irq_pin)) {
> +			ret = devm_gpio_request(&bp->pdev->dev,
> +						pdata->phy_irq_pin, "phy int");
> +			if (!ret) {
> +				phy_irq = gpio_to_irq(pdata->phy_irq_pin);
> +				phydev->irq = (phy_irq < 0) ? PHY_POLL : phy_irq;
> +			}
> +		} else {
> +			phydev->irq = PHY_POLL;
>  		}
> -	} else {
> -		phydev->irq = PHY_POLL;
>  	}
>  
>  	/* attach the mac to the phy */
> 
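
For readers following the control flow, the effect of the quoted hunk can be
modelled outside the kernel. This is only a sketch: struct pdata_model and
the two functions are hypothetical stand-ins (not kernel code), and
libphy_irq represents whatever libphy already stored in phydev->irq.

```c
#include <stdbool.h>
#include <stddef.h>

#define PHY_POLL (-1)	/* same value the kernel uses for PHY_POLL */

/* Hypothetical stand-in for the platform data; not the kernel struct. */
struct pdata_model {
	bool gpio_valid;	/* models gpio_is_valid(pdata->phy_irq_pin) */
	bool request_ok;	/* models devm_gpio_request() returning 0 */
	int gpio_irq;		/* models gpio_to_irq(); negative on failure */
};

/* Before the patch: a missing pdata always forces polling. */
int phy_irq_old(const struct pdata_model *pdata, int libphy_irq)
{
	int irq = libphy_irq;

	if (pdata && pdata->gpio_valid) {
		if (pdata->request_ok)
			irq = (pdata->gpio_irq < 0) ? PHY_POLL : pdata->gpio_irq;
	} else {
		irq = PHY_POLL;	/* overwrites the irq libphy set from DT */
	}
	return irq;
}

/* After the patch: without pdata, the irq set by libphy is left alone. */
int phy_irq_new(const struct pdata_model *pdata, int libphy_irq)
{
	int irq = libphy_irq;

	if (pdata) {
		if (pdata->gpio_valid) {
			if (pdata->request_ok)
				irq = (pdata->gpio_irq < 0) ? PHY_POLL : pdata->gpio_irq;
		} else {
			irq = PHY_POLL;
		}
	}
	return irq;
}
```

With pdata == NULL (the DT case), phy_irq_old() returns PHY_POLL whatever
libphy chose, while phy_irq_new() returns libphy_irq unchanged.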


-- 
Nicolas Ferre

^ permalink raw reply

* Re: [PATCH v1 net-next 0/6] Extend socket timestamping API
From: Miroslav Lichvar @ 2017-04-27  9:28 UTC (permalink / raw)
  To: Richard Cochran
  Cc: netdev, Willem de Bruijn, Soheil Hassas Yeganeh, Keller, Jacob E,
	Denny Page, Jiri Benc
In-Reply-To: <20170426165435.GA21114@localhost.localdomain>

On Wed, Apr 26, 2017 at 06:54:35PM +0200, Richard Cochran wrote:
> On Wed, Apr 26, 2017 at 04:50:29PM +0200, Miroslav Lichvar wrote:
> > This patchset adds new options to the timestamping API that will be
> > useful for NTP implementations and possibly other applications.
> 
> Are there any userland ntp patches floating around to exercise the new
> HW time stamping option?

I'm not sure if anyone is working on support for HW timestamping in
ntp. With chrony, you can test it with experimental code in the
hwts branch of https://github.com/mlichvar/chrony.

$ ./configure --enable-debug && make
# ./chronyd -d -d 'hwtimestamp *' 'server pool.ntp.org iburst' |& grep TIMESTAMP
 TX SCM_TIMESTAMPING: swts=1493285228.512531924 hwts=0.000000000
 TX SCM_TIMESTAMPING: swts=0.000000000 hwts=1493285226.073103885
 SCM_TIMESTAMPING_PKTINFO: if=2 len=90
 RX SCM_TIMESTAMPING: swts=1493285228.530657351 hwts=1493285226.091054104
 TX SCM_TIMESTAMPING: swts=1493285230.553457791 hwts=0.000000000
 TX SCM_TIMESTAMPING: swts=0.000000000 hwts=1493285228.113705104
 SCM_TIMESTAMPING_PKTINFO: if=2 len=90
 RX SCM_TIMESTAMPING: swts=1493285230.582817079 hwts=1493285228.142890229

-- 
Miroslav Lichvar

^ permalink raw reply

* Re: [PATCH net] xfrm: calculate L4 checksums also for GSO case before encrypting packets
From: Steffen Klassert @ 2017-04-27  9:04 UTC (permalink / raw)
  To: Ansis Atteka; +Cc: Ansis Atteka, netdev
In-Reply-To: <CAMQa7Bgn58eWhwH-tOJcoNoF7ATs5yBpAyu1rR4JN5=8pBdaMA@mail.gmail.com>

On Fri, Apr 21, 2017 at 02:45:17PM -0700, Ansis Atteka wrote:
> 
> I removed Geneve tunneling from equation and tried to run a simple
> iperf underlay UDP test while IPsec was still enabled to observe
> issues with the udp4_ufo_fragment() case.
> 
> Unfortunately, as can be seen from kernel tracer output below, I was
> unable to come up with a test case where udp4_ufo_fragment function
> would ever be invoked while IPsec is enabled:
> 
> admin1@ubuntu1:~/xfrm_test/net$ ifconfig em2.4001 | grep "inet addr"
>           inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
> admin1@ubuntu1:~/xfrm_test/net$ ethtool -k em2.4001 | grep udp-fragmentation-offload
> udp-fragmentation-offload: on
> admin1@ubuntu1:~/xfrm_test/net$ sudo trace-cmd record -p function_graph -c -F iperf -c 192.168.1.2 -u -l20000
> admin1@ubuntu1:~/xfrm_test/net$ trace-cmd report | grep udp4
> admin1@ubuntu1:~/xfrm_test/net$
> 
> 
> Nevertheless, after disabling IPsec and leaving everything else the
> same, I start to see that udp4_ufo_fragment() gets invoked:
> 
> admin1@ubuntu1:~/xfrm_test/net$ trace-cmd report | grep udp4
>            iperf-25466 [004] 242431.203307: funcgraph_entry:        0.113 us   |                  udp4_hwcsum();
>            iperf-25466 [004] 242431.203360: funcgraph_entry:                   |                  udp4_ufo_fragment() {
>            iperf-25466 [004] 242431.508436: funcgraph_entry:        0.080 us   |                  udp4_hwcsum();
>            iperf-25466 [004] 242431.508542: funcgraph_entry:                   |                  udp4_ufo_fragment() {
> 
> 
> However, non-IPsec case really does not have this ESP packet
> corruption problem, because then the packets are in plain and can
> utilize checksum offloads. Do we really have a problem there for
> IPsec?

Probably not, at least locally generated packets don't do
ufo if they have an IPsec route. So it seems to be ok to
leave udp4_ufo_fragment as it is.

^ permalink raw reply

* Re: [PATCH v6 1/5] skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow
From: Jason A. Donenfeld @ 2017-04-27  9:21 UTC (permalink / raw)
  To: Netdev, LKML, David Laight, kernel-hardening, David Miller
  Cc: Jason A. Donenfeld
In-Reply-To: <20170425184734.26563-1-Jason@zx2c4.com>

Hey Dave,

David Laight and I have been discussing offlist. It occurred to both
of us that this could just be turned into a loop because perhaps this
is actually just tail-recursive. Upon further inspection, however, the
way the current algorithm works, it's possible that each of the
fraglist skbs has its own fraglist, which would make this into tree
recursion, which is why in the first place I wanted to place that
limit on it. If that's the case, then the patch I proposed above is
the best way forward. However, perhaps there's the chance that
fraglist skbs having separate fraglists are actually forbidden? Is
this the case? Are there other parts of the API that enforce this
contract? Is it something we could safely rely on here? If you say
yes, I'll send a v7 that makes this into a non-recursive loop.
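
To make the distinction concrete, here is a toy userspace model of the two
shapes. It is only a sketch: struct toy_skb, TOY_MAX_DEPTH and both functions
are invented for illustration and are not the code from the patch. The first
walk is tree-recursive and therefore needs a depth limit; the second is the
plain loop, which is only valid if no skb on a fraglist carries a fraglist
of its own.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for an skb; not the kernel structure. */
struct toy_skb {
	int nfrags;			/* sg entries this skb contributes */
	struct toy_skb *frag_list;	/* first child; may have its own frag_list */
	struct toy_skb *next;		/* next sibling on the parent's frag_list */
};

#define TOY_MAX_DEPTH 24	/* hypothetical limit, chosen for this sketch */

/* Tree-recursive walk: every child may recurse again, so bound the depth. */
int toy_count_sg(const struct toy_skb *skb, unsigned int level)
{
	const struct toy_skb *child;
	int total, ret;

	if (level >= TOY_MAX_DEPTH)
		return -EMSGSIZE;

	total = skb->nfrags;
	for (child = skb->frag_list; child; child = child->next) {
		ret = toy_count_sg(child, level + 1);
		if (ret < 0)
			return ret;
		total += ret;
	}
	return total;
}

/* Loop-only walk: correct only if nested frag_lists are forbidden. */
int toy_count_sg_flat(const struct toy_skb *skb)
{
	const struct toy_skb *child;
	int total = skb->nfrags;

	for (child = skb->frag_list; child; child = child->next) {
		if (child->frag_list)
			return -EINVAL;	/* the assumed contract is violated */
		total += child->nfrags;
	}
	return total;
}
```

If the contract holds, both functions return the same count and the recursion
never goes deeper than one level, which is what would make a v7 loop safe.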

Regards,
Jason

^ permalink raw reply

* Re: ipsec doesn't route TCP with 4.11 kernel
From: Steffen Klassert @ 2017-04-27  8:42 UTC (permalink / raw)
  To: Cong Wang
  Cc: Don Bowman, linux-kernel@vger.kernel.org, Herbert Xu,
	Linux Kernel Network Developers
In-Reply-To: <CAM_iQpWT5tF5+LpoTP98JNJ=440jEkxHFkn8=jtAsZgondN49A@mail.gmail.com>

On Wed, Apr 26, 2017 at 10:01:34PM -0700, Cong Wang wrote:
> (Cc'ing netdev and IPSec maintainers)
> 
> On Tue, Apr 25, 2017 at 6:08 PM, Don Bowman <db@donbowman.ca> wrote:
> > I'm not sure how to describe this.
> >
> > 4.11rc2 worked, after that, no.

We had some recent IPsec GRO changes, this could influence TCP.
But these changes were introduced before rc2. If I read this correctly,
the regression was introduced between rc2 and rc3, right?

> >
> > My ipsec tunnel comes up ok.

When talking about IPsec, I guess you use ESP, right?

> > ICMP works. UDP works. But TCP, the
> > sender [which is the ipsec client] does not reach the destination.
> >
> > It's not a routing rule issue (since ICMP/UDP work).
> > It's not a traffic selector just selecting TCP (I think) since ipsec
> > status shows just a subnet, no protocol.
> >
> > Using tcpdump:
> > # iptables -t mangle -I PREROUTING -m policy --pol ipsec --dir in -j NFLOG --nflog-group 5
> > # iptables -t mangle -I POSTROUTING -m policy --pol ipsec --dir out -j NFLOG --nflog-group 5
> > # tcpdump -s 0 -n -i nflog:5
> >
> > I see that it thinks it is sending the TCP packet, but the server end
> > does not receive.
> >
> > Does anyone have any suggestion to try?

If it is a GRO issue, then it is on the receive side, could you do
tcpdump on the receiving interface to see what you get there?

What does /proc/net/xfrm_stat show?

Can you do 'ip -s x s' to see if the SAs are used?

Do you have INET_ESP_OFFLOAD enabled?

^ permalink raw reply

* Re: [REGRESSION next-20170426] Commit 09515ef5ddad ("of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices") causes oops in mvneta
From: Sricharan R @ 2017-04-27  8:54 UTC (permalink / raw)
  To: Joerg Roedel, Ralph Sennhauser
  Cc: Rafael J. Wysocki, Bjorn Helgaas, linux-acpi, linux-kernel,
	linux-pci, Thomas Petazzoni, netdev
In-Reply-To: <20170427084427.GV5077@suse.de>

Hi Joerg,

On 4/27/2017 2:14 PM, Joerg Roedel wrote:
> Sricharan,
> 
> On Wed, Apr 26, 2017 at 06:15:08PM +0200, Ralph Sennhauser wrote:
>> Commit 09515ef5ddad ("of/acpi: Configure dma operations at probe time
>> for platform/amba/pci bus devices") causes a kernel panic as in the log
>> below on an armada-385. Reverting the commit fixes the issue.
> 
> Any insight here? I tend to revert the patch in my tree, or is there a
> quick and easy fix?

I am checking this manually to see what could be going wrong in the
driver. I could not draw a conclusion directly from the logs. I will need
some more testing help (which I will ask for) to root-cause this.

Regards,
 Sricharan

> 
> 
> 
> 	Joerg
> 

-- 
"QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply

* Re: [REGRESSION next-20170426] Commit 09515ef5ddad ("of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices") causes oops in mvneta
From: Joerg Roedel @ 2017-04-27  8:44 UTC (permalink / raw)
  To: Ralph Sennhauser
  Cc: Sricharan R, Rafael J. Wysocki, Bjorn Helgaas, linux-acpi,
	linux-kernel, linux-pci, Thomas Petazzoni, netdev
In-Reply-To: <20170426181508.687b52af@gmail.com>

Sricharan,

On Wed, Apr 26, 2017 at 06:15:08PM +0200, Ralph Sennhauser wrote:
> Commit 09515ef5ddad ("of/acpi: Configure dma operations at probe time
> for platform/amba/pci bus devices") causes a kernel panic as in the log
> below on an armada-385. Reverting the commit fixes the issue.

Any insight here? I tend to revert the patch in my tree, or is there a
quick and easy fix?



	Joerg

^ permalink raw reply

* Re: [PATCH] net: hso: register netdev later to avoid a race condition
From: Johan Hovold @ 2017-04-27  8:44 UTC (permalink / raw)
  To: Andreas Kemnade
  Cc: davem-fT/PcQaiUtIeIZ0/mPfg9Q, joe-6d6DIl74uiNBDgjK7y7TUQ,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	peter-WaGBZJeGNqdsbIuE7sb01tBPR1lH4CV8,
	hns-xXXSsgcRVICgSpxsJD1C4w, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1493227600-10102-1-git-send-email-andreas-cLv4Z9ELZ06ZuzBka8ofvg@public.gmane.org>

On Wed, Apr 26, 2017 at 07:26:40PM +0200, Andreas Kemnade wrote:
> If the netdev is accessed before the urbs are initialized,
> there will be NULL pointer dereferences. That is avoided by
> registering it when it is fully initialized.

> Reported-by: H. Nikolaus Schaller <hns-xXXSsgcRVICgSpxsJD1C4w@public.gmane.org>
> Signed-off-by: Andreas Kemnade <andreas-cLv4Z9ELZ06ZuzBka8ofvg@public.gmane.org>
> ---
>  drivers/net/usb/hso.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c
> index 93411a3..00067a0 100644
> --- a/drivers/net/usb/hso.c
> +++ b/drivers/net/usb/hso.c
> @@ -2534,13 +2534,6 @@ static struct hso_device *hso_create_net_device(struct usb_interface *interface,
>  	SET_NETDEV_DEV(net, &interface->dev);
>  	SET_NETDEV_DEVTYPE(net, &hso_type);
>  
> -	/* registering our net device */
> -	result = register_netdev(net);
> -	if (result) {
> -		dev_err(&interface->dev, "Failed to register device\n");
> -		goto exit;
> -	}
> -
>  	/* start allocating */
>  	for (i = 0; i < MUX_BULK_RX_BUF_COUNT; i++) {
>  		hso_net->mux_bulk_rx_urb_pool[i] = usb_alloc_urb(0, GFP_KERNEL);
> @@ -2560,6 +2553,13 @@ static struct hso_device *hso_create_net_device(struct usb_interface *interface,
>  
>  	add_net_device(hso_dev);
>  
> +	/* registering our net device */
> +	result = register_netdev(net);
> +	if (result) {
> +		dev_err(&interface->dev, "Failed to register device\n");
> +		goto exit;

This all looks good, but you should consider cleaning up the error
handling of this function as a follow-up as we should not be
deregistering netdevs that have never been registered (e.g. if a
required endpoint is missing or if registration fails for some reason).

But just to be clear, this problem existed also before this change.

> +	}
> +
>  	hso_log_port(hso_dev);
>  
>  	hso_create_rfkill(hso_dev, interface);

Reviewed-by: Johan Hovold <johan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

Thanks,
Johan

^ permalink raw reply

* Re: xdp_redirect ifindex vs port. Was: best API for returning/setting egress port?
From: Jesper Dangaard Brouer @ 2017-04-27  8:41 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: Alexei Starovoitov, John Fastabend, Alexei Starovoitov,
	Daniel Borkmann, Daniel Borkmann, netdev@vger.kernel.org,
	xdp-newbies@vger.kernel.org, brouer
In-Reply-To: <20170426205544.GA40859@C02RW35GFVH8>

On Wed, 26 Apr 2017 16:55:44 -0400
Andy Gospodarek <andy@greyhouse.net> wrote:

> On Wed, Apr 26, 2017 at 10:58:45AM -0700, Alexei Starovoitov wrote:
> > On 4/26/17 9:35 AM, John Fastabend wrote:  
> > >   
> > > > As Alexei also mentioned before, ifindex vs port makes no real
> > > > difference seen from the bpf program side.  It is userspace's
> > > > responsibility to add ifindex/port's to the bpf-maps, according to how
> > > > > the bpf program "policy" wants to "connect" these ports.  The
> > > > > port-table system adds one extra step, of also adding this port to the
> > > > port-table (which lives inside the kernel).
> > > >   
> > > 
> > > I'm not sure I understand the "lives inside the kernel" bit. I assumed
> > > the 'map' should be a bpf map and behave like any other bpf map.
> > > 
> > > I wanted a new map to be defined, something like this from the bpf programmer
> > > side.
> > > 
> > > > struct bpf_map_def SEC("maps") port_table = {
> > > 	.type = BPF_MAP_TYPE_PORT_CONNECTION,
> > > 	.key_size = sizeof(u32),
> > > 	.value_size = BPF_PORT_CONNECTION_SIZE,
> > > 	.max_entries = 256,
> > > };  
> > 
> > I like the idea.
> > We have prog_array, perf_event_array, cgroup_array map specializations.
> > This one can be new netdev_array with some new bpf_redirect-like helper
> > accessing it.
> >   
> > > > > When loading the XDP program, we also need to pass along a port table
> > > > > "id" this XDP program is associated with (and if it doesn't exist you
> > > > > create it).  And your userspace "control-plane" application also needs
> > > > > to know this port table "id", when adding a new port.
> > > 
> > > So the user space application that is loading the program also needs
> > > to handle this map. This seems correct to me. But I don't see the
> > > value in making some new port table when we already have well understood
> > > framework for maps.  
> > 
> > +1
> >   
> > > > 
> > > > The concept of having multiple port tables is key.  As this implies we
> > > > > can have several simultaneous "data-planes" that are *isolated* from
> > > > > each other.  Think about how network-namespaces/containers want
> > > > > isolation. A subtle thing I'm afraid to mention is that, as opposed to
> > > > > the ifindex model, a port table mapping to a net_device pointer would
> > > > allow (faster) delivery into the container's inner net_device, which
> > > > sort of violates the isolation, but I would argue it is not a problem
> > > > as this net_device pointer could only be added from a process within the
> > > > namespace.  I like this feature, but it could easily be disallowed via
> > > > port insertion-time validation.
> > > >   
> > > 
> > > I think the above optimization should be allowed. And agree multiple port
> > > tables (maps?) is needed. Again all this points to using standard maps
> > > logic in my mind. For permissions and different domains, which I think
> > > you were starting to touch on, it looks like we could extend the pinning API.
> > > At the moment it does an inode_permission(inode, MAY_WRITE) check but I
> > > presume this could be extended. None of this would be needed in v1 and
> > > could be added subsequently. read-only maps seems doable.  
> > 
> > this is great idea. Once BPF_MAP_TYPE_NETDEV_ARRAY is populated
> > the user space can make it readonly to prevent further changes.
> > 
> > From user space it can be done similar to perf_events/cgroups as well.
> > bpf_map_update_elem(&netdev_array, &port_num, &ifindex)
> > should work.
> > For bpf_map_lookup_elem() from such netdev_array we can return
> > ifindex back.
> > The bpf_map_show_fdinfo() can be customized as well to pretty print
> > ifindexes of netdevs stored in there.
> >   
> 
> I agree with both of you on all of these points.  Having the port
> redirection in a new type of map and/or array seems like the way to go.
> 
> I understood Jesper's perspective when thinking about a way to pass a
> port-table id down, but I think the idea that the userspace loader code
> defining the maps is going to be the one making this link is the right
> idea and handling things like ifindex changes (rather than identifiers
> that perform lookups in other tables) is going to have to be yet another
> exercise left up to the...user.  :-)
> 

I love this idea. Integrating the port table closer with the bpf-maps
infrastructure makes sense.  This gives me a place to hook the code into,
instead of (re)inventing a new infrastructure for this port table, and the
interface will be more natural from a bpf-API point of view.

When registering/attaching an XDP/bpf program, we would just send the
file-descriptor for this port-map along (like we do with the bpf_prog
FD), plus the program's own ingress-port number in the port-map.

It is not clear to me in which data structure on the kernel side we
should store this reference to the port-map and ingress-port, as today
we only have the "raw" struct bpf_prog pointer. I see several options:

1. Create a new xdp_prog struct that contains existing bpf_prog,
a port-map pointer and ingress-port. (IMHO easiest solution)

2. Just create a new pointer to port-map and store it in driver rx-ring
struct (like existing bpf_prog), but this creates a race challenge when
replacing (cmpxchg) the program (or perhaps it's not a problem as it
runs under rcu and RTNL-lock).

3. Extend bpf_prog to store this port-map and ingress-port, and have a
fast-way to access it.  I assume it will be accessible via
bpf_prog->bpf_prog_aux->used_maps[X] but it will be too slow for XDP.
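
To illustrate option 1 together with the read-only idea from the quoted
discussion, here is a toy userspace model. Everything in it is hypothetical:
the toy_* names, the layout and the error codes are invented for this sketch
and do not correspond to any existing kernel structure or helper.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PORT_MAX 256	/* mirrors max_entries = 256 in the quoted map */

/* Hypothetical port map: port number -> ifindex, 0 meaning "slot empty". */
struct toy_port_map {
	int ifindex[TOY_PORT_MAX];
	bool readonly;		/* set once userspace has populated the map */
};

struct toy_bpf_prog { int id; };	/* opaque stand-in for struct bpf_prog */

/* Option 1: wrap the program with its port-map and ingress port. */
struct toy_xdp_prog {
	struct toy_bpf_prog *prog;
	struct toy_port_map *port_map;
	unsigned int ingress_port;
};

/* Control-plane side: add a port, unless the map was made read-only. */
int toy_port_map_update(struct toy_port_map *map, unsigned int port, int ifindex)
{
	if (map->readonly)
		return -EPERM;
	if (port >= TOY_PORT_MAX || ifindex <= 0)
		return -EINVAL;
	map->ifindex[port] = ifindex;
	return 0;
}

/* Data-plane side: resolve an egress port to an ifindex, or fail. */
int toy_xdp_redirect(const struct toy_xdp_prog *xdp, unsigned int port)
{
	if (port >= TOY_PORT_MAX || xdp->port_map->ifindex[port] == 0)
		return -ENOENT;
	return xdp->port_map->ifindex[port];
}
```

The point of the wrapper is that the fast path reaches the port-map through
one pointer stored next to the program, instead of searching used_maps.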

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* RE: [PATCH net] driver/net: Fix possible memleaks when fail to register_netdevice
From: Gao Feng @ 2017-04-27  8:33 UTC (permalink / raw)
  To: 'Herbert Xu'
  Cc: jiri, davem, kuznet, jmorris, yoshfuji, kaber, steffen.klassert,
	netdev, 'Gao Feng'
In-Reply-To: <20170427081559.GA1058@gondor.apana.org.au>

> From: Herbert Xu [mailto:herbert@gondor.apana.org.au]
> Sent: Thursday, April 27, 2017 4:16 PM
> On Tue, Apr 25, 2017 at 08:01:50PM +0800, gfree.wind@foxmail.com wrote:
> > From: Gao Feng <fgao@ikuai8.com>
> >
[...]
> 
> This has the potential of creating future bugs, because there is no
> guarantee that the ndo_init function has been invoked at all.
> 
> Wouldn't it be safer to move the freeing from the destructors into their
> ndo_uninit functions instead?

I considered this solution, but I am not sure whether it is safe to move
the freeing from the destructors into ndo_uninit, because the freeing
would then happen earlier than it does in the destructor.
I am not sure whether that would break the design of the original drivers.

So far I have only tested the team driver; there it is OK to free all
memory in ndo_uninit.
Is it possible that anyone is still using the net_device after ndo_uninit?

If not, I would like to update the patch.
Could you give me some guidance, please?

Regards
Feng

> 
> Thanks,
> --

^ permalink raw reply

* Re: [PATCH 1/2] wcn36xx: Pass used skb to ieee80211_tx_status()
From: Johannes Berg @ 2017-04-27  8:22 UTC (permalink / raw)
  To: Bjorn Andersson, Eugene Krasnikov, Kalle Valo
  Cc: Andy Gross, David Brown, devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-soc-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	wcn36xx-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Nicolas Dechesne
In-Reply-To: <20170426220444.10539-1-bjorn.andersson-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>


> @@ -371,7 +371,7 @@ static void reap_tx_dxes(struct wcn36xx *wcn, struct wcn36xx_dxe_ch *ch)
>  			info = IEEE80211_SKB_CB(ctl->skb);
>  			if (!(info->flags & IEEE80211_TX_CTL_REQ_TX_STATUS)) {
>  				/* Keep frame until TX status comes */
> -				ieee80211_free_txskb(wcn->hw, ctl->skb);
> +				ieee80211_tx_status(wcn->hw, ctl->skb);
> 

I don't think this is a good idea. This code intentionally checked if
TX status was requested, and if not then it doesn't go to the effort of
building it.

As it is with your patch, it'll go and report the TX status without any
TX status information - which is handled in wcn36xx_dxe_tx_ack_ind()
for those frames needing it.

johannes

^ permalink raw reply

* Re: [PATCH net] driver/net: Fix possible memleaks when fail to register_netdevice
From: Herbert Xu @ 2017-04-27  8:15 UTC (permalink / raw)
  To: gfree.wind
  Cc: jiri, davem, kuznet, jmorris, yoshfuji, kaber, steffen.klassert,
	netdev, Gao Feng
In-Reply-To: <1493121710-27910-1-git-send-email-gfree.wind@foxmail.com>

On Tue, Apr 25, 2017 at 08:01:50PM +0800, gfree.wind@foxmail.com wrote:
> From: Gao Feng <fgao@ikuai8.com>
> 
> These drivers allocate various resources in their init routines, and free
> some of them in the destructor of the net_device. This may cause a memory
> leak when an error happens after register_netdevice has invoked the init
> callback, because the error handler of register_netdevice invokes only
> the uninit callback, not the destructor. As a result, some resources
> are lost forever.
> 
> Now invoke the destructor instead of free_netdev in some places, and free
> the remaining resources in the newlink function when register_netdevice
> fails.
> 
> Signed-off-by: Gao Feng <fgao@ikuai8.com>

This has the potential of creating future bugs, because there
is no guarantee that the ndo_init function has been invoked at
all.

Wouldn't it be safer to move the freeing from the destructors
into their ndo_uninit functions instead?
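
A toy userspace model of the difference, as a sketch only: the toy_* names
and the free_in_uninit switch are invented for illustration, and
toy_register() merely mimics the behaviour described in the quoted commit
message (on a late failure only the uninit callback runs, never the
destructor).

```c
#include <errno.h>
#include <stdlib.h>

/* Toy stand-in for a net_device with one init-time allocation. */
struct toy_netdev {
	void *priv_buf;		/* resource allocated by the init callback */
	int fail_after_init;	/* simulate a failure after init has run */
	int free_in_uninit;	/* 1: uninit frees; 0: only a destructor would */
};

int toy_init(struct toy_netdev *dev)
{
	dev->priv_buf = malloc(64);
	return dev->priv_buf ? 0 : -ENOMEM;
}

void toy_uninit(struct toy_netdev *dev)
{
	if (dev->free_in_uninit) {
		free(dev->priv_buf);
		dev->priv_buf = NULL;
	}
}

/* Mimics the described error path: uninit is called, the destructor is not. */
int toy_register(struct toy_netdev *dev)
{
	int err = toy_init(dev);

	if (err)
		return err;
	if (dev->fail_after_init) {
		toy_uninit(dev);
		return -EEXIST;	/* arbitrary error code for the sketch */
	}
	return 0;
}
```

With free_in_uninit == 0 the buffer is still allocated after a failed
toy_register(), which is the leak; with free_in_uninit == 1 it is gone, and
uninit only ever runs after init did.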

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* [PATCH net-next] net: bridge: Fix improper taking over HW learned FDB
From: Arkadi Sharshevsky @ 2017-04-27  8:08 UTC (permalink / raw)
  To: netdev
  Cc: davem, stephen, bridge, Arkadi Sharshevsky, Ido Schimmel,
	Nikolay Aleksandrov

Commit 7e26bf45e4cb ("net: bridge: allow SW learn to take over HW fdb
entries") added the ability to "take over an entry which was previously
learned via HW when it shows up from a SW port".

However, if an entry was learned via HW and then a control packet
(e.g., ARP request) was trapped to the CPU, the bridge driver will
update the entry and remove the externally learned flag, although the
entry is still present in HW. Instead, only clear the externally learned
flag in case of roaming.

Fixes: 7e26bf45e4cb ("net: bridge: allow SW learn to take over HW fdb entries")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Arkadi Sharashevsky <arkadis@mellanox.com>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
 net/bridge/br_fdb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 5a40a87..ab0c7cc 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -589,14 +589,14 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 			if (unlikely(source != fdb->dst)) {
 				fdb->dst = source;
 				fdb_modified = true;
+				/* Take over HW learned entry */
+				if (unlikely(fdb->added_by_external_learn))
+					fdb->added_by_external_learn = 0;
 			}
 			if (now != fdb->updated)
 				fdb->updated = now;
 			if (unlikely(added_by_user))
 				fdb->added_by_user = 1;
-			/* Take over HW learned entry */
-			if (unlikely(fdb->added_by_external_learn))
-				fdb->added_by_external_learn = 0;
 			if (unlikely(fdb_modified))
 				fdb_notify(br, fdb, RTM_NEWNEIGH);
 		}
-- 
2.4.11
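
For reviewers, the behavioural change can be modelled in userspace. This is a
sketch under simplifying assumptions: struct toy_fdb and both functions are
hypothetical, and only the dst and added_by_external_learn handling from the
hunk above is modelled.

```c
#include <stdbool.h>

/* Toy stand-in for an FDB entry; ports are plain integers here. */
struct toy_fdb {
	int dst;			/* port the address was last seen on */
	bool added_by_external_learn;	/* entry was learned by hardware */
};

/* Before the fix: any update from a SW port clears the HW-learned flag. */
bool toy_fdb_update_old(struct toy_fdb *fdb, int source)
{
	bool modified = false;

	if (source != fdb->dst) {
		fdb->dst = source;
		modified = true;
	}
	fdb->added_by_external_learn = false;
	return modified;	/* true when a notification would be sent */
}

/* After the fix: the flag is cleared only when the entry roams. */
bool toy_fdb_update_new(struct toy_fdb *fdb, int source)
{
	bool modified = false;

	if (source != fdb->dst) {
		fdb->dst = source;
		modified = true;
		fdb->added_by_external_learn = false;
	}
	return modified;
}
```

A trapped control packet arriving on the same port leaves the flag set with
the new logic, while a packet from a different port still takes the entry
over.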

^ permalink raw reply related
