Netdev List

* [PATCH 00/12] Netfilter/IPVS fixes for net
From: Pablo Neira Ayuso @ 2017-10-09 16:25 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:

1) Fix packet drops due to incorrect ECN handling in IPVS, from Vadim
   Fedorenko.

2) Fix splat with mark restoration in xt_socket with non-full-sock,
   patch from Subash Abhinov Kasiviswanathan.

3) ipset bogusly bails out when adding IPv4 range containing more than
   2^31 addresses, from Jozsef Kadlecsik.

4) Incorrect pernet unregistration order in ipset, from Florian Westphal.

5) Races between dump and swap in ipset results in BUG_ON splats, from
   Ross Lagerwall.

6) Fix chain renames in nf_tables, from JingPiao Chen.

7) Fix race in pernet codepath with ebtables table registration, from
   Artem Savkov.

8) Memory leak in error path in set name allocation in nf_tables, patch
   from Arvind Yadav.

9) Don't dump chain counters if they are not available, this fixes a
   crash when listing the ruleset.

10) Fix out of bound memory read in strlcpy() in x_tables compat code,
    from Eric Dumazet.

11) Make sure we only process TCP packets in SYNPROXY hooks, patch from
    Lin Zhang.

12) Cannot load rules incrementally anymore after xt_bpf with pinned
    objects, added in revision 1. From Shmulik Ladkani.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit b4391db42308c9940944b5d7be5ca4b78fb88dd0:

  netlink: fix nla_put_{u8,u16,u32} for KASAN (2017-09-25 20:18:27 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 98589a0998b8b13c4a8fa1ccb0e62751a019faa5:

  netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1' (2017-10-09 15:18:04 +0200)

----------------------------------------------------------------
Artem Savkov (1):
      netfilter: ebtables: fix race condition in frame_filter_net_init()

Arvind Yadav (1):
      netfilter: nf_tables: Release memory obtained by kasprintf

Eric Dumazet (1):
      netfilter: x_tables: avoid stack-out-of-bounds read in xt_copy_counters_from_user

Florian Westphal (1):
      netfilter: ipset: pernet ops must be unregistered last

JingPiao Chen (1):
      netfilter: nf_tables: fix update chain error

Jozsef Kadlecsik (1):
      netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addresses

Lin Zhang (1):
      netfilter: SYNPROXY: skip non-tcp packet in {ipv4, ipv6}_synproxy_hook

Pablo Neira Ayuso (1):
      netfilter: nf_tables: do not dump chain counters if not enabled

Ross Lagerwall (1):
      netfilter: ipset: Fix race between dump and swap

Shmulik Ladkani (1):
      netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'

Subash Abhinov Kasiviswanathan (1):
      netfilter: xt_socket: Restore mark from full sockets only

Vadim Fedorenko (1):
      netfilter: ipvs: full-functionality option for ECN encapsulation in tunnel

 include/linux/bpf.h                          |  5 +++++
 include/linux/netfilter_bridge/ebtables.h    |  7 ++++---
 include/uapi/linux/netfilter/xt_bpf.h        |  1 +
 kernel/bpf/inode.c                           |  1 +
 net/bridge/netfilter/ebtable_broute.c        |  4 ++--
 net/bridge/netfilter/ebtable_filter.c        |  4 ++--
 net/bridge/netfilter/ebtable_nat.c           |  4 ++--
 net/bridge/netfilter/ebtables.c              | 17 ++++++++--------
 net/ipv4/netfilter/ipt_SYNPROXY.c            |  3 ++-
 net/ipv6/netfilter/ip6t_SYNPROXY.c           |  2 +-
 net/netfilter/ipset/ip_set_core.c            | 29 +++++++++++++++++-----------
 net/netfilter/ipset/ip_set_hash_ip.c         | 22 +++++++++++----------
 net/netfilter/ipset/ip_set_hash_ipmark.c     |  2 +-
 net/netfilter/ipset/ip_set_hash_ipport.c     |  2 +-
 net/netfilter/ipset/ip_set_hash_ipportip.c   |  2 +-
 net/netfilter/ipset/ip_set_hash_ipportnet.c  |  4 ++--
 net/netfilter/ipset/ip_set_hash_net.c        |  2 +-
 net/netfilter/ipset/ip_set_hash_netiface.c   |  2 +-
 net/netfilter/ipset/ip_set_hash_netnet.c     |  4 ++--
 net/netfilter/ipset/ip_set_hash_netport.c    |  2 +-
 net/netfilter/ipset/ip_set_hash_netportnet.c |  4 ++--
 net/netfilter/ipvs/ip_vs_xmit.c              |  8 ++++++--
 net/netfilter/nf_tables_api.c                | 10 ++++++----
 net/netfilter/x_tables.c                     |  4 ++--
 net/netfilter/xt_bpf.c                       | 22 +++++++++++++++++++--
 net/netfilter/xt_socket.c                    |  4 ++--
 26 files changed, 107 insertions(+), 64 deletions(-)

^ permalink raw reply

* Re: [oss-drivers] Re: [PATCH net-next 0/7] nfp: extend match and action for flower offload
From: Simon Horman @ 2017-10-09 16:02 UTC (permalink / raw)
  To: Tom Herbert
  Cc: David Miller, Jakub Kicinski, Linux Kernel Network Developers,
	oss-drivers
In-Reply-To: <CALx6S34Akd2Z3TDaEKwG3-1BwSZ79bO4kweuLojw=00cg0t=Nw@mail.gmail.com>

On Mon, Oct 09, 2017 at 08:45:41AM -0700, Tom Herbert wrote:
> On Mon, Oct 9, 2017 at 1:05 AM, Simon Horman <simon.horman@netronome.com> wrote:
> > On Fri, Oct 06, 2017 at 08:34:59AM -0700, Tom Herbert wrote:
> >> Simon,
> >>
> >> Maybe a bit off topic, but I had the impression netronome would
> >> support BPF so that filters could be programmed for arbitrary
> >> protocols and fields. Is that true? If so, what is the relationship
> >> between that functionality and these patches?
> >
> > Hi Tom,
> >
> > you are correct in thinking that Netronome is supporting BPF offload
> > in its nfp driver. That support continues to be enhanced and supported.
> >
> > This patch-set relates to a different set of functionality, offload of the
> > TC flower classifier. At this point there is no relationship between the
> > two sets of functionality and they cannot be used at the same time;
> > different firmware images are required and the driver initiates itself
> > according to the firmware loaded.
> >
> > In future it may be possible to use both BPF and TC flower offloads at the
> > same time but that is not the case at this time.
> >
> > Does that answer your question?
> 
> Yes... A couple of follow up questions. If someone uses tc-bpf would
> that be offloaded to nfp? Is there anything that TC flower offloads
> can do that the BPF solution can't do?

I believe that the NFP driver also offloads tc-bpf.
Jakub can correct me if I am wrong.

I would expect that in general one can write BPF programs to
offload use-cases covered by the TC flower offloads.

^ permalink raw reply

* Re: [PATCH net-next 0/7] nfp: extend match and action for flower offload
From: Tom Herbert @ 2017-10-09 15:45 UTC (permalink / raw)
  To: Simon Horman
  Cc: David Miller, Jakub Kicinski, Linux Kernel Network Developers,
	oss-drivers
In-Reply-To: <20171009080545.GB11130@netronome.com>

On Mon, Oct 9, 2017 at 1:05 AM, Simon Horman <simon.horman@netronome.com> wrote:
> On Fri, Oct 06, 2017 at 08:34:59AM -0700, Tom Herbert wrote:
>> Simon,
>>
>> Maybe a bit off topic, but I had the impression netronome would
>> support BPF so that filters could be programmed for arbitrary
>> protocols and fields. Is that true? If so, what is the relationship
>> between that functionality and these patches?
>
> Hi Tom,
>
> you are correct in thinking that Netronome is supporting BPF offload
> in its nfp driver. That support continues to be enhanced and supported.
>
> This patch-set relates to a different set of functionality, offload of the
> TC flower classifier. At this point there is no relationship between the
> two sets of functionality and they cannot be used at the same time;
> different firmware images are required and the driver initiates itself
> according to the firmware loaded.
>
> In future it may be possible to use both BPF and TC flower offloads at the
> same time but that is not the case at this time.
>
> Does that answer your question?

Yes... A couple of follow up questions. If someone uses tc-bpf would
that be offloaded to nfp? Is there anything that TC flower offloads
can do that the BPF solution can't do?

Thanks,
Tom

^ permalink raw reply

* [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
From: Tim Hansen @ 2017-10-09 15:37 UTC (permalink / raw)
  To: davem
  Cc: willemb, edumazet, soheil, pabeni, elena.reshetova, tom, Jason,
	fw, netdev, linux-kernel, devtimhansen, alexander.levin

Fix BUG() calls to use BUG_ON(conditional) macros.

This was found using make coccicheck M=net/core on linux next
tag next-2017092

Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
---
 net/core/skbuff.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index d98c2e3ce2bf..34ce4c1a0f3c 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t gfp_mask)
 	/* Set the tail pointer and length */
 	skb_put(n, skb->len);
 
-	if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
-		BUG();
+	BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
 
 	copy_skb_header(n, skb);
 	return n;
@@ -1449,8 +1448,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
 
 	BUG_ON(nhead < 0);
 
-	if (skb_shared(skb))
-		BUG();
+	BUG_ON(skb_shared(skb));
 
 	size = SKB_DATA_ALIGN(size);
 
@@ -1595,9 +1593,8 @@ struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 		head_copy_off = newheadroom - head_copy_len;
 
 	/* Copy the linear header and data. */
-	if (skb_copy_bits(skb, -head_copy_len, n->head + head_copy_off,
-			  skb->len + head_copy_len))
-		BUG();
+	BUG_ON(skb_copy_bits(skb, -head_copy_len, n->head + head_copy_off,
+			     skb->len + head_copy_len));
 
 	copy_skb_header(n, skb);
 
@@ -1878,8 +1875,8 @@ void *__pskb_pull_tail(struct sk_buff *skb, int delta)
 			return NULL;
 	}
 
-	if (skb_copy_bits(skb, skb_headlen(skb), skb_tail_pointer(skb), delta))
-		BUG();
+	BUG_ON(skb_copy_bits(skb, skb_headlen(skb),
+			     skb_tail_pointer(skb), delta));
 
 	/* Optimization: no fragments, no reasons to preestimate
 	 * size of pulled pages. Superb.
-- 
2.14.2

^ permalink raw reply related
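
The conversion in the patch above can be illustrated with a minimal
userspace sketch. The BUG() and BUG_ON() macros below are hypothetical
stand-ins that bump a counter instead of halting (the kernel macros do
not behave this way), and copy_result() is an invented helper standing
in for a call such as skb_copy_bits(). The point is only that both
forms evaluate the condition exactly once and fire on the same inputs.

```c
#include <assert.h>

/* Counts how often the stand-in BUG() fired; the real kernel macro halts. */
static int bug_hits;

/* Hypothetical userspace stand-ins, for illustration only. */
#define BUG()        do { bug_hits++; } while (0)
#define BUG_ON(cond) do { if (cond) BUG(); } while (0)

/* Invented helper: returns non-zero on failure, like a copy routine would. */
static int copy_result(int rc)
{
	return rc;
}

/* Open-coded form, as it appeared before the patch. */
static void check_open_coded(int rc)
{
	if (copy_result(rc))
		BUG();
}

/* Condensed form, as it appears after the patch. */
static void check_bug_on(int rc)
{
	BUG_ON(copy_result(rc));
}

/* Both forms stay silent on success and fire once on failure. */
static void demo_bug_on(void)
{
	bug_hits = 0;
	check_open_coded(0);
	check_bug_on(0);
	assert(bug_hits == 0);
	check_open_coded(-14);
	check_bug_on(-14);
	assert(bug_hits == 2);
}
```

One consequence of the condensed form is that the call with side
effects now sits inside the macro argument, so it only works because
the condition is always evaluated.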

* Re: [PATCH net] net: enable interface alias removal via rtnl
From: Nicolas Dichtel @ 2017-10-09 15:25 UTC (permalink / raw)
  To: David Ahern, Oliver Hartkopp, davem
  Cc: netdev, Oliver Hartkopp, Stephen Hemminger
In-Reply-To: <ffd17810-8d65-04f5-6841-98c74201780c@gmail.com>

Le 09/10/2017 à 16:02, David Ahern a écrit :
> On 10/9/17 2:23 AM, Nicolas Dichtel wrote:
>> Le 06/10/2017 à 22:10, Oliver Hartkopp a écrit :
>>>
>>>
>>> On 10/06/2017 08:18 PM, David Ahern wrote:
>>>> On 10/5/17 4:19 AM, Nicolas Dichtel wrote:
>>>>> IFLA_IFALIAS is defined as NLA_STRING. It means that the minimal length of
>>>>> the attribute is 1 ("\0"). However, to remove an alias, the attribute
>>>>> length must be 0 (see dev_set_alias()).
>>>>
>>>> why not add a check in dev_set_alias that if len is 1 and the 1
>>>> character is '\0' it means remove the alias?
>> Because it requires an iproute2 patch. iproute2 doesn't send the '\0'. With the
>> command 'ip link set dummy0 alias ""', the attribute length is 0.
> 
> iproute2 needs the feature for 0-len strings or perhaps a 'noalias' option.
iproute2 needs nothing ...

> 
> You can reset the alias using the sysfs file. Given that there is a
> workaround for existing kernels and userspace, upstream can get fixed
> without changing the UAPI.
> 
I don't get the point with the UAPI. What will be broken?
I don't see why allowing an attribute with no data is a problem.

^ permalink raw reply

* [net 2/5] ixgbe: fix masking of bits read from IXGBE_VXLANCTRL register
From: Jeff Kirsher @ 2017-10-09 15:12 UTC (permalink / raw)
  To: davem; +Cc: Sabrina Dubroca, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <20171009151251.53939-1-jeffrey.t.kirsher@intel.com>

From: Sabrina Dubroca <sd@queasysnail.net>

In ixgbe_clear_udp_tunnel_port(), we read the IXGBE_VXLANCTRL register
and then try to mask some bits out of the value, but use the logical
AND operator (&&) instead of the bitwise AND operator (&).

Fixes: a21d0822ff69 ("ixgbe: add support for geneve Rx offload")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 822cdb4f2c25..4d76afd13868 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -4881,7 +4881,7 @@ static void ixgbe_clear_udp_tunnel_port(struct ixgbe_adapter *adapter, u32 mask)
 				IXGBE_FLAG_GENEVE_OFFLOAD_CAPABLE)))
 		return;
 
-	vxlanctrl = IXGBE_READ_REG(hw, IXGBE_VXLANCTRL) && ~mask;
+	vxlanctrl = IXGBE_READ_REG(hw, IXGBE_VXLANCTRL) & ~mask;
 	IXGBE_WRITE_REG(hw, IXGBE_VXLANCTRL, vxlanctrl);
 
 	if (mask & IXGBE_VXLANCTRL_VXLAN_UDPPORT_MASK)
-- 
2.14.2

^ permalink raw reply related
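
A small standalone sketch of why the one-character change above
matters. The function names and the register value are invented for
illustration; only the two operators come from the patch. With '&&'
the result collapses to 0 or 1, so writing it back would wipe the
register contents instead of clearing the masked bits.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy form: '&&' is a truth test, so the result is only ever 0 or 1. */
static uint32_t mask_logical(uint32_t reg, uint32_t mask)
{
	return reg && ~mask;
}

/* Fixed form: '&' keeps every bit of reg that is not set in mask. */
static uint32_t mask_bitwise(uint32_t reg, uint32_t mask)
{
	return reg & ~mask;
}

/* Example with an arbitrary register value and a low-16-bit mask. */
static void demo_masking(void)
{
	uint32_t reg = 0x12B512B5u;
	uint32_t mask = 0x0000FFFFu;

	assert(mask_logical(reg, mask) == 1u);          /* value destroyed */
	assert(mask_bitwise(reg, mask) == 0x12B50000u); /* high half kept */
}
```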

* [net 3/5] Revert commit 1a8b6d76dc5b ("net:add one common config...")
From: Jeff Kirsher @ 2017-10-09 15:12 UTC (permalink / raw)
  To: davem
  Cc: Ding Tianhong, linux-kernel, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <20171009151251.53939-1-jeffrey.t.kirsher@intel.com>

From: Ding Tianhong <dingtianhong@huawei.com>

The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added
to indicate that Relaxed Ordering Attributes (RO) should not
be used for Transaction Layer Packets (TLPs) targeted toward
the affected Root Ports, and bit 4 in the PCIe Device Control
register is then cleared, so PCIe device drivers can query
PCIe configuration space to determine whether they can send
TLPs to the Root Port with the Relaxed Ordering Attributes set.

With this new flag we no longer need the config ARCH_WANT_RELAX_ORDER
to control the Relaxed Ordering Attributes for the ixgbe driver,
as commit 1a8b6d76dc5b ("net:add one common config...") did,
so revert that commit.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 arch/Kconfig                                    | 3 ---
 arch/sparc/Kconfig                              | 1 -
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 2 +-
 3 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 1aafb4efbb51..d789a89cb32c 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -937,9 +937,6 @@ config STRICT_MODULE_RWX
 	  and non-text memory will be made non-executable. This provides
 	  protection against certain security exploits (e.g. writing to text)
 
-config ARCH_WANT_RELAX_ORDER
-	bool
-
 config ARCH_HAS_REFCOUNT
 	bool
 	help
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 0be3828752e5..4e83f950713e 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -44,7 +44,6 @@ config SPARC
 	select ARCH_HAS_SG_CHAIN
 	select CPU_NO_EFFICIENT_FFS
 	select LOCKDEP_SMALL if LOCKDEP
-	select ARCH_WANT_RELAX_ORDER
 
 config SPARC32
 	def_bool !64BIT
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
index 2c19070d2a0b..e8c1788aed1f 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
@@ -366,7 +366,7 @@ s32 ixgbe_start_hw_gen2(struct ixgbe_hw *hw)
 	}
 	IXGBE_WRITE_FLUSH(hw);
 
-#ifndef CONFIG_ARCH_WANT_RELAX_ORDER
+#ifndef CONFIG_SPARC
 	/* Disable relaxed ordering */
 	for (i = 0; i < hw->mac.max_tx_queues; i++) {
 		u32 regval;
-- 
2.14.2

^ permalink raw reply related

* [net 4/5] net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
From: Jeff Kirsher @ 2017-10-09 15:12 UTC (permalink / raw)
  To: davem; +Cc: Ding Tianhong, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009151251.53939-1-jeffrey.t.kirsher@intel.com>

From: Ding Tianhong <dingtianhong@huawei.com>

The ixgbe driver uses a compile-time check to determine whether it can
send TLPs to the Root Port with the Relaxed Ordering Attribute set,
which is inconvenient. Now that the new flag
PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added to the kernel, we can
check bit 4 in the PCIe Device Control register to determine whether
to use the Relaxed Ordering Attributes, so do that in the ixgbe driver.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c  | 22 ----------------------
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 19 -------------------
 2 files changed, 41 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c
index 523f9d05a810..8a32eb7d47b9 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c
@@ -175,31 +175,9 @@ static s32 ixgbe_init_phy_ops_82598(struct ixgbe_hw *hw)
  **/
 static s32 ixgbe_start_hw_82598(struct ixgbe_hw *hw)
 {
-#ifndef CONFIG_SPARC
-	u32 regval;
-	u32 i;
-#endif
 	s32 ret_val;
 
 	ret_val = ixgbe_start_hw_generic(hw);
-
-#ifndef CONFIG_SPARC
-	/* Disable relaxed ordering */
-	for (i = 0; ((i < hw->mac.max_tx_queues) &&
-	     (i < IXGBE_DCA_MAX_QUEUES_82598)); i++) {
-		regval = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL(i));
-		regval &= ~IXGBE_DCA_TXCTRL_DESC_WRO_EN;
-		IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL(i), regval);
-	}
-
-	for (i = 0; ((i < hw->mac.max_rx_queues) &&
-	     (i < IXGBE_DCA_MAX_QUEUES_82598)); i++) {
-		regval = IXGBE_READ_REG(hw, IXGBE_DCA_RXCTRL(i));
-		regval &= ~(IXGBE_DCA_RXCTRL_DATA_WRO_EN |
-			    IXGBE_DCA_RXCTRL_HEAD_WRO_EN);
-		IXGBE_WRITE_REG(hw, IXGBE_DCA_RXCTRL(i), regval);
-	}
-#endif
 	if (ret_val)
 		return ret_val;
 
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
index e8c1788aed1f..6e6ab6f6875e 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
@@ -366,25 +366,6 @@ s32 ixgbe_start_hw_gen2(struct ixgbe_hw *hw)
 	}
 	IXGBE_WRITE_FLUSH(hw);
 
-#ifndef CONFIG_SPARC
-	/* Disable relaxed ordering */
-	for (i = 0; i < hw->mac.max_tx_queues; i++) {
-		u32 regval;
-
-		regval = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL_82599(i));
-		regval &= ~IXGBE_DCA_TXCTRL_DESC_WRO_EN;
-		IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL_82599(i), regval);
-	}
-
-	for (i = 0; i < hw->mac.max_rx_queues; i++) {
-		u32 regval;
-
-		regval = IXGBE_READ_REG(hw, IXGBE_DCA_RXCTRL(i));
-		regval &= ~(IXGBE_DCA_RXCTRL_DATA_WRO_EN |
-			    IXGBE_DCA_RXCTRL_HEAD_WRO_EN);
-		IXGBE_WRITE_REG(hw, IXGBE_DCA_RXCTRL(i), regval);
-	}
-#endif
 	return 0;
 }
 
-- 
2.14.2

^ permalink raw reply related

* [net 5/5] ixgbe: incorrect XDP ring accounting in ethtool tx_frame param
From: Jeff Kirsher @ 2017-10-09 15:12 UTC (permalink / raw)
  To: davem; +Cc: John Fastabend, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009151251.53939-1-jeffrey.t.kirsher@intel.com>

From: John Fastabend <john.fastabend@gmail.com>

Changing the TX ring parameters with an XDP program attached may
cause the XDP queues to be cleared and the TX rings to be incorrectly
configured.

Fix by doing correct ring accounting in setup call.

Fixes: 33fdc82f0883 ("ixgbe: add support for XDP_TX action")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index 72c565712a5f..c3e7a8191128 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -1048,7 +1048,7 @@ static int ixgbe_set_ringparam(struct net_device *netdev,
 {
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
 	struct ixgbe_ring *temp_ring;
-	int i, err = 0;
+	int i, j, err = 0;
 	u32 new_rx_count, new_tx_count;
 
 	if ((ring->rx_mini_pending) || (ring->rx_jumbo_pending))
@@ -1085,8 +1085,8 @@ static int ixgbe_set_ringparam(struct net_device *netdev,
 	}
 
 	/* allocate temporary buffer to store rings in */
-	i = max_t(int, adapter->num_tx_queues, adapter->num_rx_queues);
-	i = max_t(int, i, adapter->num_xdp_queues);
+	i = max_t(int, adapter->num_tx_queues + adapter->num_xdp_queues,
+		  adapter->num_rx_queues);
 	temp_ring = vmalloc(i * sizeof(struct ixgbe_ring));
 
 	if (!temp_ring) {
@@ -1118,8 +1118,8 @@ static int ixgbe_set_ringparam(struct net_device *netdev,
 			}
 		}
 
-		for (i = 0; i < adapter->num_xdp_queues; i++) {
-			memcpy(&temp_ring[i], adapter->xdp_ring[i],
+		for (j = 0; j < adapter->num_xdp_queues; j++, i++) {
+			memcpy(&temp_ring[i], adapter->xdp_ring[j],
 			       sizeof(struct ixgbe_ring));
 
 			temp_ring[i].count = new_tx_count;
@@ -1139,10 +1139,10 @@ static int ixgbe_set_ringparam(struct net_device *netdev,
 			memcpy(adapter->tx_ring[i], &temp_ring[i],
 			       sizeof(struct ixgbe_ring));
 		}
-		for (i = 0; i < adapter->num_xdp_queues; i++) {
-			ixgbe_free_tx_resources(adapter->xdp_ring[i]);
+		for (j = 0; j < adapter->num_xdp_queues; j++, i++) {
+			ixgbe_free_tx_resources(adapter->xdp_ring[j]);
 
-			memcpy(adapter->xdp_ring[i], &temp_ring[i],
+			memcpy(adapter->xdp_ring[j], &temp_ring[i],
 			       sizeof(struct ixgbe_ring));
 		}
 
-- 
2.14.2

^ permalink raw reply related
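
The indexing change in the patch above can be sketched in isolation.
This is a simplified model, not the driver code: rings are reduced to
plain ints, and temp_ring_slots() and stage_rings() are invented names.
It shows the two points the diff makes: the scratch array must hold the
TX and XDP rings side by side, and the XDP loop needs its own source
index while the destination index keeps counting.

```c
#include <assert.h>
#include <stddef.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Scratch size mirroring the max_t() expression in the patch. */
static size_t temp_ring_slots(size_t num_tx, size_t num_xdp, size_t num_rx)
{
	return MAX(num_tx + num_xdp, num_rx);
}

/* Stage TX entries first, then XDP entries behind them; returns slots used. */
static size_t stage_rings(int *temp, const int *tx, size_t num_tx,
			  const int *xdp, size_t num_xdp)
{
	size_t i, j;

	for (i = 0; i < num_tx; i++)
		temp[i] = tx[i];

	/* i is not reset, so XDP entries do not overwrite temp[0..num_tx). */
	for (j = 0; j < num_xdp; j++, i++)
		temp[i] = xdp[j];

	return i;
}

/* Three TX rings and two XDP rings end up in five distinct slots. */
static void demo_ring_staging(void)
{
	int tx[] = { 10, 11, 12 };
	int xdp[] = { 20, 21 };
	int temp[8] = { 0 };

	assert(temp_ring_slots(3, 2, 4) == 5);
	assert(stage_rings(temp, tx, 3, xdp, 2) == 5);
	assert(temp[0] == 10 && temp[2] == 12);
	assert(temp[3] == 20 && temp[4] == 21);
}
```

Resetting the destination index to 0 for the second loop, as the old
code did, would have copied the XDP entries over the staged TX entries.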

* [net 0/5][pull request] Intel Wired LAN Driver Updates 2017-10-09
From: Jeff Kirsher @ 2017-10-09 15:12 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene

This series contains updates to ixgbe and arch/Kconfig.

Mark fixes a case where PHY register access is not supported and we were
returning a PHY address, when we should have been returning -EOPNOTSUPP.

Sabrina Dubroca fixes the use of a logical "and" when it should have been
the bitwise "and" operator.

Ding Tianhong reverts the commit that added the Kconfig bool option
ARCH_WANT_RELAX_ORDER, since there is now a new flag
PCI_DEV_FLAGS_NO_RELAXED_ORDERING that has been added to indicate that
Relaxed Ordering Attributes should not be used for Transaction Layer
Packets.  Then follows up with making the needed changes to ixgbe to
use the new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag.

John Fastabend fixes an issue in the ring accounting when the transmit
ring parameters are changed via ethtool when an XDP program is attached.

The following are changes since commit a9e2971b8cd3ef469de0112ba15778b5b98ad72e:
  tipc: Unclone message at secondary destination lookup
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue 10GbE

Ding Tianhong (2):
  Revert commit 1a8b6d76dc5b ("net:add one common config...")
  net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag

John Fastabend (1):
  ixgbe: incorrect xdp ring accounting in ethtool tx_frame param

Mark D Rustad (1):
  ixgbe: Return error when getting PHY address if PHY access is not
    supported

Sabrina Dubroca (1):
  ixgbe: fix masking of bits read from IXGBE_VXLANCTRL register

 arch/Kconfig                                     |  3 ---
 arch/sparc/Kconfig                               |  1 -
 drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c   | 22 ----------------------
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c  | 19 -------------------
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 16 ++++++++--------
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |  6 +++++-
 6 files changed, 13 insertions(+), 54 deletions(-)

-- 
2.14.2

^ permalink raw reply

* [net 1/5] ixgbe: Return error when getting PHY address if PHY access is not supported
From: Jeff Kirsher @ 2017-10-09 15:12 UTC (permalink / raw)
  To: davem; +Cc: Mark D Rustad, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009151251.53939-1-jeffrey.t.kirsher@intel.com>

From: Mark D Rustad <mark.d.rustad@intel.com>

In cases where PHY register access is not supported, don't mislead
a caller into thinking that it is supported by returning a PHY
address. Instead, return -EOPNOTSUPP when PHY access is not
supported.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index d962368d08d0..822cdb4f2c25 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -8529,6 +8529,10 @@ static int ixgbe_ioctl(struct net_device *netdev, struct ifreq *req, int cmd)
 		return ixgbe_ptp_set_ts_config(adapter, req);
 	case SIOCGHWTSTAMP:
 		return ixgbe_ptp_get_ts_config(adapter, req);
+	case SIOCGMIIPHY:
+		if (!adapter->hw.phy.ops.read_reg)
+			return -EOPNOTSUPP;
+		/* fall through */
 	default:
 		return mdio_mii_ioctl(&adapter->hw.phy.mdio, if_mii(req), cmd);
 	}
-- 
2.14.2

^ permalink raw reply related
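
The control flow added by the patch above, a guarded case that falls
through to the default handler, can be sketched as follows. All names
here (the enum, generic_handler(), ioctl_sketch(), dummy_read_reg())
are hypothetical; only the shape of the switch and the -EOPNOTSUPP
return value come from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical command codes standing in for the ioctl numbers. */
enum cmd { CMD_GET_PHY, CMD_OTHER };

typedef int (*read_reg_fn)(int reg);

/* Hypothetical generic handler; returns a value that identifies the call. */
static int generic_handler(enum cmd c)
{
	return 100 + (int)c;
}

/* Placeholder register accessor used when PHY access "is supported". */
static int dummy_read_reg(int reg)
{
	return reg;
}

/* Reject the PHY query early when no register accessor is available. */
static int ioctl_sketch(enum cmd c, read_reg_fn read_reg)
{
	switch (c) {
	case CMD_GET_PHY:
		if (!read_reg)
			return -EOPNOTSUPP;
		/* fall through */
	default:
		return generic_handler(c);
	}
}

/* Only the unsupported PHY query is rejected; everything else is passed on. */
static void demo_ioctl(void)
{
	assert(ioctl_sketch(CMD_GET_PHY, NULL) == -EOPNOTSUPP);
	assert(ioctl_sketch(CMD_GET_PHY, dummy_read_reg) == 100);
	assert(ioctl_sketch(CMD_OTHER, NULL) == 101);
}
```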

* Re: [PATCH net] net: enable interface alias removal via rtnl
From: David Ahern @ 2017-10-09 14:02 UTC (permalink / raw)
  To: nicolas.dichtel, Oliver Hartkopp, davem
  Cc: netdev, Oliver Hartkopp, Stephen Hemminger
In-Reply-To: <6e1671c7-93d7-0090-54f6-6b36abb8ba89@6wind.com>

On 10/9/17 2:23 AM, Nicolas Dichtel wrote:
> Le 06/10/2017 à 22:10, Oliver Hartkopp a écrit :
>>
>>
>> On 10/06/2017 08:18 PM, David Ahern wrote:
>>> On 10/5/17 4:19 AM, Nicolas Dichtel wrote:
>>>> IFLA_IFALIAS is defined as NLA_STRING. It means that the minimal length of
>>>> the attribute is 1 ("\0"). However, to remove an alias, the attribute
>>>> length must be 0 (see dev_set_alias()).
>>>
>>> why not add a check in dev_set_alias that if len is 1 and the 1
>>> character is '\0' it means remove the alias?
> Because it requires an iproute2 patch. iproute2 doesn't send the '\0'. With the
> command 'ip link set dummy0 alias ""', the attribute length is 0.

iproute2 needs the feature for 0-len strings or perhaps a 'noalias' option.

You can reset the alias using the sysfs file. Given that there is a
workaround for existing kernels and userspace, upstream can get fixed
without changing the UAPI.

> A kernel patch is probably enough for this problem. Updating iproute2 on old
> distributions is not always easy.

Can't say I have ever heard someone suggest that a kernel is easier to
change than userspace.

^ permalink raw reply

* [PATCH net-next 2/3] net/mlx4_core: Fix cast warning in fw.c
From: Tariq Toukan @ 2017-10-09 13:59 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Tariq Toukan
In-Reply-To: <1507557590-17747-1-git-send-email-tariqt@mellanox.com>

Fix the following SPARSE warning, in MLX4_GET() macro:
drivers/net/ethernet/mellanox/mlx4/fw.c:233:9: warning: cast to restricted __be64

Fixes: 17d5ceb6e43e ("net/mlx4_core: Fix unaligned accesses")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/fw.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index 16c09949afd5..634f603f941c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -57,12 +57,12 @@ enum {
 #define MLX4_GET(dest, source, offset)				      \
 	do {							      \
 		void *__p = (char *) (source) + (offset);	      \
-		u64 val;                                              \
-		switch (sizeof(dest)) {			      \
+		__be64 val;                                           \
+		switch (sizeof(dest)) {				      \
 		case 1: (dest) = *(u8 *) __p;	    break;	      \
 		case 2: (dest) = be16_to_cpup(__p); break;	      \
 		case 4: (dest) = be32_to_cpup(__p); break;	      \
-		case 8: val = get_unaligned((u64 *)__p);              \
+		case 8: val = get_unaligned((__be64 *)__p);           \
 			(dest) = be64_to_cpu(val);  break;            \
 		default: __buggy_use_of_MLX4_GET();		      \
 		}						      \
-- 
1.8.3.1

^ permalink raw reply related
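
The 8-byte case of the macro above reads a raw big-endian value from a
possibly unaligned address and converts it afterwards. A portable
userspace sketch of that two-step pattern is given below. The be64_t
typedef is a hypothetical stand-in for the kernel's sparse-checked
__be64 (plain C gives it no extra checking), and both function names
are invented; memcpy() and byte shifts replace the kernel helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __be64: same bits, big-endian by convention. */
typedef uint64_t be64_t;

/* Step 1: fetch the raw bytes without assuming any alignment. */
static be64_t load_unaligned_be64(const void *p)
{
	be64_t raw;

	memcpy(&raw, p, sizeof(raw));
	return raw;
}

/* Step 2: convert big-endian storage order to a host integer. */
static uint64_t be64_to_host(be64_t raw)
{
	const unsigned char *b = (const unsigned char *)&raw;
	uint64_t v = 0;
	size_t i;

	for (i = 0; i < sizeof(raw); i++)
		v = (v << 8) | b[i];
	return v;
}

/* Read a value that starts at an odd offset inside a byte buffer. */
static void demo_unaligned_read(void)
{
	unsigned char buf[9] = { 0xff, 0x01, 0x02, 0x03, 0x04,
				 0x05, 0x06, 0x07, 0x08 };

	assert(be64_to_host(load_unaligned_be64(buf + 1)) ==
	       0x0102030405060708ULL);
}
```

Keeping the intermediate value in a type that marks it as big-endian
is what lets a checker such as sparse verify that the conversion step
is not skipped.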

* [PATCH net-next 3/3] net/mlx4_en: Use __force to fix a sparse warning in TX datapath
From: Tariq Toukan @ 2017-10-09 13:59 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Tariq Toukan
In-Reply-To: <1507557590-17747-1-git-send-email-tariqt@mellanox.com>

In TX data-path, we intentionally do not byte-swap, as documented
in code and in the cited commit log.
This fixes sparse warning:
en_tx.c:720:23: warning: incorrect type in argument 1 (different base types)
en_tx.c:720:23:    expected unsigned int [unsigned] [usertype] <noident>
en_tx.c:720:23:    got restricted __be32 [usertype] doorbell_qpn

Fixes: 492f5add4be8 ("net/mlx4_en: Doorbell is byteswapped in Little Endian archs")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 8a32a8f7f9c0..2cc82dc07397 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -718,7 +718,7 @@ void mlx4_en_xmit_doorbell(struct mlx4_en_tx_ring *ring)
 #else
 	iowrite32be(
 #endif
-		  ring->doorbell_qpn,
+		  (__force u32)ring->doorbell_qpn,
 		  ring->bf.uar->map + MLX4_SEND_DOORBELL);
 }

-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 1/3] net/mlx4: Fix endianness issue in qp context params
From: Tariq Toukan @ 2017-10-09 13:59 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Tariq Toukan
In-Reply-To: <1507557590-17747-1-git-send-email-tariqt@mellanox.com>

Convert MLX4_QP_BIT_FPP to big-endian before setting or clearing it in
the params2 field.

Fixes: 53f33ae295a5 ("net/mlx4_core: Port aggregation upper layer interface")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_resources.c     | 2 +-
 drivers/net/ethernet/mellanox/mlx4/qp.c               | 2 +-
 drivers/net/ethernet/mellanox/mlx4/resource_tracker.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_resources.c b/drivers/net/ethernet/mellanox/mlx4/en_resources.c
index 5a47f9669621..6883ac75d37f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_resources.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_resources.c
@@ -53,7 +53,7 @@ void mlx4_en_fill_qp_context(struct mlx4_en_priv *priv, int size, int stride,
 	if (is_tx) {
 		context->sq_size_stride = ilog2(size) << 3 | (ilog2(stride) - 4);
 		if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_PORT_REMAP)
-			context->params2 |= MLX4_QP_BIT_FPP;
+			context->params2 |= cpu_to_be32(MLX4_QP_BIT_FPP);
 
 	} else {
 		context->sq_size_stride = ilog2(TXBB_SIZE) - 4;
diff --git a/drivers/net/ethernet/mellanox/mlx4/qp.c b/drivers/net/ethernet/mellanox/mlx4/qp.c
index 728a2fb1f5c0..203320923340 100644
--- a/drivers/net/ethernet/mellanox/mlx4/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx4/qp.c
@@ -925,7 +925,7 @@ int mlx4_qp_to_ready(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
 		context->flags &= cpu_to_be32(~(0xf << 28));
 		context->flags |= cpu_to_be32(states[i + 1] << 28);
 		if (states[i + 1] != MLX4_QP_STATE_RTR)
-			context->params2 &= ~MLX4_QP_BIT_FPP;
+			context->params2 &= ~cpu_to_be32(MLX4_QP_BIT_FPP);
 		err = mlx4_qp_modify(dev, mtt, states[i], states[i + 1],
 				     context, 0, 0, qp);
 		if (err) {
diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index fabb53379727..04304dd894c6 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -3185,7 +3185,7 @@ static int verify_qp_parameters(struct mlx4_dev *dev,
 	optpar	= be32_to_cpu(*(__be32 *) inbox->buf);
 
 	if (slave != mlx4_master_func_num(dev)) {
-		qp_ctx->params2 &= ~MLX4_QP_BIT_FPP;
+		qp_ctx->params2 &= ~cpu_to_be32(MLX4_QP_BIT_FPP);
 		/* setting QP rate-limit is disallowed for VFs */
 		if (qp_ctx->rate_limit_params)
 			return -EPERM;
-- 
1.8.3.1

^ permalink raw reply related
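
For illustration, a minimal userspace sketch of why the mask has to be converted before it is ORed into, or cleared from, a big-endian field. It assumes htonl()/ntohl() as stand-ins for the kernel's cpu_to_be32()/be32_to_cpu(); the QP_BIT_FPP value and the helper names are hypothetical and are not taken from the driver.

```c
#include <arpa/inet.h>	/* htonl()/ntohl(): stand-ins for cpu_to_be32()/be32_to_cpu() */
#include <stdint.h>

/* Hypothetical flag value, chosen for illustration only. */
#define QP_BIT_FPP (1u << 3)

/* Set the flag in a field that is stored in big-endian (device) order. */
static uint32_t set_fpp_be(uint32_t params2_be)
{
	return params2_be | htonl(QP_BIT_FPP);
}

/* Clear the flag: convert the mask to big-endian first, then invert it. */
static uint32_t clear_fpp_be(uint32_t params2_be)
{
	return params2_be & ~htonl(QP_BIT_FPP);
}

/* Read the flag back by converting the field to CPU byte order. */
static int fpp_is_set(uint32_t params2_be)
{
	return (ntohl(params2_be) & QP_BIT_FPP) != 0;
}
```

On a little-endian host, using the unconverted constant would touch a bit in the wrong byte of the field; on a big-endian host the conversion is a no-op, which is why such bugs tend to show up only through static checkers.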

* [PATCH net-next 0/3] Fix mlx4 static checker warnings
From: Tariq Toukan @ 2017-10-09 13:59 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Tariq Toukan

Hi Dave,

This patchset contains fixes for static checker warnings
in the mlx4 Core and Eth drivers.

Patch 1 fixes an actual bug discovered by the checker.
Patches 2 and 3 fix the warnings without functional changes.

Series generated against net-next commit:
c49c777f9c87 qed: Delete redundant check on dcb_app priority

Thanks,
Tariq.

Tariq Toukan (3):
  net/mlx4: Fix endianness issue in qp context params
  net/mlx4_core: Fix cast warning in fw.c
  net/mlx4_en: Use __force to fix a sparse warning in TX datapath

 drivers/net/ethernet/mellanox/mlx4/en_resources.c     | 2 +-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c            | 2 +-
 drivers/net/ethernet/mellanox/mlx4/fw.c               | 6 +++---
 drivers/net/ethernet/mellanox/mlx4/qp.c               | 2 +-
 drivers/net/ethernet/mellanox/mlx4/resource_tracker.c | 2 +-
 5 files changed, 7 insertions(+), 7 deletions(-)

-- 
1.8.3.1

^ permalink raw reply

* Re: [pull request][for-next 0/9] Mellanox, mlx5 updates 2017-10-06
From: Doug Ledford @ 2017-10-09 13:47 UTC (permalink / raw)
  To: Saeed Mahameed, David S. Miller; +Cc: netdev, linux-rdma, Leon Romanovsky
In-Reply-To: <20171006233749.25545-1-saeedm@mellanox.com>

On Fri, 2017-10-06 at 16:37 -0700, Saeed Mahameed wrote:
> The following changes since commit
> e19b205be43d11bff638cad4487008c48d21c103:
> 
>   Linux 4.14-rc2 (2017-09-24 16:38:56 -0700)

Thanks for keeping the base at rc2 like I requested.  Pulled.

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

^ permalink raw reply

* Re: [net-next V5 PATCH 1/5] bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP
From: Daniel Borkmann @ 2017-10-09 13:40 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, netdev
  Cc: jakub.kicinski, Michael S. Tsirkin, pavel.odintsov, Jason Wang,
	mchan, John Fastabend, peter.waskiewicz.jr, Daniel Borkmann,
	Alexei Starovoitov, Andy Gospodarek
In-Reply-To: <59DB7A29.5050906@iogearbox.net>

On 10/09/2017 03:31 PM, Daniel Borkmann wrote:
> On 10/06/2017 06:12 PM, Jesper Dangaard Brouer wrote:
[...]
>> +    /* Pre-limit array size based on NR_CPUS, not final CPU check */
>> +    if (cmap->map.max_entries > NR_CPUS)
>
> Nit: needs to be >= NR_CPUS.

Scratch that comment, you bail out on key_cpu >= cmap->map.max_entries
in the other handlers, so that's fine.

^ permalink raw reply

* Re: [net-next V5 PATCH 1/5] bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP
From: Daniel Borkmann @ 2017-10-09 13:31 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, netdev
  Cc: jakub.kicinski, Michael S. Tsirkin, pavel.odintsov, Jason Wang,
	mchan, John Fastabend, peter.waskiewicz.jr, Daniel Borkmann,
	Alexei Starovoitov, Andy Gospodarek
In-Reply-To: <150730636196.22839.17119032803741721925.stgit@firesoul>

On 10/06/2017 06:12 PM, Jesper Dangaard Brouer wrote:
[...]
> +static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
> +{
> +	struct bpf_cpu_map *cmap;
> +	int err = -ENOMEM;

The err init here is basically not needed since it is overridden later
anyway w/o being read, but ...

> +	u64 cost;
> +
> +	if (!capable(CAP_SYS_ADMIN))
> +		return ERR_PTR(-EPERM);
> +
> +	/* check sanity of attributes */
> +	if (attr->max_entries == 0 || attr->key_size != 4 ||
> +	    attr->value_size != 4 || attr->map_flags & ~BPF_F_NUMA_NODE)
> +		return ERR_PTR(-EINVAL);
> +
> +	cmap = kzalloc(sizeof(*cmap), GFP_USER);
> +	if (!cmap)
> +		return ERR_PTR(-ENOMEM);
> +
> +	/* mandatory map attributes */
> +	cmap->map.map_type = attr->map_type;
> +	cmap->map.key_size = attr->key_size;
> +	cmap->map.value_size = attr->value_size;
> +	cmap->map.max_entries = attr->max_entries;
> +	cmap->map.map_flags = attr->map_flags;
> +	cmap->map.numa_node = bpf_map_attr_numa_node(attr);
> +
> +	/* Pre-limit array size based on NR_CPUS, not final CPU check */
> +	if (cmap->map.max_entries > NR_CPUS)

Nit: needs to be >= NR_CPUS.

> +		return ERR_PTR(-E2BIG);
> +
> +	/* make sure page count doesn't overflow */
> +	cost = (u64) cmap->map.max_entries * sizeof(struct bpf_cpu_map_entry *);
> +	cost += cpu_map_bitmap_size(attr) * num_possible_cpus();
> +	if (cost >= U32_MAX - PAGE_SIZE)
> +		goto free_cmap;
> +	cmap->map.pages = round_up(cost, PAGE_SIZE) >> PAGE_SHIFT;
> +
> +	/* Notice returns -EPERM on if map size is larger than memlock limit */
> +	err = bpf_map_precharge_memlock(cmap->map.pages);
> +	if (err)
> +		goto free_cmap;

... here, you need to set err = -ENOMEM.

> +	/* A per cpu bitfield with a bit per possible CPU in map  */
> +	cmap->flush_needed = __alloc_percpu(cpu_map_bitmap_size(attr),
> +					    __alignof__(unsigned long));
> +	if (!cmap->flush_needed)
> +		goto free_cmap;

Otherwise, when we fail here or in the error case for bpf_map_area_alloc()
below, we still return 0 although it's really -ENOMEM. Returning 0 (i.e.
ERR_PTR(0), a NULL pointer) means that find_and_alloc_map() will miss this,
since it only tests for IS_ERR(), and we'll crash later on thinking we have
a valid map pointer.
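
A minimal userspace sketch of the failure mode described above, assuming simplified re-creations of the kernel's ERR_PTR()/IS_ERR() helpers. The names err_ptr(), is_err() and alloc_sketch() are hypothetical; only the shape of the error path follows the quoted code.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace re-creations of the kernel's ERR_PTR()/IS_ERR() helpers. */
#define MAX_ERRNO 4095

static void *err_ptr(long error)
{
	return (void *)error;
}

static int is_err(const void *ptr)
{
	/* Only the top MAX_ERRNO addresses count as encoded errors. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/*
 * Hypothetical allocator with the same error-path shape as the quoted code:
 * 'err' holds 0 after an earlier step succeeded, so a later failure that
 * jumps to the exit path returns err_ptr(0), which is NULL.
 */
static void *alloc_sketch(int later_step_fails, int reset_err)
{
	static int object;
	long err;

	err = 0;		/* result of an earlier step that succeeded */
	if (reset_err)
		err = -ENOMEM;	/* the fix suggested in the review */
	if (later_step_fails)
		return err_ptr(err);
	return &object;
}
```

Without the reset, the failing path hands back NULL, which is_err() does not flag; with it, the caller sees an encoded -ENOMEM.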

> +	/* Alloc array for possible remote "destination" CPUs */
> +	cmap->cpu_map = bpf_map_area_alloc(cmap->map.max_entries *
> +					   sizeof(struct bpf_cpu_map_entry *),
> +					   cmap->map.numa_node);
> +	if (!cmap->cpu_map)
> +		goto free_cmap;
> +
> +	return &cmap->map;
> +free_cmap:
> +	free_percpu(cmap->flush_needed);
> +	kfree(cmap);
> +	return ERR_PTR(err);
> +}
> +
[...]
> +int cpu_map_update_elem(struct bpf_map *map, void *key, void *value,
> +				u64 map_flags)
> +{
> +	struct bpf_cpu_map *cmap = container_of(map, struct bpf_cpu_map, map);
> +	struct bpf_cpu_map_entry *rcpu;
> +
> +	/* Array index key correspond to CPU number */
> +	u32 key_cpu = *(u32 *)key;
> +	/* Value is the queue size */
> +	u32 qsize = *(u32 *)value;
> +
> +	/* Make sure CPU is a valid possible cpu */
> +	if (!cpu_possible(key_cpu))
> +		return -ENODEV;

Nit: cpu_possible() expects that key_cpu < NR_CPUS, otherwise you'd
access the bitmap out of bounds.

Better to move the test below for 'key_cpu >= cmap->map.max_entries'
first: on map alloc you enforce an upper limit of NR_CPUS on
max_entries, so the cpu_possible() test above will then be valid, too.
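
The suggested ordering can be sketched in userspace as follows. This is an assumption-laden illustration: NR_CPUS_SKETCH, possible_mask, cpu_possible_sketch() and update_check() are hypothetical stand-ins, with a single byte playing the role of the possible-CPU bitmap.

```c
#include <errno.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 8	/* hypothetical compile-time CPU limit */

/* One bit per CPU; only indexes below NR_CPUS_SKETCH are valid. */
static const uint8_t possible_mask = 0x0f;	/* CPUs 0-3 are "possible" */

static int cpu_possible_sketch(uint32_t cpu)
{
	/* The caller must guarantee cpu < NR_CPUS_SKETCH. */
	return (possible_mask >> cpu) & 1;
}

/* max_entries is assumed to have been capped at NR_CPUS_SKETCH on alloc. */
static int update_check(uint32_t key_cpu, uint32_t max_entries)
{
	if (key_cpu >= max_entries)
		return -E2BIG;		/* range check first */
	if (!cpu_possible_sketch(key_cpu))
		return -ENODEV;		/* bitmap lookup is now in bounds */
	return 0;
}
```

Because max_entries never exceeds the CPU limit, passing the range check implies the index is safe for the bitmap lookup.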

> +	if (unlikely(map_flags > BPF_EXIST))
> +		return -EINVAL;
> +	if (unlikely(key_cpu >= cmap->map.max_entries))
> +		return -E2BIG;
> +	if (unlikely(map_flags == BPF_NOEXIST))
> +		return -EEXIST;
> +	if (unlikely(qsize > 16384)) /* sanity limit on qsize */
> +		return -EOVERFLOW;
> +
> +	if (qsize == 0) {
> +		rcpu = NULL; /* Same as deleting */
> +	} else {
> +		/* Updating qsize cause re-allocation of bpf_cpu_map_entry */
> +		rcpu = __cpu_map_entry_alloc(qsize, key_cpu, map->id);
> +		if (!rcpu)
> +			return -ENOMEM;
> +	}
> +	rcu_read_lock();
> +	__cpu_map_entry_replace(cmap, key_cpu, rcpu);
> +	rcu_read_unlock();
> +	return 0;
> +}
[...]
> +struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key)
> +{
> +	struct bpf_cpu_map *cmap = container_of(map, struct bpf_cpu_map, map);
> +	struct bpf_cpu_map_entry *rcpu;
> +
> +	if (key >= map->max_entries)
> +		return NULL;
> +
> +	rcpu = READ_ONCE(cmap->cpu_map[key]);
> +	return rcpu;
> +}
> +
> +static void *cpu_map_lookup_elem(struct bpf_map *map, void *key)
> +{
> +	struct bpf_cpu_map_entry *rcpu =
> +		__cpu_map_lookup_elem(map, *(u32 *)key);
> +
> +	return rcpu ? &rcpu->qsize : NULL;

I still think from my prior email/comment that we should use a per-cpu
scratch buffer here. Would be nice to keep the guarantee that no one
can modify it, it's just a tiny change.

> +}
> +
> +static int cpu_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
> +{
> +	struct bpf_cpu_map *cmap = container_of(map, struct bpf_cpu_map, map);
> +	u32 index = key ? *(u32 *)key : U32_MAX;
> +	u32 *next = next_key;
> +
> +	if (index >= cmap->map.max_entries) {
> +		*next = 0;
> +		return 0;
> +	}
> +
> +	if (index == cmap->map.max_entries - 1)
> +		return -ENOENT;
> +	*next = index + 1;
> +	return 0;
> +}
> +
> +const struct bpf_map_ops cpu_map_ops = {
> +	.map_alloc		= cpu_map_alloc,
> +	.map_free		= cpu_map_free,
> +	.map_delete_elem	= cpu_map_delete_elem,
> +	.map_update_elem	= cpu_map_update_elem,
> +	.map_lookup_elem	= cpu_map_lookup_elem,
> +	.map_get_next_key	= cpu_map_get_next_key,
> +};

^ permalink raw reply

* Re: [PATCH 10/13] timer: Remove expires and data arguments from DEFINE_TIMER
From: Ralf Baechle @ 2017-10-09 13:27 UTC (permalink / raw)
  To: Kees Cook
  Cc: Thomas Gleixner, Andrew Morton, Arnd Bergmann,
	Benjamin Herrenschmidt, Chris Metcalf, Geert Uytterhoeven,
	Greg Kroah-Hartman, Guenter Roeck, Harish Patil, Heiko Carstens,
	James E.J. Bottomley, John Stultz, Julian Wiedmann, Kalle Valo,
	Lai Jiangshan, Len Brown, Manish Chopra, Mark Gross,
	"Martin K. Petersen"
In-Reply-To: <1507159627-127660-11-git-send-email-keescook@chromium.org>

On Wed, Oct 04, 2017 at 04:27:04PM -0700, Kees Cook wrote:

> Subject: [PATCH 10/13] timer: Remove expires and data arguments from
>  DEFINE_TIMER
> 
> Drop the arguments from the macro and adjust all callers with the
> following script:
> 
>   perl -pi -e 's/DEFINE_TIMER\((.*), 0, 0\);/DEFINE_TIMER($1);/g;' \
>     $(git grep DEFINE_TIMER | cut -d: -f1 | sort -u | grep -v timer.h)
> 
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # for m68k parts
> ---
>  arch/arm/mach-ixp4xx/dsmg600-setup.c      | 2 +-
>  arch/arm/mach-ixp4xx/nas100d-setup.c      | 2 +-
>  arch/m68k/amiga/amisound.c                | 2 +-
>  arch/m68k/mac/macboing.c                  | 2 +-
>  arch/mips/mti-malta/malta-display.c       | 2 +-
>  arch/parisc/kernel/pdc_cons.c             | 2 +-
>  arch/s390/mm/cmm.c                        | 2 +-
>  drivers/atm/idt77105.c                    | 4 ++--
>  drivers/atm/iphase.c                      | 2 +-
>  drivers/block/ataflop.c                   | 8 ++++----
>  drivers/char/dtlk.c                       | 2 +-
>  drivers/char/hangcheck-timer.c            | 2 +-
>  drivers/char/nwbutton.c                   | 2 +-
>  drivers/char/rtc.c                        | 2 +-
>  drivers/input/touchscreen/s3c2410_ts.c    | 2 +-
>  drivers/net/cris/eth_v10.c                | 6 +++---
>  drivers/net/hamradio/yam.c                | 2 +-
>  drivers/net/wireless/atmel/at76c50x-usb.c | 2 +-
>  drivers/staging/speakup/main.c            | 2 +-
>  drivers/staging/speakup/synth.c           | 2 +-
>  drivers/tty/cyclades.c                    | 2 +-
>  drivers/tty/isicom.c                      | 2 +-
>  drivers/tty/moxa.c                        | 2 +-
>  drivers/tty/rocket.c                      | 2 +-
>  drivers/tty/vt/keyboard.c                 | 2 +-
>  drivers/tty/vt/vt.c                       | 2 +-
>  drivers/watchdog/alim7101_wdt.c           | 2 +-
>  drivers/watchdog/machzwd.c                | 2 +-
>  drivers/watchdog/mixcomwd.c               | 2 +-
>  drivers/watchdog/sbc60xxwdt.c             | 2 +-
>  drivers/watchdog/sc520_wdt.c              | 2 +-
>  drivers/watchdog/via_wdt.c                | 2 +-
>  drivers/watchdog/w83877f_wdt.c            | 2 +-
>  drivers/xen/grant-table.c                 | 2 +-
>  fs/pstore/platform.c                      | 2 +-
>  include/linux/timer.h                     | 4 ++--
>  kernel/irq/spurious.c                     | 2 +-
>  lib/random32.c                            | 2 +-
>  net/atm/mpc.c                             | 2 +-
>  net/decnet/dn_route.c                     | 2 +-
>  net/ipv6/ip6_flowlabel.c                  | 2 +-
>  net/netrom/nr_loopback.c                  | 2 +-
>  security/keys/gc.c                        | 2 +-
>  sound/oss/midibuf.c                       | 2 +-
>  sound/oss/soundcard.c                     | 2 +-
>  sound/oss/sys_timer.c                     | 2 +-
>  sound/oss/uart6850.c                      | 2 +-
>  47 files changed, 54 insertions(+), 54 deletions(-)

Acked-by: Ralf Baechle <ralf@linux-mips.org>

Thanks,

  Ralf

^ permalink raw reply

* Re: [PATCH 09/13] timer: Remove users of expire and data arguments to DEFINE_TIMER
From: Ralf Baechle @ 2017-10-09 13:23 UTC (permalink / raw)
  To: Kees Cook
  Cc: Thomas Gleixner, Wim Van Sebroeck, Guenter Roeck,
	Geert Uytterhoeven, linux-mips, linux-watchdog, Andrew Morton,
	Arnd Bergmann, Benjamin Herrenschmidt, Chris Metcalf,
	Greg Kroah-Hartman, Harish Patil, Heiko Carstens,
	James E.J. Bottomley, John Stultz, Julian Wiedmann, Kalle Valo,
	Lai Jiangshan, Len Brown, Manis
In-Reply-To: <1507159627-127660-10-git-send-email-keescook@chromium.org>

On Wed, Oct 04, 2017 at 04:27:03PM -0700, Kees Cook wrote:

> Subject: [PATCH 09/13] timer: Remove users of expire and data arguments to
>  DEFINE_TIMER
> 
> The expire and data arguments of DEFINE_TIMER are only used in two places
> and are ignored by the code (malta-display.c only uses mod_timer(),
> never add_timer(), so the preset expires value is ignored). Set both
> sets of arguments to zero.
> 
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Wim Van Sebroeck <wim@iguana.be>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Cc: linux-mips@linux-mips.org
> Cc: linux-watchdog@vger.kernel.org
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  arch/mips/mti-malta/malta-display.c | 6 +++---
>  drivers/watchdog/alim7101_wdt.c     | 4 ++--
>  2 files changed, 5 insertions(+), 5 deletions(-)

For malta-display:

Acked-by: Ralf Baechle <ralf@linux-mips.org>

  Ralf

^ permalink raw reply

* Re: [PATCH net-next] ipv6: avoid zeroing per cpu data again
From: Tejun Heo @ 2017-10-09 13:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, Martin KaFai Lau
In-Reply-To: <1507554097.31614.11.camel@edumazet-glaptop3.roam.corp.google.com>

On Mon, Oct 09, 2017 at 06:01:37AM -0700, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> per cpu allocations are already zeroed, no need to clear them again.
> 
> Fixes: d52d3997f843f ("ipv6: Create percpu rt6_info")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Martin KaFai Lau <kafai@fb.com>
> Cc: Tejun Heo <tj@kernel.org>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply

* [PATCH] thunderbolt: Initialize Thunderbolt bus earlier
From: Mika Westerberg @ 2017-10-09 13:22 UTC (permalink / raw)
  To: David S . Miller
  Cc: Andreas Noever, Michael Jamet, Yehezkel Bernat, kernel test robot,
	Greg Kroah-Hartman, Andy Shevchenko, Mika Westerberg, netdev,
	linux-kernel

The 0day kbuild robot reports the following crash:

  BUG: unable to handle kernel NULL pointer dereference at 00000004
  IP: tb_property_find+0xe/0x41
  *pde = 00000000
  Oops: 0000 [#1]
  CPU: 0 PID: 1 Comm: swapper Not tainted 4.14.0-rc1-00741-ge69b6c0 #412
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
  task: 89c80000 task.stack: 89c7c000
  EIP: tb_property_find+0xe/0x41
  EFLAGS: 00210246 CPU: 0
  EAX: 00000000 EBX: 7a368f47 ECX: 00000044 EDX: 7a368f47
  ESI: 8851d340 EDI: 7a368f47 EBP: 89c7df0c ESP: 89c7defc
   DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
  CR0: 80050033 CR2: 00000004 CR3: 027a2000 CR4: 00000690
  Call Trace:
   tb_register_property_dir+0x49/0xb9
   ? cdc_mbim_driver_init+0x1b/0x1b
   tbnet_init+0x77/0x9f
   ? cdc_mbim_driver_init+0x1b/0x1b
   do_one_initcall+0x7e/0x145
   ? parse_args+0x10c/0x1b3
   ? kernel_init_freeable+0xbe/0x159
   kernel_init_freeable+0xd1/0x159
   ? rest_init+0x110/0x110
   kernel_init+0xd/0xd0
   ret_from_fork+0x19/0x30

The reason is that both Thunderbolt bus and thunderbolt-net are built
into the kernel image, and the latter is linked first because
drivers/net comes before drivers/thunderbolt. Since both use
module_init(), thunderbolt-net ends up calling Thunderbolt bus functions
too early, triggering the above crash.

Fix this by moving Thunderbolt bus initialization to happen earlier to
make sure all the data structures are ready when Thunderbolt service
drivers are initialized. To be on the safe side also add a check for
properly initialized xdomain_property_dir to tb_register_property_dir().

Reported-by: kernel test robot <fengguang.wu@intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
---
Hi David,

This fixes a crash introduced in the Thunderbolt networking patches, so
could you take this into your net-next as well?

Thanks.

 drivers/thunderbolt/nhi.c     | 2 +-
 drivers/thunderbolt/xdomain.c | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
index 0e79eebfcbb7..419a7a90bce0 100644
--- a/drivers/thunderbolt/nhi.c
+++ b/drivers/thunderbolt/nhi.c
@@ -1144,5 +1144,5 @@ static void __exit nhi_unload(void)
 	tb_domain_exit();
 }
 
-module_init(nhi_init);
+fs_initcall(nhi_init);
 module_exit(nhi_unload);
diff --git a/drivers/thunderbolt/xdomain.c b/drivers/thunderbolt/xdomain.c
index f2d06f6f7be9..138027537d29 100644
--- a/drivers/thunderbolt/xdomain.c
+++ b/drivers/thunderbolt/xdomain.c
@@ -1487,6 +1487,9 @@ int tb_register_property_dir(const char *key, struct tb_property_dir *dir)
 {
 	int ret;
 
+	if (WARN_ON(!xdomain_property_dir))
+		return -EAGAIN;
+
 	if (!key || strlen(key) > 8)
 		return -EINVAL;
 
-- 
2.14.2

^ permalink raw reply related
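
A small userspace model of the ordering problem and of why moving one initcall to an earlier level resolves it. It assumes the usual level numbering for built-in code (fs_initcall at level 5, device_initcall, which module_init() maps to when built in, at level 6); the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Initcall levels for built-in code: fs is level 5, device is level 6. */
enum { LEVEL_FS = 5, LEVEL_DEVICE = 6 };

struct initcall_sketch {
	const char *name;
	int level;
};

/*
 * Fill 'order' with entry indexes in the order they would run: lower levels
 * first, and link order (array order here) within the same level.
 */
static void run_order(const struct initcall_sketch *calls, size_t n,
		      size_t *order)
{
	size_t out = 0;

	for (int level = 0; level <= 7; level++)
		for (size_t i = 0; i < n; i++)
			if (calls[i].level == level)
				order[out++] = i;
}
```

With both entries at the device level, the one linked first runs first; once the bus entry is moved to the fs level, it runs ahead of the service driver regardless of link order.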

* Re: [PATCH v2] netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'
From: Pablo Neira Ayuso @ 2017-10-09 13:18 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Shmulik Ladkani, netfilter-devel, netdev, Rafael Buchbinder,
	Shmulik Ladkani, Willem de Bruijn
In-Reply-To: <59DB6D22.6080507@iogearbox.net>

On Mon, Oct 09, 2017 at 02:35:46PM +0200, Daniel Borkmann wrote:
> On 10/09/2017 02:27 PM, Shmulik Ladkani wrote:
> >From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> >
> >Commit 2c16d6033264 ("netfilter: xt_bpf: support ebpf") introduced
> >support for attaching an eBPF object by an fd, with the
> >'bpf_mt_check_v1' ABI expecting the '.fd' to be specified upon each
> >IPT_SO_SET_REPLACE call.
> >
> >However this breaks subsequent iptables calls:
> >
> >  # iptables -A INPUT -m bpf --object-pinned /sys/fs/bpf/xxx -j ACCEPT
> >  # iptables -A INPUT -s 5.6.7.8 -j ACCEPT
> >  iptables: Invalid argument. Run `dmesg' for more information.
> >
> >That's because iptables works by loading existing rules using
> >IPT_SO_GET_ENTRIES to userspace, then issuing IPT_SO_SET_REPLACE with
> >the replacement set.
> >
> >However, the loaded 'xt_bpf_info_v1' has an arbitrary '.fd' number
> >(from the initial "iptables -m bpf" invocation) - so when 2nd invocation
> >occurs, userspace passes a bogus fd number, which leads to
> >'bpf_mt_check_v1' to fail.
> >
> >One suggested solution [1] was to hack iptables userspace, to perform an
> >"entries fixup" immediately after IPT_SO_GET_ENTRIES, by opening a new,
> >process-local fd per every 'xt_bpf_info_v1' entry seen.
> >
> >However, in [2] both Pablo Neira Ayuso and Willem de Bruijn suggested to
> >deprecate the xt_bpf_info_v1 ABI dealing with pinned ebpf objects.
> >
> >This fix changes the XT_BPF_MODE_FD_PINNED behavior to ignore the given
> >'.fd' and instead perform an in-kernel lookup for the bpf object given
> >the provided '.path'.
> >
> >It also defines an alias for the XT_BPF_MODE_FD_PINNED mode, named
> >XT_BPF_MODE_PATH_PINNED, to better reflect the fact that the user is
> >expected to provide the path of the pinned object.
> >
> >Existing XT_BPF_MODE_FD_ELF behavior (non-pinned fd mode) is preserved.
> >
> >References: [1] https://marc.info/?l=netfilter-devel&m=150564724607440&w=2
> >             [2] https://marc.info/?l=netfilter-devel&m=150575727129880&w=2
> >
> >Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> >Cc: Willem de Bruijn <willemb@google.com>
> >Reported-by: Rafael Buchbinder <rafi@rbk.ms>
> >Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> 
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>

Applied, thanks everyone.

^ permalink raw reply
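
As a loose userspace analogy for the problem discussed in this thread (not the kernel code): a descriptor number is only meaningful in the process that opened it, while a path can be resolved again by any later caller. The struct and helper names below are hypothetical, and an ordinary file stands in for the pinned object.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical saved rule: the fd number one process recorded, plus the path. */
struct saved_rule {
	int fd;
	const char *path;
};

/* Validate by descriptor number: works only while that fd is still open. */
static int check_by_fd(const struct saved_rule *r)
{
	return fcntl(r->fd, F_GETFD) != -1;
}

/* Validate by path: resolves the object afresh and ignores the stored fd. */
static int check_by_path(const struct saved_rule *r)
{
	int fd = open(r->path, O_RDONLY);

	if (fd < 0)
		return 0;
	close(fd);
	return 1;
}

/* Record a rule, then close the descriptor as a process exit would. */
static struct saved_rule save_then_exit(const char *path)
{
	struct saved_rule r = { .fd = open(path, O_RDONLY), .path = path };

	if (r.fd >= 0)
		close(r.fd);
	return r;
}
```

A rule saved this way fails the fd-based check on the next invocation but still passes the path-based one, which mirrors the move from '.fd' to '.path' lookup.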

* Re: [PATCH net-next] net: mvpp2: phylink support
From: Russell King - ARM Linux @ 2017-10-09 13:09 UTC (permalink / raw)
  To: Antoine Tenart
  Cc: davem, andrew, gregory.clement, thomas.petazzoni, miquel.raynal,
	nadavh, linux-kernel, mw, stefanc, netdev
In-Reply-To: <20171009125527.GA24414@kwain>

On Mon, Oct 09, 2017 at 02:55:27PM +0200, Antoine Tenart wrote:
> Hi Russell,
> 
> On Mon, Sep 25, 2017 at 11:55:14AM +0200, Antoine Tenart wrote:
> > On Fri, Sep 22, 2017 at 12:07:31PM +0100, Russell King - ARM Linux wrote:
> > > On Thu, Sep 21, 2017 at 03:45:22PM +0200, Antoine Tenart wrote:
> > 
> > > > +static int mvpp2_phylink_mac_link_state(struct net_device *dev,
> > > > +					struct phylink_link_state *state)
> > > > +{
> > > > +	struct mvpp2_port *port = netdev_priv(dev);
> > > > +	u32 val;
> > > > +
> > > > +	if (!phy_interface_mode_is_rgmii(port->phy_interface) &&
> > > > +	    port->phy_interface != PHY_INTERFACE_MODE_SGMII)
> > > > +		return 0;
> > > 
> > > You're blocking this for 1000base-X and 10G connections, which is not
> > > correct.  The expectation is that this function returns the current
> > > MAC state irrespective of the interface mode.
> > 
> > I moved what was already supported in the PPv2 driver and did not
> > implement the full set of what is supported. It's not perfect, but it
> > does move what was already supported.
> > 
> > Any reason not to first move what's already supported to phylink, and
> > then add more supported modes in separate patches?
> 
> Any thoughts on this?

You're asking me to comment about something I know little about as
I've not used mvpp2.c.  I don't know the details of what your "already
supported" statement refers to.  Maybe you could give some clues -
maybe produce a list of what mvpp2 currently supports?

Here are the link modes that phylink supports:
1. PHY based links
2. PHYless fixed links with details specified in DT, in the same way as
   the existing "fixed-link" support works, but without needing to create
   fake PHYs.
3. PHYless fixed links with GPIO link indication (again, same way as the
   existing fixed-link support.)
4. Direct fibre connections via fixed-link or SFP.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply
