Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] net: ftgmac100: Request clock and set speed
From: Florian Fainelli @ 2017-10-10  5:04 UTC (permalink / raw)
  To: Joel Stanley, David S . Miller, Benjamin Herrenschmidt
  Cc: netdev, linux-kernel, Andrew Jeffery
In-Reply-To: <20171010044925.21078-1-joel@jms.id.au>



On 10/09/2017 09:49 PM, Joel Stanley wrote:
> According to the ASPEED datasheet, gigabit speeds require a clock of
> 100MHz or higher. Other speeds require 25MHz or higher.
> 
> Signed-off-by: Joel Stanley <joel@jms.id.au>
> ---

> @@ -161,6 +170,9 @@ static int ftgmac100_reset_and_config_mac(struct ftgmac100 *priv)
>  		break;
>  	}
>  
> +	if (freq && priv->clk)
> +		clk_set_rate(priv->clk, freq);

Checking for priv->clk should not be necessary all public clk APIs can
deal with a NULL clock pointer.

> +
>  	/* (Re)initialize the queue pointers */
>  	priv->rx_pointer = 0;
>  	priv->tx_clean_pointer = 0;
> @@ -1775,6 +1787,13 @@ static int ftgmac100_probe(struct platform_device *pdev)
>  	priv->dev = &pdev->dev;
>  	INIT_WORK(&priv->reset_task, ftgmac100_reset_task);
>  
> +	/* Enable clock if present */
> +	priv->clk = devm_clk_get(&pdev->dev, NULL);
> +	if (!IS_ERR(priv->clk))
> +		clk_prepare_enable(priv->clk);
> +	else
> +		priv->clk = NULL;

Same here.

> +
>  	/* map io memory */
>  	priv->res = request_mem_region(res->start, resource_size(res),
>  				       dev_name(&pdev->dev));
> @@ -1883,6 +1902,9 @@ static int ftgmac100_remove(struct platform_device *pdev)
>  
>  	unregister_netdev(netdev);
>  
> +	if (priv->clk)
> +		clk_disable_unprepare(priv->clk);

And here.

> +
>  	/* There's a small chance the reset task will have been re-queued,
>  	 * during stop, make sure it's gone before we free the structure.
>  	 */
> 

-- 
Florian

^ permalink raw reply

* Re: [PATCH] net: dsa: mv88e6xxx: rework in-chip bridging
From: Greg Ungerer @ 2017-10-10  5:02 UTC (permalink / raw)
  To: Florian Fainelli, Vivien Didelot; +Cc: netdev@vger.kernel.org, Andrew Lunn
In-Reply-To: <d2e2e8b4-e28f-a6ea-9031-d5b12f467231@gmail.com>

On 09/10/17 14:00, Florian Fainelli wrote:
> Le 10/08/17 à 20:23, Greg Ungerer a écrit :
>> On 07/10/17 13:04, Florian Fainelli wrote:
>>> Le 10/03/17 à 23:20, Greg Ungerer a écrit :
>>>> On Wed, Mar 29, 2017 at 04:30:16PM -0400, Vivien Didelot wrote:
>>>>> All ports -- internal and external, for chips featuring a PVT -- have a
>>>>> mask restricting to which internal ports a frame is allowed to egress.
>>>>>
>>>>> Now that DSA exposes the number of ports and their bridge devices, it is
>>>>> possible to extract the code generating the VLAN map and make it generic
>>>>> so that it can be shared later with the cross-chip bridging code.
>>>>
>>>> This patch changes the behavior of interfaces on startup if they are
>>>> not part of a bridge.
>>>>
>>>> I have a board with a Marvell 6350 switch with a device tree that sets
>>>> up the 5 ports as lan1, lan2, lan3, lan4, wan. With kernels before
>>>> this patch (so linux-4.12 and older) after system startup I could do:
>>>>
>>>>   ifconfig lan1 192.168.0.1
>>>>
>>>> And then ping out that interface with no problems.
>>>>
>>>> After this patch is applied (effects linux-4.13 and newer) then the
>>>> ping fails:
>>>>
>>>>   PING 192.168.0.22 (192.168.0.22) 56(84) bytes of data.
>>>>   From 192.168.0.1 icmp_seq=1 Destination Host Unreachable
>>>>   From 192.168.0.1 icmp_seq=2 Destination Host Unreachable
>>>>   From 192.168.0.1 icmp_seq=3 Destination Host Unreachable
>>>>
>>>> If I incorporate an interface into a bridge then it all works ok.
>>>> So simply:
>>>>
>>>>   brctl addbr br0
>>>>   brctl addif br0 lan1
>>>>   ifconfig lan1 up
>>>>   ifconfig br0 192.168.0.1
>>>>
>>>> Then pings out work as expected. And if I now remove that lan1
>>>> interface from the bridge and use it alone again then it will
>>>> now work ok:
>>>>
>>>>   ifconfig br0 down
>>>>   brctl delif br0 lan1
>>>>   ifconfig lan1 192.168.0.1
>>>>
>>>> And that now pings ok.
>>>>
>>>> I fixed this with the attached patch. It is probably not the correct
>>>> approach, but it does restore the older behavior.
>>>>
>>>> What do you think?
>>>
>>> This is strange, the dsa_switch_tree and its associated dsa_switch
>>> instances should be fully setup by the time ops->setup() is running in
>>> your driver but your patch suggests this may not be happening?
>>
>> That is what I am seeing, yep.
>>
>>
>>> Are you using the new style Device Tree binding or the old style Device
>>> Tree binding out of curiosity?
>>
>> This is my device tree fragment for the switch:
>>
>>         dsa@0 {
>>                 compatible = "marvell,dsa";
>>                 #address-cells = <2>;
>>                 #size-cells = <0>;
>>
>>                 dsa,ethernet = <&eth0>;
>>                 dsa,mii-bus = <&mdio>;
>>
>>                 switch@0 {
>>                         #address-cells = <1>;
>>                         #size-cells = <0>;
>>                         reg = <0x11 0>;
>>
>>                         port@0 {
>>                                 reg = <0>;
>>                                 label = "lan1";
>>                         };
>>                         port@1 {
>>                                 reg = <1>;
>>                                 label = "lan2";
>>                         };
>>                         port@2 {
>>                                 reg = <2>;
>>                                 label = "lan3";
>>                         };
>>                         port@3 {
>>                                 reg = <3>;
>>                                 label = "lan4";
>>                         };
>>                         port@4 {
>>                                 reg = <4>;
>>                                 label = "wan";
>>                         };
>>                         port@5 {
>>                                 reg = <5>;
>>                                 label = "cpu";
>>                         };
>>                 };
>>          };
>>
>> The board I am using is based around an Marvell Armada 370. This device tree
>> setup looks pretty similar to the other Marvell boards using marvell,dsa.
> 
> This is the old Device Tree binding which goes through an unfortunately
> different code path while initializing all the dsa_switch_tree and
> dsa_switch structures, while we should definitively look into fixing
> this, would you mind trying to update your board using something similar
> to this commit:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4cb2ec8cad8c82cd7cfd19edcacd846861d6e703

Ok, converted the devicetree to use this form of dsa setup.
Does not show the problem, on first boot lan1 works properly.


> This would make you go through net/dsa/dsa2.c which is what most of us
> usually test. In the meantime we should probably start issuing warning
> messages when people use the old Device Tree binding to encourage them
> to migrate other.

Yeah, I had not noticed that the devicetree dsa setup had changed, I
have been using my devicetree for quite a few kernel versions.

Regards
Greg

^ permalink raw reply

* [PATCH] net: ftgmac100: Request clock and set speed
From: Joel Stanley @ 2017-10-10  4:49 UTC (permalink / raw)
  To: David S . Miller, Benjamin Herrenschmidt
  Cc: netdev, linux-kernel, Andrew Jeffery

According to the ASPEED datasheet, gigabit speeds require a clock of
100MHz or higher. Other speeds require 25MHz or higher.

Signed-off-by: Joel Stanley <joel@jms.id.au>
---
 drivers/net/ethernet/faraday/ftgmac100.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/net/ethernet/faraday/ftgmac100.c b/drivers/net/ethernet/faraday/ftgmac100.c
index 9ed8e4b81530..870ebd857978 100644
--- a/drivers/net/ethernet/faraday/ftgmac100.c
+++ b/drivers/net/ethernet/faraday/ftgmac100.c
@@ -21,6 +21,7 @@
 
 #define pr_fmt(fmt)	KBUILD_MODNAME ": " fmt
 
+#include <linux/clk.h>
 #include <linux/dma-mapping.h>
 #include <linux/etherdevice.h>
 #include <linux/ethtool.h>
@@ -59,6 +60,9 @@
 /* Min number of tx ring entries before stopping queue */
 #define TX_THRESHOLD		(MAX_SKB_FRAGS + 1)
 
+#define FTGMAC_100MHZ		100000000
+#define FTGMAC_25MHZ		25000000
+
 struct ftgmac100 {
 	/* Registers */
 	struct resource *res;
@@ -96,6 +100,7 @@ struct ftgmac100 {
 	struct napi_struct napi;
 	struct work_struct reset_task;
 	struct mii_bus *mii_bus;
+	struct clk *clk;
 
 	/* Link management */
 	int cur_speed;
@@ -142,18 +147,22 @@ static int ftgmac100_reset_mac(struct ftgmac100 *priv, u32 maccr)
 static int ftgmac100_reset_and_config_mac(struct ftgmac100 *priv)
 {
 	u32 maccr = 0;
+	int freq = 0;
 
 	switch (priv->cur_speed) {
 	case SPEED_10:
 	case 0: /* no link */
+		freq = FTGMAC_25MHZ;
 		break;
 
 	case SPEED_100:
 		maccr |= FTGMAC100_MACCR_FAST_MODE;
+		freq = FTGMAC_25MHZ;
 		break;
 
 	case SPEED_1000:
 		maccr |= FTGMAC100_MACCR_GIGA_MODE;
+		freq = FTGMAC_100MHZ;
 		break;
 	default:
 		netdev_err(priv->netdev, "Unknown speed %d !\n",
@@ -161,6 +170,9 @@ static int ftgmac100_reset_and_config_mac(struct ftgmac100 *priv)
 		break;
 	}
 
+	if (freq && priv->clk)
+		clk_set_rate(priv->clk, freq);
+
 	/* (Re)initialize the queue pointers */
 	priv->rx_pointer = 0;
 	priv->tx_clean_pointer = 0;
@@ -1775,6 +1787,13 @@ static int ftgmac100_probe(struct platform_device *pdev)
 	priv->dev = &pdev->dev;
 	INIT_WORK(&priv->reset_task, ftgmac100_reset_task);
 
+	/* Enable clock if present */
+	priv->clk = devm_clk_get(&pdev->dev, NULL);
+	if (!IS_ERR(priv->clk))
+		clk_prepare_enable(priv->clk);
+	else
+		priv->clk = NULL;
+
 	/* map io memory */
 	priv->res = request_mem_region(res->start, resource_size(res),
 				       dev_name(&pdev->dev));
@@ -1883,6 +1902,9 @@ static int ftgmac100_remove(struct platform_device *pdev)
 
 	unregister_netdev(netdev);
 
+	if (priv->clk)
+		clk_disable_unprepare(priv->clk);
+
 	/* There's a small chance the reset task will have been re-queued,
 	 * during stop, make sure it's gone before we free the structure.
 	 */
-- 
2.14.1

^ permalink raw reply related

* [PATCH iproute2] iplink: new option to set neigh suppression on a bridge port
From: Roopa Prabhu @ 2017-10-10  4:42 UTC (permalink / raw)
  To: stephen; +Cc: netdev, nikolay

From: Roopa Prabhu <roopa@cumulusnetworks.com>

neigh suppression can be used to suppress arp and nd flood
to bridge ports. It maps to the recently added
kernel support for bridge port flag IFLA_BRPORT_NEIGH_SUPPRESS.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
 bridge/link.c            | 13 +++++++++++++
 ip/iplink_bridge_slave.c |  8 ++++++++
 man/man8/bridge.8        |  4 ++++
 3 files changed, 25 insertions(+)

diff --git a/bridge/link.c b/bridge/link.c
index 93472ad..d3a211e 100644
--- a/bridge/link.c
+++ b/bridge/link.c
@@ -198,6 +198,9 @@ int print_linkinfo(const struct sockaddr_nl *who,
 				if (prtb[IFLA_BRPORT_MCAST_FLOOD])
 					print_onoff(fp, "mcast_flood",
 						    rta_getattr_u8(prtb[IFLA_BRPORT_MCAST_FLOOD]));
+				if (prtb[IFLA_BRPORT_NEIGH_SUPPRESS])
+					print_onoff(fp, "neigh_suppress",
+						    rta_getattr_u8(prtb[IFLA_BRPORT_NEIGH_SUPPRESS]));
 			}
 		} else
 			print_portstate(fp, rta_getattr_u8(tb[IFLA_PROTINFO]));
@@ -266,6 +269,7 @@ static int brlink_modify(int argc, char **argv)
 		.ifm.ifi_family = PF_BRIDGE,
 	};
 	char *d = NULL;
+	__s8 neigh_suppress = -1;
 	__s8 learning = -1;
 	__s8 learning_sync = -1;
 	__s8 flood = -1;
@@ -355,6 +359,11 @@ static int brlink_modify(int argc, char **argv)
 			flags |= BRIDGE_FLAGS_SELF;
 		} else if (strcmp(*argv, "master") == 0) {
 			flags |= BRIDGE_FLAGS_MASTER;
+		} else if (strcmp(*argv, "neigh_suppress") == 0) {
+			NEXT_ARG();
+			if (!on_off("neigh_suppress", &neigh_suppress,
+				    *argv))
+				return -1;
 		} else {
 			usage();
 		}
@@ -407,6 +416,10 @@ static int brlink_modify(int argc, char **argv)
 	if (state >= 0)
 		addattr8(&req.n, sizeof(req), IFLA_BRPORT_STATE, state);
 
+	if (neigh_suppress != -1)
+		addattr8(&req.n, sizeof(req), IFLA_BRPORT_NEIGH_SUPPRESS,
+			 neigh_suppress);
+
 	addattr_nest_end(&req.n, nest);
 
 	/* IFLA_AF_SPEC nested attribute. Contains IFLA_BRIDGE_FLAGS that
diff --git a/ip/iplink_bridge_slave.c b/ip/iplink_bridge_slave.c
index 80272b0..fdf8e89 100644
--- a/ip/iplink_bridge_slave.c
+++ b/ip/iplink_bridge_slave.c
@@ -238,6 +238,10 @@ static void bridge_slave_print_opt(struct link_util *lu, FILE *f,
 	if (tb[IFLA_BRPORT_MCAST_FLOOD])
 		_print_onoff(f, "mcast_flood", "mcast_flood",
 			     rta_getattr_u8(tb[IFLA_BRPORT_MCAST_FLOOD]));
+
+	if (tb[IFLA_BRPORT_NEIGH_SUPPRESS])
+		_print_onoff(f, "neigh_suppress", "neigh_suppress",
+			     rta_getattr_u8(tb[IFLA_BRPORT_NEIGH_SUPPRESS]));
 }
 
 static void bridge_slave_parse_on_off(char *arg_name, char *arg_val,
@@ -328,6 +332,10 @@ static int bridge_slave_parse_opt(struct link_util *lu, int argc, char **argv,
 			NEXT_ARG();
 			bridge_slave_parse_on_off("mcast_fast_leave", *argv, n,
 						  IFLA_BRPORT_FAST_LEAVE);
+		} else if (matches(*argv, "neigh_suppress") == 0) {
+			NEXT_ARG();
+			bridge_slave_parse_on_off("neigh_suppress", *argv, n,
+						  IFLA_BRPORT_NEIGH_SUPPRESS);
 		} else if (matches(*argv, "help") == 0) {
 			explain();
 			return -1;
diff --git a/man/man8/bridge.8 b/man/man8/bridge.8
index 9c5f855..fdba0fe 100644
--- a/man/man8/bridge.8
+++ b/man/man8/bridge.8
@@ -323,6 +323,10 @@ switch.
 Controls whether a given port will be flooded with multicast traffic for which there is no MDB entry. By default this flag is on.
 
 .TP
+.BR "neigh_suppress on " or " neigh_suppress off "
+Controls whether neigh discovery (arp and nd) proxy and suppression is enabled on the port. By default this flag is off.
+
+.TP
 .BI self
 link setting is configured on specified physical device
 
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH net-next] openvswitch: add ct_clear action
From: Pravin Shelar @ 2017-10-10  4:41 UTC (permalink / raw)
  To: Eric Garver; +Cc: Linux Kernel Network Developers, ovs dev
In-Reply-To: <20171006164426.9752-1-e@erig.me>

On Fri, Oct 6, 2017 at 9:44 AM, Eric Garver <e@erig.me> wrote:
> This adds a ct_clear action for clearing conntrack state. ct_clear is
> currently implemented in OVS userspace, but is not backed by an action
> in the kernel datapath. This is useful for flows that may modify a
> packet tuple after a ct lookup has already occurred.
>
> Signed-off-by: Eric Garver <e@erig.me>
Patch mostly looks good. I have following comments.

> ---
>  include/uapi/linux/openvswitch.h |  2 ++
>  net/openvswitch/actions.c        |  5 +++++
>  net/openvswitch/conntrack.c      | 12 ++++++++++++
>  net/openvswitch/conntrack.h      |  7 +++++++
>  net/openvswitch/flow_netlink.c   |  5 +++++
>  5 files changed, 31 insertions(+)
>
> diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
> index 156ee4cab82e..1b6e510e2cc6 100644
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
> @@ -806,6 +806,7 @@ struct ovs_action_push_eth {
>   * packet.
>   * @OVS_ACTION_ATTR_POP_ETH: Pop the outermost Ethernet header off the
>   * packet.
> + * @OVS_ACTION_ATTR_CT_CLEAR: Clear conntrack state from the packet.
>   *
>   * Only a single header can be set with a single %OVS_ACTION_ATTR_SET.  Not all
>   * fields within a header are modifiable, e.g. the IPv4 protocol and fragment
> @@ -835,6 +836,7 @@ enum ovs_action_attr {
>         OVS_ACTION_ATTR_TRUNC,        /* u32 struct ovs_action_trunc. */
>         OVS_ACTION_ATTR_PUSH_ETH,     /* struct ovs_action_push_eth. */
>         OVS_ACTION_ATTR_POP_ETH,      /* No argument. */
> +       OVS_ACTION_ATTR_CT_CLEAR,     /* No argument. */
>
>         __OVS_ACTION_ATTR_MAX,        /* Nothing past this will be accepted
>                                        * from userspace. */
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index a54a556fcdb5..db9c7f2e662b 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -1203,6 +1203,10 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>                                 return err == -EINPROGRESS ? 0 : err;
>                         break;
>
> +               case OVS_ACTION_ATTR_CT_CLEAR:
> +                       err = ovs_ct_clear(skb, key);
> +                       break;
> +
>                 case OVS_ACTION_ATTR_PUSH_ETH:
>                         err = push_eth(skb, key, nla_data(a));
>                         break;
> @@ -1210,6 +1214,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>                 case OVS_ACTION_ATTR_POP_ETH:
>                         err = pop_eth(skb, key);
>                         break;
> +
>                 }
Unrelated change.

>
>                 if (unlikely(err)) {
> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index d558e882ca0c..f9b73c726ad7 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -1129,6 +1129,18 @@ int ovs_ct_execute(struct net *net, struct sk_buff *skb,
>         return err;
>  }
>
> +int ovs_ct_clear(struct sk_buff *skb, struct sw_flow_key *key)
> +{
> +       if (skb_nfct(skb)) {
> +               nf_conntrack_put(skb_nfct(skb));
> +               nf_ct_set(skb, NULL, 0);
Can the new conntract state be appropriate? may be IP_CT_UNTRACKED?

> +       }
> +
> +       ovs_ct_fill_key(skb, key);
> +
I do not see need to refill the key if there is no skb-nf-ct.

> +       return 0;
> +}
> +
>  static int ovs_ct_add_helper(struct ovs_conntrack_info *info, const char *name,
>                              const struct sw_flow_key *key, bool log)
>  {

^ permalink raw reply

* Re: [PATCH 1/3] net/atm: Delete an error message for a failed memory allocation in five functions
From: Joe Perches @ 2017-10-10  4:18 UTC (permalink / raw)
  To: SF Markus Elfring, netdev, Alexey Dobriyan, Andrew Morton,
	Augusto Mecking Caringi, Bhumika Goyal, David S. Miller,
	David Windsor, Elena Reshetova, Hans Liljestrand, Jarod Wilson,
	Johannes Berg, Kees Cook, Mitchell Blank Jr, Roopa Prabhu
  Cc: LKML, kernel-janitors
In-Reply-To: <5270f15b-5e97-0c3e-3e55-fbded48ae07d@users.sourceforge.net>

On Mon, 2017-10-09 at 22:49 +0200, SF Markus Elfring wrote:
> Omit extra messages for a memory allocation failure in these functions.
[]
> diff --git a/net/atm/mpc.c b/net/atm/mpc.c
> @@ -184,10 +184,8 @@ struct atm_mpoa_qos *atm_mpoa_add_qos(__be32 dst_ip, struct atm_qos *qos)
>  	}
>  
>  	entry = kmalloc(sizeof(struct atm_mpoa_qos), GFP_KERNEL);
> -	if (entry == NULL) {
> -		pr_info("mpoa: out of memory\n");
> +	if (!entry)

Unspecified change.  Make sure your changelog is comprehensive.

>  		return entry;
> -	}

 
>  	entry->ipaddr = dst_ip;
>  	entry->qos = *qos;
> @@ -473,10 +471,8 @@ static const uint8_t *copy_macs(struct mpoa_client *mpc,
>  			kfree(mpc->mps_macs);
>  		mpc->number_of_mps_macs = 0;
>  		mpc->mps_macs = kmalloc(num_macs * ETH_ALEN, GFP_KERNEL);
> -		if (mpc->mps_macs == NULL) {
> -			pr_info("(%s) out of mem\n", mpc->dev->name);
> +		if (!mpc->mps_macs)
>  			return NULL;

etc...

^ permalink raw reply

* Re: [PATCH net-next 0/2] net: defer cgroups init to accept()
From: David Miller @ 2017-10-10  3:55 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet, hannes, tj, jsperbeck
In-Reply-To: <CANn89iKwxY8fiozVvRVhUTO14Cj_DBBoTHrEzYmPE2EwbZL8qw@mail.gmail.com>

From: Eric Dumazet <edumazet@google.com>
Date: Sun, 8 Oct 2017 21:47:49 -0700

> On Sun, Oct 8, 2017 at 9:44 PM, Eric Dumazet <edumazet@google.com> wrote:
>> After TCP 3WHS became lockless, we should not attempt cgroup games
>> from sk_clone_lock() since listener/cgroup might be already gone.
>>
>> Move this business to inet_csk_accept() where we have
>> the guarantee both parent and child exist.
>>
>> Many thanks to John Sperbeck for spotting these issues
>>
>> Eric Dumazet (2):
>>   net: memcontrol: defer call to mem_cgroup_sk_alloc()
>>   net: defer call to cgroup_sk_alloc()
> 
> This was based on net tree, but I used the wrong script, and thus this
> has the [PATCH net-next] tag.
> 
> Sorry for the confusion, but I guess this also can be applied to
> net-next since this is not a recent regression.

Series applied to 'net', thanks.

^ permalink raw reply

* Re: [PATCHv2] Add a driver for Renesas uPD60620 and uPD60620A PHYs
From: David Miller @ 2017-10-10  3:53 UTC (permalink / raw)
  To: bernd.edlinger; +Cc: netdev, andrew, f.fainelli
In-Reply-To: <AM5PR0701MB2657AEF45DB8CDEC543A8A11E4770@AM5PR0701MB2657.eurprd07.prod.outlook.com>

From: Bernd Edlinger <bernd.edlinger@hotmail.de>
Date: Sun, 8 Oct 2017 13:40:08 +0000

> Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [PATCH net-next] openvswitch: add ct_clear action
From: David Miller @ 2017-10-10  3:52 UTC (permalink / raw)
  To: e; +Cc: netdev, dev, pshelar
In-Reply-To: <20171006164426.9752-1-e@erig.me>

From: Eric Garver <e@erig.me>
Date: Fri,  6 Oct 2017 12:44:26 -0400

> This adds a ct_clear action for clearing conntrack state. ct_clear is
> currently implemented in OVS userspace, but is not backed by an action
> in the kernel datapath. This is useful for flows that may modify a
> packet tuple after a ct lookup has already occurred.
> 
> Signed-off-by: Eric Garver <e@erig.me>

Pravin et al., this needs your review.

Thanks.

^ permalink raw reply

* Re: [PATCH RFC 0/3] tun zerocopy stats
From: David Miller @ 2017-10-10  3:52 UTC (permalink / raw)
  To: willemdebruijn.kernel; +Cc: netdev, mst, jasowang, willemb
In-Reply-To: <20171006222516.90654-1-willemdebruijn.kernel@gmail.com>

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Fri,  6 Oct 2017 18:25:13 -0400

> From: Willem de Bruijn <willemb@google.com>
> 
> Add zerocopy transfer statistics to the vhost_net/tun zerocopy path.
> 
> I've been using this to verify recent changes to zerocopy tuning [1].
> Sharing more widely, as it may be useful in similar future work.
> 
> Use ethtool stats as interface, as these are defined per device
> driver and can easily be extended.
> 
> Make the zerocopy release callback take an extra hop through the tun
> driver to allow the driver to increment its counters.
> 
> Care must be taken to avoid adding an alloc/free to this hot path.
> Since the caller already must allocate a ubuf_info, make it allocate
> two at a time and grant one to the tun device.
> 
>  1/3: introduce ethtool stats (`ethtool -S $DEV`) for tun devices
>  2/3: add zerocopy tx and tx_err counters
>  3/3: convert vhost_net to pass a pair of ubuf_info to tun
> 
> [1] http://patchwork.ozlabs.org/patch/822613/

This looks mostly fine to me, but I don't know enough about how vhost
and tap interact to tell whether this makes sense to upstream.

What are the runtime costs for these new statistics?

^ permalink raw reply

* Re: [PATCH net-next v2] vhost_net: do not stall on zerocopy depletion
From: David Miller @ 2017-10-10  3:47 UTC (permalink / raw)
  To: willemdebruijn.kernel; +Cc: netdev, mst, jasowang, den, virtualization, willemb
In-Reply-To: <20171006172231.87435-1-willemdebruijn.kernel@gmail.com>

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Fri,  6 Oct 2017 13:22:31 -0400

> From: Willem de Bruijn <willemb@google.com>
> 
> Vhost-net has a hard limit on the number of zerocopy skbs in flight.
> When reached, transmission stalls. Stalls cause latency, as well as
> head-of-line blocking of other flows that do not use zerocopy.
> 
> Instead of stalling, revert to copy-based transmission.
> 
> Tested by sending two udp flows from guest to host, one with payload
> of VHOST_GOODCOPY_LEN, the other too small for zerocopy (1B). The
> large flow is redirected to a netem instance with 1MBps rate limit
> and deep 1000 entry queue.
> 
>   modprobe ifb
>   ip link set dev ifb0 up
>   tc qdisc add dev ifb0 root netem limit 1000 rate 1MBit
> 
>   tc qdisc add dev tap0 ingress
>   tc filter add dev tap0 parent ffff: protocol ip \
>       u32 match ip dport 8000 0xffff \
>       action mirred egress redirect dev ifb0
> 
> Before the delay, both flows process around 80K pps. With the delay,
> before this patch, both process around 400. After this patch, the
> large flow is still rate limited, while the small reverts to its
> original rate. See also discussion in the first link, below.
> 
> Without rate limiting, {1, 10, 100}x TCP_STREAM tests continued to
> send at 100% zerocopy.
> 
> The limit in vhost_exceeds_maxpend must be carefully chosen. With
> vq->num >> 1, the flows remain correlated. This value happens to
> correspond to VHOST_MAX_PENDING for vq->num == 256. Allow smaller
> fractions and ensure correctness also for much smaller values of
> vq->num, by testing the min() of both explicitly. See also the
> discussion in the second link below.
> 
> Changes
>   v1 -> v2
>     - replaced min with typed min_t
>     - avoid unnecessary whitespace change
> 
> Link:http://lkml.kernel.org/r/CAF=yD-+Wk9sc9dXMUq1+x_hh=3ThTXa6BnZkygP3tgVpjbp93g@mail.gmail.com
> Link:http://lkml.kernel.org/r/20170819064129.27272-1-den@klaipeden.com
> Signed-off-by: Willem de Bruijn <willemb@google.com>

Applied, thanks Willem.

^ permalink raw reply

* Re: [PATCHv2 net-next] openvswitch: Add erspan tunnel support.
From: David Miller @ 2017-10-10  3:46 UTC (permalink / raw)
  To: u9012063; +Cc: netdev, pshelar
In-Reply-To: <1507161792-18340-1-git-send-email-u9012063@gmail.com>

From: William Tu <u9012063@gmail.com>
Date: Wed,  4 Oct 2017 17:03:12 -0700

> Add erspan netlink interface for OVS.
> 
> Signed-off-by: William Tu <u9012063@gmail.com>
> Cc: Pravin B Shelar <pshelar@ovn.org>
> ---
> v1->v2: remove unnecessary compat code.

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next] once: switch to new jump label API
From: David Miller @ 2017-10-10  3:26 UTC (permalink / raw)
  To: ebiggers3; +Cc: netdev, linux-kernel, hannes, jbaron, peterz, ebiggers
In-Reply-To: <20171009213052.97771-1-ebiggers3@gmail.com>

From: Eric Biggers <ebiggers3@gmail.com>
Date: Mon,  9 Oct 2017 14:30:52 -0700

> From: Eric Biggers <ebiggers@google.com>
> 
> Switch the DO_ONCE() macro from the deprecated jump label API to the new
> one.  The new one is more readable, and for DO_ONCE() it also makes the
> generated code more icache-friendly: now the one-time initialization
> code is placed out-of-line at the jump target, rather than at the inline
> fallthrough case.
> 
> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Signed-off-by: Eric Biggers <ebiggers@google.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH net-next] ipv6: use rcu_dereference_bh() in ipv6_route_seq_next()
From: David Miller @ 2017-10-10  3:00 UTC (permalink / raw)
  To: weiwan; +Cc: netdev, edumazet, kafai
In-Reply-To: <20171010001726.77046-1-tracywwnj@gmail.com>

From: Wei Wang <weiwan@google.com>
Date: Mon,  9 Oct 2017 17:17:26 -0700

> From: Wei Wang <weiwan@google.com>
> 
> This patch replaces rcu_deference() with rcu_dereference_bh() in
> ipv6_route_seq_next() to avoid the following warning:
 ...
> Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
> Reported-by: Xiaolong Ye <xiaolong.ye@intel.com>
> Signed-off-by: Wei Wang <weiwan@google.com>
> Acked-by: Eric Dumazet <edumazet@google.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH 1/1] xdp: Sample xdp program implementing ip forward
From: Jacob, Christina @ 2017-10-10  2:24 UTC (permalink / raw)
  To: Daniel Borkmann, netdev@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	alexei.starovoitov@gmail.com
In-Reply-To: <DM5PR07MB346826EDCF6C5F2B1287D1578A710@DM5PR07MB3468.namprd07.prod.outlook.com>

Sorry for the late reply. I will include the suggested changes in the next revision of the patch.

Please see inline for clarifications and questions.


Thanks,

Christina


________________________________
From: Daniel Borkmann <daniel@iogearbox.net>
Sent: Tuesday, October 3, 2017 9:24 PM
To: Jacob, Christina; netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org; alexei.starovoitov@gmail.com
Subject: Re: [PATCH 1/1] xdp: Sample xdp program implementing ip forward

>On 10/03/2017 09:37 AM, cjacob wrote:
>> Implements port to port forwarding with route table and arp table
>> lookup for ipv4 packets using bpf_redirect helper function and
>> lpm_trie  map.
>>
>> Signed-off-by: cjacob <Christina.Jacob@cavium.com>
>
>Thanks for the patch, just few minor comments below!
>
>Note, should be full name, e.g.:
>
>   Signed-off-by: Christina Jacob <Christina.Jacob@cavium.com>
>
>Also you From: only shows 'cjacob' as can be seen from the cover letter
>as well, so perhaps check your git settings to make that full name:
>
>   cjacob (1):
>     xdp: Sample xdp program implementing ip forward
>
>If there's one single patch, then cover letter is not needed, only
>for >1 sets.
>
>[...]
>> +#define KBUILD_MODNAME "foo"
>> +#include <uapi/linux/bpf.h>
>> +#include <linux/in.h>
>> +#include <linux/if_ether.h>
>> +#include <linux/if_packet.h>
>> +#include <linux/if_vlan.h>
>> +#include <linux/ip.h>
>> +#include <linux/ipv6.h>
>> +#include "bpf_helpers.h"
>> +#include <linux/slab.h>
>> +#include <net/ip_fib.h>
>> +
>> +struct trie_value {
>> +     __u8 prefix[4];
>> +     long value;
>> +     int gw;
>> +     int ifindex;
>> +     int metric;
>> +};
>> +
>> +union key_4 {
>> +     u32 b32[2];
>> +     u8 b8[8];
>> +};
>> +
>> +struct arp_entry {
>> +     int dst;
>> +     long mac;
>> +};
>> +
>> +struct direct_map {
>> +     long mac;
>> +     int ifindex;
>> +     struct arp_entry arp;
>> +};
>> +
>> +/* Map for trie implementation*/
>> +struct bpf_map_def SEC("maps") lpm_map = {
>> +     .type = BPF_MAP_TYPE_LPM_TRIE,
>> +     .key_size = 8,
>> +     .value_size =
>> +             sizeof(struct trie_value),
>
>(Nit: there are couple of such breaks throughout the patch, can we
>  just use single line for such cases where reasonable?)
>
>> +     .max_entries = 50,
>> +     .map_flags = BPF_F_NO_PREALLOC,
>> +};
>> +
>> +/* Map for counter*/
>> +struct bpf_map_def SEC("maps") rxcnt = {
>> +     .type = BPF_MAP_TYPE_PERCPU_ARRAY,
>> +     .key_size = sizeof(u32),
>> +     .value_size = sizeof(long),
>> +     .max_entries = 256,
>> +};
>> +
>> +/* Map for ARP table*/
>> +struct bpf_map_def SEC("maps") arp_table = {
>> +     .type = BPF_MAP_TYPE_HASH,
>> +     .key_size = sizeof(int),
>> +     .value_size = sizeof(long),
>
>Perhaps these should be proper structs here, such that it
>becomes easier to read/handle later on lookup.
>

I am not clear about this. I am defining a ebpf map.
I did not understand what structure you are refering to
Am I missing something here?.

>> +     .max_entries = 50,
>> +};
>> +
>> +/* Map to keep the exact match entries in the route table*/
>> +struct bpf_map_def SEC("maps") exact_match = {
>> +     .type = BPF_MAP_TYPE_HASH,
>> +     .key_size = sizeof(int),
>> +     .value_size = sizeof(struct direct_map),
>> +     .max_entries = 50,
>> +};
>> +
>> +/**
>> + * Function to set source and destination mac of the packet
>> + */
>> +static inline void set_src_dst_mac(void *data, void *src, void *dst)
>> +{
>> +     unsigned short *p      = data;
>> +     unsigned short *dest   = dst;
>> +     unsigned short *source = src;
>> +
>> +     p[3] = source[0];
>> +     p[4] = source[1];
>> +     p[5] = source[2];
>> +     p[0] = dest[0];
>> +     p[1] = dest[1];
>> +     p[2] = dest[2];
>
>You could just use __builtin_memcpy() given length is
>constant anyway, so LLVM will do the inlining.
>
>> +}
>> +
>> +/**
>> + * Parse IPV4 packet to get SRC, DST IP and protocol
>> + */
>> +static inline int parse_ipv4(void *data, u64 nh_off, void *data_end,
>> +                          unsigned int *src, unsigned int *dest)
>> +{
>> +     struct iphdr *iph = data + nh_off;
>> +
>> +     if (iph + 1 > data_end)
>> +             return 0;
>> +     *src = (unsigned int)iph->saddr;
>> +     *dest = (unsigned int)iph->daddr;
>
>Why not stay with __be32 types?
>
>> +     return iph->protocol;
>> +}
>> +
>> +SEC("xdp3")
>> +int xdp_prog3(struct xdp_md *ctx)
>> +{
>> +     void *data_end = (void *)(long)ctx->data_end;
>> +     void *data = (void *)(long)ctx->data;
>> +     struct ethhdr *eth = data;
>> +     int rc = XDP_DROP, forward_to;
>> +     long *value;
>> +     struct trie_value *prefix_value;
>> +     long *dest_mac = NULL, *src_mac = NULL;
>> +     u16 h_proto;
>> +     u64 nh_off;
>> +     u32 ipproto;
>> +     union key_4 key4;
>> +
>> +     nh_off = sizeof(*eth);
>> +     if (data + nh_off > data_end)
>> +             return rc;
>> +
>> +     h_proto = eth->h_proto;
>> +
>> +     if (h_proto == htons(ETH_P_8021Q) || h_proto == htons(ETH_P_8021AD)) {
>> +             struct vlan_hdr *vhdr;
>> +
>> +             vhdr = data + nh_off;
>> +             nh_off += sizeof(struct vlan_hdr);
>> +             if (data + nh_off > data_end)
>> +                     return rc;
>> +             h_proto = vhdr->h_vlan_encapsulated_proto;
>> +     }
>> +     if (h_proto == htons(ETH_P_ARP)) {
>> +             return XDP_PASS;
>> +     } else if (h_proto == htons(ETH_P_IP)) {
>> +             int src_ip = 0, dest_ip = 0;
>> +             struct direct_map *direct_entry;
>> +
>> +             ipproto = parse_ipv4(data, nh_off, data_end, &src_ip, &dest_ip);
>> +             direct_entry = (struct direct_map *)bpf_map_lookup_elem
>> +                     (&exact_match, &dest_ip);
>> +             /*check for exact match, this would give a faster lookup*/
>> +             if (direct_entry && direct_entry->mac &&
>> +                 direct_entry->arp.mac) {
>> +                     src_mac = &direct_entry->mac;
>> +                     dest_mac = &direct_entry->arp.mac;
>> +                     forward_to = direct_entry->ifindex;
>> +             } else {
>> +                     /*Look up in the trie for lpm*/
>> +                     // Key for trie
>
>Nit: please check style throughout the patch.
>
>> +                     key4.b32[0] = 32;
>> +                     key4.b8[4] = dest_ip % 0x100;
>> +                     key4.b8[5] = (dest_ip >> 8) % 0x100;
>> +                     key4.b8[6] = (dest_ip >> 16) % 0x100;
>> +                     key4.b8[7] = (dest_ip >> 24) % 0x100;
>> +                     prefix_value =
>> +                             ((struct trie_value *)bpf_map_lookup_elem
>> +                              (&lpm_map, &key4));
>
>For key, please use proper struct bpf_lpm_trie_key, see also
>usage example in tools/testing/selftests/bpf/test_lpm_map.c
>for LPM handling.
>

I am following the way how it is done in the kernel program of other sample programs.
Can we do dynamic memory allocation in ebpf kernel program. I am getting invalid instruction errors in runtime.

>> +                     if (!prefix_value) {
>> +                             return XDP_DROP;
>> +                     } else {
>> +                             src_mac = &prefix_value->value;
>> +                             if (src_mac) {
>> +                                     dest_mac = (long *)bpf_map_lookup_elem
>> +                                             (&arp_table, &dest_ip);
>> +                                     if (!dest_mac) {
>> +                                             if (prefix_value->gw) {
>> +                                                     dest_ip = *(unsigned int *)(&(prefix_value->gw));
>> +                                                     dest_mac = (long *)bpf_map_lookup_elem
>> +                                                             (&arp_table, &dest_ip);
>> +                                             } else {
>> +                                                     return XDP_DROP;
>> +                                             }
>> +                                     }
>> +                                     forward_to = prefix_value->ifindex;
>> +                             } else {
>> +                                     return XDP_DROP;
>> +                             }
>> +                     }
>> +             }
>> +     } else {
>> +             ipproto = 0;
>> +     }
>> +     if (src_mac && dest_mac) {
>> +             set_src_dst_mac(data, src_mac,
>> +                             dest_mac);
>> +             value = bpf_map_lookup_elem
>> +                     (&rxcnt, &ipproto);
>> +             if (value)
>> +                     *value += 1;
>> +             return  bpf_redirect(
>> +                                  forward_to,
>> +                                  0);
>> +     }
>> +     return rc;

^ permalink raw reply

* Re: [PATCH net-next RFC 4/9] net: dsa: mv88e6xxx: add support for event capture
From: Richard Cochran @ 2017-10-10  1:53 UTC (permalink / raw)
  To: Levi Pearson
  Cc: Brandon Streiff, Linux Kernel Network Developers, linux-kernel,
	David S. Miller, Florian Fainelli, Andrew Lunn, Vivien Didelot,
	Erik Hons
In-Reply-To: <CAEYbN3TpMVPVapmLNCPj69pxP9b_1y8VizS3XvccskcWOrpwEw@mail.gmail.com>

On Mon, Oct 09, 2017 at 04:08:50PM -0600, Levi Pearson wrote:
> Another issue related to this is that while the free-running counter
> in the hardware can't be easily adjusted, the periodic event generator
> *can* be finely adjusted (via picosecond and sub-picosecond
> accumulators) to correct for drift between the local clock and the PTP
> grandmaster time. So to be semantically correct, this needs to be both
> started at the right time *and* it needs to have the periodic
> corrections made so that the fine correction parameters in the
> hardware keep it adjusted to be synchronous with PTP grandmaster time.

So if the accumulators are safe to adjust on the fly, then the
adjfine() method will have to program them with every adjustment.

Thanks,
Richard

^ permalink raw reply

* Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
From: Levin, Alexander (Sasha Levin) @ 2017-10-10  1:14 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Tim Hansen, davem@davemloft.net, willemb@google.com,
	edumazet@google.com, soheil@google.com, pabeni@redhat.com,
	elena.reshetova@intel.com, tom@quantonium.net, Jason@zx2c4.com,
	fw@strlen.de, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <20171009232355.5mcd3fj7gjhenv25@ast-mbp>

On Mon, Oct 09, 2017 at 04:23:57PM -0700, Alexei Starovoitov wrote:
>On Mon, Oct 09, 2017 at 11:15:40PM +0000, Levin, Alexander (Sasha Levin) wrote:
>> On Mon, Oct 09, 2017 at 04:06:20PM -0700, Alexei Starovoitov wrote:
>> >On Mon, Oct 09, 2017 at 08:26:34PM +0000, Levin, Alexander (Sasha Levin) wrote:
>> >> On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote:
>> >> >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote:
>> >> >> Fix BUG() calls to use BUG_ON(conditional) macros.
>> >> >>
>> >> >> This was found using make coccicheck M=net/core on linux next
>> >> >> tag next-2017092
>> >> >>
>> >> >> Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
>> >> >> ---
>> >> >>  net/core/skbuff.c | 15 ++++++---------
>> >> >>  1 file changed, 6 insertions(+), 9 deletions(-)
>> >> >>
>> >> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> >> >> index d98c2e3ce2bf..34ce4c1a0f3c 100644
>> >> >> --- a/net/core/skbuff.c
>> >> >> +++ b/net/core/skbuff.c
>> >> >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t gfp_mask)
>> >> >>  	/* Set the tail pointer and length */
>> >> >>  	skb_put(n, skb->len);
>> >> >>
>> >> >> -	if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
>> >> >> -		BUG();
>> >> >> +	BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
>> >> >
>> >> >I'm concerned with this change.
>> >> >1. Calling non-trivial bit of code inside the macro is a poor coding style (imo)
>> >> >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON and implementation
>> >> >of BUG and BUG_ON look quite different.
>> >>
>> >> For these archs, wouldn't it then be more efficient to use BUG_ON rather than BUG()?
>> >
>> >why more efficient? any data to prove that?
>>
>> Just guessing.
>>
>> Either way, is there a particular reason for not using BUG_ON() here
>> besides that it's implementation is "quite different"?
>>
>> >I'm pointing that the change is not equivalent and
>> >this code has been around forever (pre-git days), so I see
>> >no reason to risk changing it.
>>
>> Do you know that BUG_ON() is broken on any archs?
>>
>> If not, "this code has been around forever" is really not an excuse to
>> not touch code.
>>
>> If BUG_ON() behavior is broken somewhere, then it needs to get fixed.
>
>no idea whether it's broken. My main objection is #1.
>imo it's a very poor coding style to put functions with
>side-effects into macros. Especially debug/bug/warn-like.

This, however, seems to be an accepted coding style in the net/
subsys. Look a few lines lower in the very same file to find:

	BUG_ON(skb_copy_bits(from, 0, skb_put(to, len), len));

Side effects ahoy ;)

>For example llvm has DEBUG() macro and everything inside
>will disappear depending on compilation flags.
>I wouldn't be surprised if somebody for the name
>of security (to avoid crash on BUG_ON) will replace
>BUG/BUG_ON with some other implementation or nop
>and will have real bugs, since skb_copy_bits() is somehow
>not called or called in different context.

This was already discussed, with the conclusion that BUG() can never
be disabled, since, as you described, it'll lead to very catastrophic
results. See i.e.:

	commit b06dd879f5db33c1d7f5ab516ea671627f99c0c9
	Author: Josh Triplett <josh@joshtriplett.org>
	Date:   Mon Apr 7 15:39:14 2014 -0700

    		x86: always define BUG() and HAVE_ARCH_BUG, even with !CONFIG_BUG


Anyway, as you said, this boils down to coding style nitpicking. I
guess that my only comment here would be that it shpid go one way or
the other and we document the decision somewhere (either per subsys,
or for the entire tree).

-- 

Thanks,
Sasha

^ permalink raw reply

* Re: [net-next 00/14][pull request] 40GbE Intel Wired LAN Driver Updates 2017-10-09
From: David Miller @ 2017-10-10  1:12 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, nhorman, sassmann, jogreene
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Mon,  9 Oct 2017 15:38:27 -0700

> This series contains updates to i40e and i40evf only.

Pulled, thanks Jeff.

^ permalink raw reply

* Re: [PATCH] timer: Remove meaningless .data/.function assignments
From: David Miller @ 2017-10-10  1:05 UTC (permalink / raw)
  To: keescook
  Cc: devel, gregkh, linux-wireless, linux-kernel, axboe,
	ganesh.krishna, netdev, aditya.shankar, tglx, khc
In-Reply-To: <20171010001032.GA119829@beast>

From: Kees Cook <keescook@chromium.org>
Date: Mon, 9 Oct 2017 17:10:32 -0700

> Several timer users needlessly reset their .function/.data fields during
> their timer callback, but nothing else changes them. Some users do not
> use their .data field at all. Each instance is removed here.
> 
> Cc: Krzysztof Halasa <khc@pm.waw.pl>
> Cc: Aditya Shankar <aditya.shankar@microchip.com>
> Cc: Ganesh Krishna <ganesh.krishna@microchip.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Jens Axboe <axboe@fb.com>
> Cc: netdev@vger.kernel.org
> Cc: linux-wireless@vger.kernel.org
> Cc: devel@driverdev.osuosl.org
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> # for staging
> Acked-by: Krzysztof Halasa <khc@pm.waw.pl> # for wan/hdlc*
> Acked-by: Jens Axboe <axboe@kernel.dk> # for amiflop
> ---
> This should go via the timer/core tree, please. It's been acked by each
> of the maintainers. Thanks!

For networking bits:

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply

* [PATCH RFC tip/core/rcu 14/15] netfilter: Remove now-redundant smp_read_barrier_depends()
From: Paul E. McKenney @ 2017-10-10  0:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, torvalds, mark.rutland, dhowells, linux-arch, peterz,
	will.deacon, Paul E. McKenney, Pablo Neira Ayuso,
	Jozsef Kadlecsik, Florian Westphal, David S. Miller,
	netfilter-devel, coreteam, netdev
In-Reply-To: <20171010001951.GA6476@linux.vnet.ibm.com>

READ_ONCE() now implies smp_read_barrier_depends(), which means that
the instances in arpt_do_table(), ipt_do_table(), and ip6t_do_table()
are now redundant.  This commit removes them and adjusts the comments.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: <netfilter-devel@vger.kernel.org>
Cc: <coreteam@netfilter.org>
Cc: <netdev@vger.kernel.org>
---
 net/ipv4/netfilter/arp_tables.c | 7 +------
 net/ipv4/netfilter/ip_tables.c  | 7 +------
 net/ipv6/netfilter/ip6_tables.c | 7 +------
 3 files changed, 3 insertions(+), 18 deletions(-)

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 9e2770fd00be..d555b3b31c49 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -202,13 +202,8 @@ unsigned int arpt_do_table(struct sk_buff *skb,
 
 	local_bh_disable();
 	addend = xt_write_recseq_begin();
-	private = table->private;
+	private = READ_ONCE(table->private); /* Address dependency. */
 	cpu     = smp_processor_id();
-	/*
-	 * Ensure we load private-> members after we've fetched the base
-	 * pointer.
-	 */
-	smp_read_barrier_depends();
 	table_base = private->entries;
 	jumpstack  = (struct arpt_entry **)private->jumpstack[cpu];
 
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 39286e543ee6..f63752bec442 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -260,13 +260,8 @@ ipt_do_table(struct sk_buff *skb,
 	WARN_ON(!(table->valid_hooks & (1 << hook)));
 	local_bh_disable();
 	addend = xt_write_recseq_begin();
-	private = table->private;
+	private = READ_ONCE(table->private); /* Address dependency. */
 	cpu        = smp_processor_id();
-	/*
-	 * Ensure we load private-> members after we've fetched the base
-	 * pointer.
-	 */
-	smp_read_barrier_depends();
 	table_base = private->entries;
 	jumpstack  = (struct ipt_entry **)private->jumpstack[cpu];
 
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 01bd3ee5ebc6..52afcab9b0d6 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -282,12 +282,7 @@ ip6t_do_table(struct sk_buff *skb,
 
 	local_bh_disable();
 	addend = xt_write_recseq_begin();
-	private = table->private;
-	/*
-	 * Ensure we load private-> members after we've fetched the base
-	 * pointer.
-	 */
-	smp_read_barrier_depends();
+	private = READ_ONCE(table->private); /* Address dependency. */
 	cpu        = smp_processor_id();
 	table_base = private->entries;
 	jumpstack  = (struct ip6t_entry **)private->jumpstack[cpu];
-- 
2.5.2

^ permalink raw reply related

* [PATCH RFC tip/core/rcu 03/15] drivers/net/ethernet/qlogic/qed: Fix __qed_spq_block() ordering
From: Paul E. McKenney @ 2017-10-10  0:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, torvalds, mark.rutland, dhowells, linux-arch, peterz,
	will.deacon, Paul E. McKenney, Ariel Elior, everest-linux-l2,
	netdev
In-Reply-To: <20171010001951.GA6476@linux.vnet.ibm.com>

The __qed_spq_block() function expects an smp_read_barrier_depends()
to order a prior READ_ONCE() against a later load that does not depend
on the prior READ_ONCE(), an expectation that can fail to be met.
This commit therefore replaces the READ_ONCE() with smp_load_acquire()
and removes the smp_read_barrier_depends().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ariel Elior <Ariel.Elior@cavium.com>
Cc: <everest-linux-l2@cavium.com>
Cc: <netdev@vger.kernel.org>
---
 drivers/net/ethernet/qlogic/qed/qed_spq.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_spq.c b/drivers/net/ethernet/qlogic/qed/qed_spq.c
index be48d9abd001..c1237ec58b6c 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_spq.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_spq.c
@@ -97,9 +97,7 @@ static int __qed_spq_block(struct qed_hwfn *p_hwfn,
 
 	while (iter_cnt--) {
 		/* Validate we receive completion update */
-		if (READ_ONCE(comp_done->done) == 1) {
-			/* Read updated FW return value */
-			smp_read_barrier_depends();
+		if (smp_load_acquire(&comp_done->done) == 1) { /* ^^^ */
 			if (p_fw_ret)
 				*p_fw_ret = comp_done->fw_return_code;
 			return 0;
-- 
2.5.2

^ permalink raw reply related

* [PATCH net-next] ipv6: use rcu_dereference_bh() in ipv6_route_seq_next()
From: Wei Wang @ 2017-10-10  0:17 UTC (permalink / raw)
  To: David Miller, netdev; +Cc: Eric Dumazet, Martin KaFai Lau, Wei Wang

From: Wei Wang <weiwan@google.com>

This patch replaces rcu_deference() with rcu_dereference_bh() in
ipv6_route_seq_next() to avoid the following warning:

[   19.431685] WARNING: suspicious RCU usage
[   19.433451] 4.14.0-rc3-00914-g66f5d6c #118 Not tainted
[   19.435509] -----------------------------
[   19.437267] net/ipv6/ip6_fib.c:2259 suspicious
rcu_dereference_check() usage!
[   19.440790]
[   19.440790] other info that might help us debug this:
[   19.440790]
[   19.444734]
[   19.444734] rcu_scheduler_active = 2, debug_locks = 1
[   19.447757] 2 locks held by odhcpd/3720:
[   19.449480]  #0:  (&p->lock){+.+.}, at: [<ffffffffb1231f7d>]
seq_read+0x3c/0x333
[   19.452720]  #1:  (rcu_read_lock_bh){....}, at: [<ffffffffb1d2b984>]
ipv6_route_seq_start+0x5/0xfd
[   19.456323]
[   19.456323] stack backtrace:
[   19.458812] CPU: 0 PID: 3720 Comm: odhcpd Not tainted
4.14.0-rc3-00914-g66f5d6c #118
[   19.462042] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.10.2-1 04/01/2014
[   19.465414] Call Trace:
[   19.466788]  dump_stack+0x86/0xc0
[   19.468358]  lockdep_rcu_suspicious+0xea/0xf3
[   19.470183]  ipv6_route_seq_next+0x71/0x164
[   19.471963]  seq_read+0x244/0x333
[   19.473522]  proc_reg_read+0x48/0x67
[   19.475152]  ? proc_reg_write+0x67/0x67
[   19.476862]  __vfs_read+0x26/0x10b
[   19.478463]  ? __might_fault+0x37/0x84
[   19.480148]  vfs_read+0xba/0x146
[   19.481690]  SyS_read+0x51/0x8e
[   19.483197]  do_int80_syscall_32+0x66/0x15a
[   19.484969]  entry_INT80_compat+0x32/0x50
[   19.486707] RIP: 0023:0xf7f0be8e
[   19.488244] RSP: 002b:00000000ffa75d04 EFLAGS: 00000246 ORIG_RAX:
0000000000000003
[   19.491431] RAX: ffffffffffffffda RBX: 0000000000000009 RCX:
0000000008056068
[   19.493886] RDX: 0000000000001000 RSI: 0000000008056008 RDI:
0000000000001000
[   19.496331] RBP: 00000000000001ff R08: 0000000000000000 R09:
0000000000000000
[   19.498768] R10: 0000000000000000 R11: 0000000000000000 R12:
0000000000000000
[   19.501217] R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000

Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
Reported-by: Xiaolong Ye <xiaolong.ye@intel.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv6/ip6_fib.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 52a29ba32928..c2ecd5ec638a 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -2262,7 +2262,7 @@ static void *ipv6_route_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 	if (!v)
 		goto iter_table;
 
-	n = rcu_dereference(((struct rt6_info *)v)->dst.rt6_next);
+	n = rcu_dereference_bh(((struct rt6_info *)v)->dst.rt6_next);
 	if (n) {
 		++*pos;
 		return n;
-- 
2.14.2.920.gcf0c67979c-goog

^ permalink raw reply related

* [PATCH] timer: Remove meaningless .data/.function assignments
From: Kees Cook @ 2017-10-10  0:10 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: devel, netdev, linux-wireless, linux-kernel, Jens Axboe,
	Ganesh Krishna, Greg Kroah-Hartman, Aditya Shankar,
	Krzysztof Halasa

Several timer users needlessly reset their .function/.data fields during
their timer callback, but nothing else changes them. Some users do not
use their .data field at all. Each instance is removed here.

Cc: Krzysztof Halasa <khc@pm.waw.pl>
Cc: Aditya Shankar <aditya.shankar@microchip.com>
Cc: Ganesh Krishna <ganesh.krishna@microchip.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: netdev@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: devel@driverdev.osuosl.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> # for staging
Acked-by: Krzysztof Halasa <khc@pm.waw.pl> # for wan/hdlc*
Acked-by: Jens Axboe <axboe@kernel.dk> # for amiflop
---
This should go via the timer/core tree, please. It's been acked by each
of the maintainers. Thanks!
---
 drivers/block/amiflop.c                           | 3 +--
 drivers/net/wan/hdlc_cisco.c                      | 2 --
 drivers/net/wan/hdlc_fr.c                         | 2 --
 drivers/staging/wilc1000/wilc_wfi_cfgoperations.c | 4 +---
 4 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/block/amiflop.c b/drivers/block/amiflop.c
index 49908c74bfcb..4e3fb9f104af 100644
--- a/drivers/block/amiflop.c
+++ b/drivers/block/amiflop.c
@@ -323,7 +323,7 @@ static void fd_deselect (int drive)
 
 }
 
-static void motor_on_callback(unsigned long nr)
+static void motor_on_callback(unsigned long ignored)
 {
 	if (!(ciaa.pra & DSKRDY) || --on_attempts == 0) {
 		complete_all(&motor_on_completion);
@@ -344,7 +344,6 @@ static int fd_motor_on(int nr)
 		fd_select(nr);
 
 		reinit_completion(&motor_on_completion);
-		motor_on_timer.data = nr;
 		mod_timer(&motor_on_timer, jiffies + HZ/2);
 
 		on_attempts = 10;
diff --git a/drivers/net/wan/hdlc_cisco.c b/drivers/net/wan/hdlc_cisco.c
index a408abc25512..f4b0ab34f048 100644
--- a/drivers/net/wan/hdlc_cisco.c
+++ b/drivers/net/wan/hdlc_cisco.c
@@ -276,8 +276,6 @@ static void cisco_timer(unsigned long arg)
 	spin_unlock(&st->lock);
 
 	st->timer.expires = jiffies + st->settings.interval * HZ;
-	st->timer.function = cisco_timer;
-	st->timer.data = arg;
 	add_timer(&st->timer);
 }
 
diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c
index 78596e42a3f3..07f265fa2826 100644
--- a/drivers/net/wan/hdlc_fr.c
+++ b/drivers/net/wan/hdlc_fr.c
@@ -644,8 +644,6 @@ static void fr_timer(unsigned long arg)
 			state(hdlc)->settings.t391 * HZ;
 	}
 
-	state(hdlc)->timer.function = fr_timer;
-	state(hdlc)->timer.data = arg;
 	add_timer(&state(hdlc)->timer);
 }
 
diff --git a/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c b/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
index ac5aaafa461c..60f088babf27 100644
--- a/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
+++ b/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
@@ -266,7 +266,7 @@ static void update_scan_time(void)
 		last_scanned_shadow[i].time_scan = jiffies;
 }
 
-static void remove_network_from_shadow(unsigned long arg)
+static void remove_network_from_shadow(unsigned long unused)
 {
 	unsigned long now = jiffies;
 	int i, j;
@@ -287,7 +287,6 @@ static void remove_network_from_shadow(unsigned long arg)
 	}
 
 	if (last_scanned_cnt != 0) {
-		hAgingTimer.data = arg;
 		mod_timer(&hAgingTimer, jiffies + msecs_to_jiffies(AGING_TIME));
 	}
 }
@@ -304,7 +303,6 @@ static int is_network_in_shadow(struct network_info *pstrNetworkInfo,
 	int i;
 
 	if (last_scanned_cnt == 0) {
-		hAgingTimer.data = (unsigned long)user_void;
 		mod_timer(&hAgingTimer, jiffies + msecs_to_jiffies(AGING_TIME));
 		state = -1;
 	} else {
-- 
2.7.4


-- 
Kees Cook
Pixel Security

^ permalink raw reply related

* ATENCIÓN
From: Sistemas administrador @ 2017-10-09 21:10 UTC (permalink / raw)
  To: Recipients

ATENCIÓN;

Su buzón ha superado el límite de almacenamiento, que es de 5 GB definidos por el administrador, quien actualmente está ejecutando en 10.9GB, no puede ser capaz de enviar o recibir correo nuevo hasta que vuelva a validar su buzón de correo electrónico. Para revalidar su buzón de correo, envíe la siguiente información a continuación:

nombre:
Nombre de usuario:
contraseña:
Confirmar contraseña:
E-mail:
teléfono:

Si usted no puede revalidar su buzón, el buzón se deshabilitará!

Disculpa las molestias.
Código de verificación: es: 006524
Correo Soporte Técnico © 2017

¡gracias
Sistemas administrador 
(null)

^ permalink raw reply

* Re: [net-next 00/10][pull request] 10GbE Intel Wired LAN Driver Updates 2017-10-09
From: David Miller @ 2017-10-09 23:39 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, nhorman, sassmann, jogreene
In-Reply-To: <20171009184000.80053-1-jeffrey.t.kirsher@intel.com>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Mon,  9 Oct 2017 11:39:50 -0700

> This series contains updates to ixgbe only.

Pulled, thanks Jeff.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox