Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH net-next 1/3 v10] net: ether: Add support for multiplexing and aggregation type
From: Subash Abhinov Kasiviswanathan @ 2017-08-30  0:47 UTC (permalink / raw)
  To: netdev, davem, fengguang.wu, dcbw, jiri, stephen, David.Laight,
	marcel, andrew
  Cc: Subash Abhinov Kasiviswanathan
In-Reply-To: <1504054078-10173-1-git-send-email-subashab@codeaurora.org>

Define the Qualcomm multiplexing and aggregation (MAP) ether type 0x00F9.
This is needed for receiving data in the MAP protocol like RMNET. This is
not an officially registered ID.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
 include/uapi/linux/if_ether.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/uapi/linux/if_ether.h b/include/uapi/linux/if_ether.h
index 5bc9bfd..30526db 100644
--- a/include/uapi/linux/if_ether.h
+++ b/include/uapi/linux/if_ether.h
@@ -137,6 +137,9 @@
 #define ETH_P_IEEE802154 0x00F6		/* IEEE802.15.4 frame		*/
 #define ETH_P_CAIF	0x00F7		/* ST-Ericsson CAIF protocol	*/
 #define ETH_P_XDSA	0x00F8		/* Multiplexed DSA protocol	*/
+#define ETH_P_MAP	0x00F9		/* Qualcomm multiplexing and
+					 * aggregation protocol
+					 */
 
 /*
  *	This is an Ethernet frame header.
-- 
1.9.1

^ permalink raw reply related

* [PATCH net-next 2/3 v10] net: arp: Add support for raw IP device
From: Subash Abhinov Kasiviswanathan @ 2017-08-30  0:47 UTC (permalink / raw)
  To: netdev, davem, fengguang.wu, dcbw, jiri, stephen, David.Laight,
	marcel, andrew
  Cc: Subash Abhinov Kasiviswanathan
In-Reply-To: <1504054078-10173-1-git-send-email-subashab@codeaurora.org>

Define the raw IP type. This is needed for raw IP net devices
like rmnet.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
 include/uapi/linux/if_arp.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/if_arp.h b/include/uapi/linux/if_arp.h
index cf73510..a2a6356 100644
--- a/include/uapi/linux/if_arp.h
+++ b/include/uapi/linux/if_arp.h
@@ -59,6 +59,7 @@
 #define ARPHRD_LAPB	516		/* LAPB				*/
 #define ARPHRD_DDCMP    517		/* Digital's DDCMP protocol     */
 #define ARPHRD_RAWHDLC	518		/* Raw HDLC			*/
+#define ARPHRD_RAWIP    519		/* Raw IP                       */
 
 #define ARPHRD_TUNNEL	768		/* IPIP tunnel			*/
 #define ARPHRD_TUNNEL6	769		/* IP6IP6 tunnel       		*/
-- 
1.9.1

^ permalink raw reply related

* [PATCH net-next 3/3 v10] drivers: net: ethernet: qualcomm: rmnet: Initial implementation
From: Subash Abhinov Kasiviswanathan @ 2017-08-30  0:47 UTC (permalink / raw)
  To: netdev, davem, fengguang.wu, dcbw, jiri, stephen, David.Laight,
	marcel, andrew
  Cc: Subash Abhinov Kasiviswanathan
In-Reply-To: <1504054078-10173-1-git-send-email-subashab@codeaurora.org>

RmNet driver provides a transport agnostic MAP (multiplexing and
aggregation protocol) support in embedded module. Module provides
virtual network devices which can be attached to any IP-mode
physical device. This will be used to provide all MAP functionality
on future hardware in a single consistent location.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
 Documentation/networking/rmnet.txt                 |  82 ++++
 drivers/net/ethernet/qualcomm/Kconfig              |   2 +
 drivers/net/ethernet/qualcomm/Makefile             |   2 +
 drivers/net/ethernet/qualcomm/rmnet/Kconfig        |  12 +
 drivers/net/ethernet/qualcomm/rmnet/Makefile       |  10 +
 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 419 +++++++++++++++++++++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h |  56 +++
 .../net/ethernet/qualcomm/rmnet/rmnet_handlers.c   | 271 +++++++++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_handlers.h   |  26 ++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h    |  88 +++++
 .../ethernet/qualcomm/rmnet/rmnet_map_command.c    | 107 ++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_map_data.c   | 105 ++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_private.h    |  45 +++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c    | 170 +++++++++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h    |  29 ++
 15 files changed, 1424 insertions(+)
 create mode 100644 Documentation/networking/rmnet.txt
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Kconfig
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Makefile
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h

diff --git a/Documentation/networking/rmnet.txt b/Documentation/networking/rmnet.txt
new file mode 100644
index 0000000..6b341ea
--- /dev/null
+++ b/Documentation/networking/rmnet.txt
@@ -0,0 +1,82 @@
+1. Introduction
+
+rmnet driver is used for supporting the Multiplexing and aggregation
+Protocol (MAP). This protocol is used by all recent chipsets using Qualcomm
+Technologies, Inc. modems.
+
+This driver can be used to register onto any physical network device in
+IP mode. Physical transports include USB, HSIC, PCIe and IP accelerator.
+
+Multiplexing allows for creation of logical netdevices (rmnet devices) to
+handle multiple private data networks (PDN) like a default internet, tethering,
+multimedia messaging service (MMS) or IP media subsystem (IMS). Hardware sends
+packets with MAP headers to rmnet. Based on the multiplexer id, rmnet
+routes to the appropriate PDN after removing the MAP header.
+
+Aggregation is required to achieve high data rates. This involves hardware
+sending aggregated bunch of MAP frames. rmnet driver will de-aggregate
+these MAP frames and send them to appropriate PDN's.
+
+2. Packet format
+
+a. MAP packet (data / control)
+
+MAP header has the same endianness of the IP packet.
+
+Packet format -
+
+Bit             0             1           2-7      8 - 15           16 - 31
+Function   Command / Data   Reserved     Pad   Multiplexer ID    Payload length
+Bit            32 - x
+Function     Raw  Bytes
+
+Command (1)/ Data (0) bit value is to indicate if the packet is a MAP command
+or data packet. Control packet is used for transport level flow control. Data
+packets are standard IP packets.
+
+Reserved bits are usually zeroed out and to be ignored by receiver.
+
+Padding is number of bytes to be added for 4 byte alignment if required by
+hardware.
+
+Multiplexer ID is to indicate the PDN on which data has to be sent.
+
+Payload length includes the padding length but does not include MAP header
+length.
+
+b. MAP packet (command specific)
+
+Bit             0             1           2-7      8 - 15           16 - 31
+Function   Command         Reserved     Pad   Multiplexer ID    Payload length
+Bit          32 - 39        40 - 45    46 - 47       48 - 63
+Function   Command name    Reserved   Command Type   Reserved
+Bit          64 - 95
+Function   Transaction ID
+Bit          96 - 127
+Function   Command data
+
+Command 1 indicates disabling flow while 2 is enabling flow
+
+Command types -
+0 for MAP command request
+1 is to acknowledge the receipt of a command
+2 is for unsupported commands
+3 is for error during processing of commands
+
+c. Aggregation
+
+Aggregation is multiple MAP packets (can be data or command) delivered to
+rmnet in a single linear skb. rmnet will process the individual
+packets and either ACK the MAP command or deliver the IP packet to the
+network stack as needed
+
+MAP header|IP Packet|Optional padding|MAP header|IP Packet|Optional padding....
+MAP header|IP Packet|Optional padding|MAP header|Command Packet|Optional pad...
+
+3. Userspace configuration
+
+rmnet userspace configuration is done through netlink library librmnetctl
+and command line utility rmnetcli. Utility is hosted in codeaurora forum git.
+The driver uses rtnl_link_ops for communication.
+
+https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/dataservices/tree/rmnetctl
diff --git a/drivers/net/ethernet/qualcomm/Kconfig b/drivers/net/ethernet/qualcomm/Kconfig
index 877675a..f520071 100644
--- a/drivers/net/ethernet/qualcomm/Kconfig
+++ b/drivers/net/ethernet/qualcomm/Kconfig
@@ -59,4 +59,6 @@ config QCOM_EMAC
 	  low power, Receive-Side Scaling (RSS), and IEEE 1588-2008
 	  Precision Clock Synchronization Protocol.
 
+source "drivers/net/ethernet/qualcomm/rmnet/Kconfig"
+
 endif # NET_VENDOR_QUALCOMM
diff --git a/drivers/net/ethernet/qualcomm/Makefile b/drivers/net/ethernet/qualcomm/Makefile
index 92fa7c4..1847350 100644
--- a/drivers/net/ethernet/qualcomm/Makefile
+++ b/drivers/net/ethernet/qualcomm/Makefile
@@ -9,3 +9,5 @@ obj-$(CONFIG_QCA7000_UART) += qcauart.o
 qcauart-objs := qca_uart.o
 
 obj-y += emac/
+
+obj-$(CONFIG_RMNET) += rmnet/
diff --git a/drivers/net/ethernet/qualcomm/rmnet/Kconfig b/drivers/net/ethernet/qualcomm/rmnet/Kconfig
new file mode 100644
index 0000000..4948f14
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/Kconfig
@@ -0,0 +1,12 @@
+#
+# RMNET MAP driver
+#
+
+menuconfig RMNET
+	depends on NETDEVICES
+	bool "RmNet MAP driver"
+	default n
+	---help---
+	  If you say Y here, then the rmnet module will be statically
+	  compiled into the kernel. The rmnet module provides MAP
+	  functionality for embedded and bridged traffic.
diff --git a/drivers/net/ethernet/qualcomm/rmnet/Makefile b/drivers/net/ethernet/qualcomm/rmnet/Makefile
new file mode 100644
index 0000000..01bddf2
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/Makefile
@@ -0,0 +1,10 @@
+#
+# Makefile for the RMNET module
+#
+
+rmnet-y		 := rmnet_config.o
+rmnet-y		 += rmnet_vnd.o
+rmnet-y		 += rmnet_handlers.o
+rmnet-y		 += rmnet_map_data.o
+rmnet-y		 += rmnet_map_command.o
+obj-$(CONFIG_RMNET) += rmnet.o
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
new file mode 100644
index 0000000..e836d26
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
@@ -0,0 +1,419 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET configuration engine
+ *
+ */
+
+#include <net/sock.h>
+#include <linux/module.h>
+#include <linux/netlink.h>
+#include <linux/netdevice.h>
+#include "rmnet_config.h"
+#include "rmnet_handlers.h"
+#include "rmnet_vnd.h"
+#include "rmnet_private.h"
+
+/* Locking scheme -
+ * The shared resource which needs to be protected is realdev->rx_handler_data.
+ * For the writer path, this is using rtnl_lock(). The writer paths are
+ * rmnet_newlink(), rmnet_dellink() and rmnet_force_unassociate_device(). These
+ * paths are already called with rtnl_lock() acquired in. There is also an
+ * ASSERT_RTNL() to ensure that we are calling with rtnl acquired. For
+ * dereference here, we will need to use rtnl_dereference(). Dev list writing
+ * needs to happen with rtnl_lock() acquired for netdev_master_upper_dev_link().
+ * For the reader path, the real_dev->rx_handler_data is called in the TX / RX
+ * path. We only need rcu_read_lock() for these scenarios. In these cases,
+ * the rcu_read_lock() is held in __dev_queue_xmit() and
+ * netif_receive_skb_internal(), so readers need to use rcu_dereference_rtnl()
+ * to get the relevant information. For dev list reading, we again acquire
+ * rcu_read_lock() in rmnet_dellink() for netdev_master_upper_dev_get_rcu().
+ * We also use unregister_netdevice_many() to free all rmnet devices in
+ * rmnet_force_unassociate_device() so we dont lose the rtnl_lock() and free in
+ * same context.
+ */
+
+/* Local Definitions and Declarations */
+#define RMNET_LOCAL_LOGICAL_ENDPOINT -1
+
+struct rmnet_walk_data {
+	struct net_device *real_dev;
+	struct list_head *head;
+	struct rmnet_real_dev_info *real_dev_info;
+};
+
+static int rmnet_is_real_dev_registered(const struct net_device *real_dev)
+{
+	rx_handler_func_t *rx_handler;
+
+	rx_handler = rcu_dereference(real_dev->rx_handler);
+	return (rx_handler == rmnet_rx_handler);
+}
+
+/* Needs either rcu_read_lock() or rtnl lock */
+static struct rmnet_real_dev_info*
+__rmnet_get_real_dev_info(const struct net_device *real_dev)
+{
+	if (rmnet_is_real_dev_registered(real_dev))
+		return rcu_dereference_rtnl(real_dev->rx_handler_data);
+	else
+		return NULL;
+}
+
+/* Needs rtnl lock */
+static struct rmnet_real_dev_info*
+rmnet_get_real_dev_info_rtnl(const struct net_device *real_dev)
+{
+	return rtnl_dereference(real_dev->rx_handler_data);
+}
+
+static struct rmnet_endpoint*
+rmnet_get_endpoint(struct net_device *dev, int config_id)
+{
+	struct rmnet_real_dev_info *r;
+	struct rmnet_endpoint *ep;
+
+	if (!rmnet_is_real_dev_registered(dev)) {
+		ep = rmnet_vnd_get_endpoint(dev);
+	} else {
+		r = __rmnet_get_real_dev_info(dev);
+
+		if (!r)
+			return NULL;
+
+		if (config_id == RMNET_LOCAL_LOGICAL_ENDPOINT)
+			ep = &r->local_ep;
+		else
+			ep = &r->muxed_ep[config_id];
+	}
+
+	return ep;
+}
+
+static int rmnet_unregister_real_device(struct net_device *real_dev,
+					struct rmnet_real_dev_info *r)
+{
+	if (r->nr_rmnet_devs)
+		return -EINVAL;
+
+	kfree(r);
+
+	netdev_rx_handler_unregister(real_dev);
+
+	/* release reference on real_dev */
+	dev_put(real_dev);
+
+	netdev_dbg(real_dev, "Removed from rmnet\n");
+	return 0;
+}
+
+static int rmnet_register_real_device(struct net_device *real_dev)
+{
+	struct rmnet_real_dev_info *r;
+	int rc;
+
+	ASSERT_RTNL();
+
+	if (rmnet_is_real_dev_registered(real_dev))
+		return 0;
+
+	r = kzalloc(sizeof(*r), GFP_ATOMIC);
+	if (!r)
+		return -ENOMEM;
+
+	r->dev = real_dev;
+	rc = netdev_rx_handler_register(real_dev, rmnet_rx_handler, r);
+	if (rc) {
+		kfree(r);
+		return -EBUSY;
+	}
+
+	/* hold on to real dev for MAP data */
+	dev_hold(real_dev);
+
+	netdev_dbg(real_dev, "registered with rmnet\n");
+	return 0;
+}
+
+static int rmnet_set_ingress_data_format(struct net_device *dev, u32 idf)
+{
+	struct rmnet_real_dev_info *r;
+
+	netdev_dbg(dev, "Ingress format 0x%08X\n", idf);
+
+	r = __rmnet_get_real_dev_info(dev);
+
+	r->ingress_data_format = idf;
+
+	return 0;
+}
+
+static int rmnet_set_egress_data_format(struct net_device *dev, u32 edf,
+					u16 agg_size, u16 agg_count)
+{
+	struct rmnet_real_dev_info *r;
+
+	netdev_dbg(dev, "Egress format 0x%08X agg size %d cnt %d\n",
+		   edf, agg_size, agg_count);
+
+	r = __rmnet_get_real_dev_info(dev);
+
+	r->egress_data_format = edf;
+
+	return 0;
+}
+
+static int __rmnet_set_endpoint_config(struct net_device *dev, int config_id,
+				       struct rmnet_endpoint *ep)
+{
+	struct rmnet_endpoint *dev_ep;
+
+	dev_ep = rmnet_get_endpoint(dev, config_id);
+
+	if (!dev_ep)
+		return -EINVAL;
+
+	memcpy(dev_ep, ep, sizeof(struct rmnet_endpoint));
+	if (config_id == RMNET_LOCAL_LOGICAL_ENDPOINT)
+		dev_ep->mux_id = 0;
+	else
+		dev_ep->mux_id = config_id;
+
+	return 0;
+}
+
+static int rmnet_set_endpoint_config(struct net_device *dev,
+				     int config_id, u8 rmnet_mode,
+				     struct net_device *egress_dev)
+{
+	struct rmnet_endpoint ep;
+
+	netdev_dbg(dev, "id %d mode %d dev %s\n",
+		   config_id, rmnet_mode, egress_dev->name);
+
+	if (config_id < RMNET_LOCAL_LOGICAL_ENDPOINT ||
+	    config_id >= RMNET_MAX_LOGICAL_EP)
+		return -EINVAL;
+
+	/* This config is cleared on every set, so its ok to not
+	 * clear it on a device delete.
+	 */
+	memset(&ep, 0, sizeof(struct rmnet_endpoint));
+	ep.rmnet_mode = rmnet_mode;
+	ep.egress_dev = egress_dev;
+
+	return __rmnet_set_endpoint_config(dev, config_id, &ep);
+}
+
+static int rmnet_newlink(struct net *src_net, struct net_device *dev,
+			 struct nlattr *tb[], struct nlattr *data[],
+			 struct netlink_ext_ack *extack)
+{
+	int ingress_format = RMNET_INGRESS_FORMAT_DEMUXING |
+			     RMNET_INGRESS_FORMAT_DEAGGREGATION |
+			     RMNET_INGRESS_FORMAT_MAP;
+	int egress_format = RMNET_EGRESS_FORMAT_MUXING |
+			    RMNET_EGRESS_FORMAT_MAP;
+	struct rmnet_real_dev_info *r;
+	struct net_device *real_dev;
+	int mode = RMNET_EPMODE_VND;
+	int err = 0;
+	u16 mux_id;
+
+	real_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK]));
+	if (!real_dev || !dev)
+		return -ENODEV;
+
+	if (!data[IFLA_VLAN_ID])
+		return -EINVAL;
+
+	mux_id = nla_get_u16(data[IFLA_VLAN_ID]);
+
+	err = rmnet_register_real_device(real_dev);
+	if (err)
+		goto err0;
+
+	r = rmnet_get_real_dev_info_rtnl(real_dev);
+	err = rmnet_vnd_newlink(mux_id, dev, r);
+	if (err)
+		goto err1;
+
+	err = netdev_master_upper_dev_link(dev, real_dev, NULL, NULL);
+	if (err)
+		goto err2;
+
+	rmnet_vnd_set_mux(dev, mux_id);
+	rmnet_set_egress_data_format(real_dev, egress_format, 0, 0);
+	rmnet_set_ingress_data_format(real_dev, ingress_format);
+	rmnet_set_endpoint_config(real_dev, mux_id, mode, dev);
+	rmnet_set_endpoint_config(dev, mux_id, mode, real_dev);
+	return 0;
+
+err2:
+	rmnet_vnd_dellink(mux_id, r);
+err1:
+	rmnet_unregister_real_device(real_dev, r);
+err0:
+	return err;
+}
+
+static void rmnet_dellink(struct net_device *dev, struct list_head *head)
+{
+	struct rmnet_real_dev_info *r;
+	struct net_device *real_dev;
+	u8 mux_id;
+
+	rcu_read_lock();
+	real_dev = netdev_master_upper_dev_get_rcu(dev);
+	rcu_read_unlock();
+
+	if (!real_dev || !rmnet_is_real_dev_registered(real_dev))
+		return;
+
+	r = rmnet_get_real_dev_info_rtnl(real_dev);
+
+	mux_id = rmnet_vnd_get_mux(dev);
+	rmnet_vnd_dellink(mux_id, r);
+	netdev_upper_dev_unlink(dev, real_dev);
+	rmnet_unregister_real_device(real_dev, r);
+
+	unregister_netdevice_queue(dev, head);
+}
+
+static int rmnet_dev_walk_unreg(struct net_device *rmnet_dev, void *data)
+{
+	struct rmnet_walk_data *d = data;
+	u8 mux_id;
+
+	mux_id = rmnet_vnd_get_mux(rmnet_dev);
+
+	rmnet_vnd_dellink(mux_id, d->real_dev_info);
+	netdev_upper_dev_unlink(rmnet_dev, d->real_dev);
+	unregister_netdevice_queue(rmnet_dev, d->head);
+
+	return 0;
+}
+
+static void rmnet_force_unassociate_device(struct net_device *dev)
+{
+	struct net_device *real_dev = dev;
+	struct rmnet_real_dev_info *r;
+	struct rmnet_walk_data d;
+	LIST_HEAD(list);
+
+	if (!rmnet_is_real_dev_registered(real_dev))
+		return;
+
+	ASSERT_RTNL();
+
+	d.real_dev = real_dev;
+	d.head = &list;
+
+	r = rmnet_get_real_dev_info_rtnl(dev);
+	d.real_dev_info = r;
+
+	rcu_read_lock();
+	netdev_walk_all_lower_dev_rcu(real_dev, rmnet_dev_walk_unreg, &d);
+	rcu_read_unlock();
+	unregister_netdevice_many(&list);
+
+	rmnet_unregister_real_device(real_dev, r);
+}
+
+static int rmnet_config_notify_cb(struct notifier_block *nb,
+				  unsigned long event, void *data)
+{
+	struct net_device *dev = netdev_notifier_info_to_dev(data);
+
+	if (!dev)
+		return NOTIFY_DONE;
+
+	switch (event) {
+	case NETDEV_UNREGISTER:
+		netdev_dbg(dev, "Kernel unregister\n");
+		rmnet_force_unassociate_device(dev);
+		break;
+
+	default:
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block rmnet_dev_notifier __read_mostly = {
+	.notifier_call = rmnet_config_notify_cb,
+};
+
+static int rmnet_rtnl_validate(struct nlattr *tb[], struct nlattr *data[],
+			       struct netlink_ext_ack *extack)
+{
+	u16 mux_id;
+
+	if (!data || !data[IFLA_VLAN_ID])
+		return -EINVAL;
+
+	mux_id = nla_get_u16(data[IFLA_VLAN_ID]);
+	if (mux_id > (RMNET_MAX_LOGICAL_EP - 1))
+		return -ERANGE;
+
+	return 0;
+}
+
+static size_t rmnet_get_size(const struct net_device *dev)
+{
+	return nla_total_size(2); /* IFLA_VLAN_ID */
+}
+
+struct rtnl_link_ops rmnet_link_ops __read_mostly = {
+	.kind		= "rmnet",
+	.maxtype	= __IFLA_VLAN_MAX,
+	.priv_size	= sizeof(struct rmnet_priv),
+	.setup		= rmnet_vnd_setup,
+	.validate	= rmnet_rtnl_validate,
+	.newlink	= rmnet_newlink,
+	.dellink	= rmnet_dellink,
+	.get_size	= rmnet_get_size,
+};
+
+struct rmnet_real_dev_info*
+rmnet_get_real_dev_info(struct net_device *real_dev)
+{
+	return __rmnet_get_real_dev_info(real_dev);
+}
+
+/* Startup/Shutdown */
+
+static int __init rmnet_init(void)
+{
+	int rc;
+
+	rc = register_netdevice_notifier(&rmnet_dev_notifier);
+	if (rc != 0)
+		return rc;
+
+	rc = rtnl_link_register(&rmnet_link_ops);
+	if (rc != 0) {
+		unregister_netdevice_notifier(&rmnet_dev_notifier);
+		return rc;
+	}
+	return rc;
+}
+
+static void __exit rmnet_exit(void)
+{
+	unregister_netdevice_notifier(&rmnet_dev_notifier);
+	rtnl_link_unregister(&rmnet_link_ops);
+}
+
+module_init(rmnet_init)
+module_exit(rmnet_exit)
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
new file mode 100644
index 0000000..985d372
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
@@ -0,0 +1,56 @@
+/* Copyright (c) 2013-2014, 2016-2017 The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data configuration engine
+ *
+ */
+
+#include <linux/skbuff.h>
+
+#ifndef _RMNET_CONFIG_H_
+#define _RMNET_CONFIG_H_
+
+#define RMNET_MAX_LOGICAL_EP 255
+#define RMNET_MAX_VND        32
+
+/* Information about the next device to deliver the packet to.
+ * Exact usage of this parameter depends on the rmnet_mode.
+ */
+struct rmnet_endpoint {
+	u8 rmnet_mode;
+	u8 mux_id;
+	struct net_device *egress_dev;
+};
+
+/* One instance of this structure is instantiated for each real_dev associated
+ * with rmnet.
+ */
+struct rmnet_real_dev_info {
+	struct net_device *dev;
+	struct rmnet_endpoint local_ep;
+	struct rmnet_endpoint muxed_ep[RMNET_MAX_LOGICAL_EP];
+	u32 ingress_data_format;
+	u32 egress_data_format;
+	struct net_device *rmnet_devices[RMNET_MAX_VND];
+	u8 nr_rmnet_devs;
+};
+
+extern struct rtnl_link_ops rmnet_link_ops;
+
+struct rmnet_priv {
+	struct rmnet_endpoint local_ep;
+	u8 mux_id;
+};
+
+struct rmnet_real_dev_info*
+rmnet_get_real_dev_info(struct net_device *real_dev);
+
+#endif /* _RMNET_CONFIG_H_ */
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
new file mode 100644
index 0000000..7dab3bb
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
@@ -0,0 +1,271 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data ingress/egress handler
+ *
+ */
+
+#include <linux/netdevice.h>
+#include <linux/netdev_features.h>
+#include "rmnet_private.h"
+#include "rmnet_config.h"
+#include "rmnet_vnd.h"
+#include "rmnet_map.h"
+#include "rmnet_handlers.h"
+
+#define RMNET_IP_VERSION_4 0x40
+#define RMNET_IP_VERSION_6 0x60
+
+/* Helper Functions */
+
+static void rmnet_set_skb_proto(struct sk_buff *skb)
+{
+	switch (skb->data[0] & 0xF0) {
+	case RMNET_IP_VERSION_4:
+		skb->protocol = htons(ETH_P_IP);
+		break;
+	case RMNET_IP_VERSION_6:
+		skb->protocol = htons(ETH_P_IPV6);
+		break;
+	default:
+		skb->protocol = htons(ETH_P_MAP);
+		break;
+	}
+}
+
+/* Generic handler */
+
+static rx_handler_result_t
+rmnet_bridge_handler(struct sk_buff *skb, struct rmnet_endpoint *ep)
+{
+	if (!ep->egress_dev)
+		kfree_skb(skb);
+	else
+		rmnet_egress_handler(skb, ep);
+
+	return RX_HANDLER_CONSUMED;
+}
+
+static rx_handler_result_t
+rmnet_deliver_skb(struct sk_buff *skb, struct rmnet_endpoint *ep)
+{
+	switch (ep->rmnet_mode) {
+	case RMNET_EPMODE_NONE:
+		return RX_HANDLER_PASS;
+
+	case RMNET_EPMODE_BRIDGE:
+		return rmnet_bridge_handler(skb, ep);
+
+	case RMNET_EPMODE_VND:
+		skb_reset_transport_header(skb);
+		skb_reset_network_header(skb);
+		rmnet_vnd_rx_fixup(skb, skb->dev);
+
+		skb->pkt_type = PACKET_HOST;
+		skb_set_mac_header(skb, 0);
+		netif_receive_skb(skb);
+		return RX_HANDLER_CONSUMED;
+
+	default:
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+}
+
+static rx_handler_result_t
+rmnet_ingress_deliver_packet(struct sk_buff *skb,
+			     struct rmnet_real_dev_info *r)
+{
+	if (!r) {
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+
+	skb->dev = r->local_ep.egress_dev;
+
+	return rmnet_deliver_skb(skb, &r->local_ep);
+}
+
+/* MAP handler */
+
+static rx_handler_result_t
+__rmnet_map_ingress_handler(struct sk_buff *skb,
+			    struct rmnet_real_dev_info *r)
+{
+	struct rmnet_endpoint *ep;
+	u8 mux_id;
+	u16 len;
+
+	if (RMNET_MAP_GET_CD_BIT(skb)) {
+		if (r->ingress_data_format
+		    & RMNET_INGRESS_FORMAT_MAP_COMMANDS)
+			return rmnet_map_command(skb, r);
+
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+
+	mux_id = RMNET_MAP_GET_MUX_ID(skb);
+	len = RMNET_MAP_GET_LENGTH(skb) - RMNET_MAP_GET_PAD(skb);
+
+	if (mux_id >= RMNET_MAX_LOGICAL_EP) {
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+
+	ep = &r->muxed_ep[mux_id];
+
+	if (r->ingress_data_format & RMNET_INGRESS_FORMAT_DEMUXING)
+		skb->dev = ep->egress_dev;
+
+	/* Subtract MAP header */
+	skb_pull(skb, sizeof(struct rmnet_map_header));
+	skb_trim(skb, len);
+	rmnet_set_skb_proto(skb);
+	return rmnet_deliver_skb(skb, ep);
+}
+
+static rx_handler_result_t
+rmnet_map_ingress_handler(struct sk_buff *skb,
+			  struct rmnet_real_dev_info *r)
+{
+	struct sk_buff *skbn;
+	int rc;
+
+	if (r->ingress_data_format & RMNET_INGRESS_FORMAT_DEAGGREGATION) {
+		while ((skbn = rmnet_map_deaggregate(skb, r)) != NULL)
+			__rmnet_map_ingress_handler(skbn, r);
+
+		consume_skb(skb);
+		rc = RX_HANDLER_CONSUMED;
+	} else {
+		rc = __rmnet_map_ingress_handler(skb, r);
+	}
+
+	return rc;
+}
+
+static int rmnet_map_egress_handler(struct sk_buff *skb,
+				    struct rmnet_real_dev_info *r,
+				    struct rmnet_endpoint *ep,
+				    struct net_device *orig_dev)
+{
+	int required_headroom, additional_header_len;
+	struct rmnet_map_header *map_header;
+
+	additional_header_len = 0;
+	required_headroom = sizeof(struct rmnet_map_header);
+
+	if (skb_headroom(skb) < required_headroom) {
+		if (pskb_expand_head(skb, required_headroom, 0, GFP_KERNEL))
+			return RMNET_MAP_CONSUMED;
+	}
+
+	map_header = rmnet_map_add_map_header(skb, additional_header_len, 0);
+	if (!map_header)
+		return RMNET_MAP_CONSUMED;
+
+	if (r->egress_data_format & RMNET_EGRESS_FORMAT_MUXING) {
+		if (ep->mux_id == 0xff)
+			map_header->mux_id = 0;
+		else
+			map_header->mux_id = ep->mux_id;
+	}
+
+	skb->protocol = htons(ETH_P_MAP);
+
+	return RMNET_MAP_SUCCESS;
+}
+
+/* Ingress / Egress Entry Points */
+
+/* Processes packet as per ingress data format for receiving device. Logical
+ * endpoint is determined from packet inspection. Packet is then sent to the
+ * egress device listed in the logical endpoint configuration.
+ */
+rx_handler_result_t rmnet_rx_handler(struct sk_buff **pskb)
+{
+	struct rmnet_real_dev_info *r;
+	struct sk_buff *skb = *pskb;
+	struct net_device *dev;
+	int rc;
+
+	if (!skb)
+		return RX_HANDLER_CONSUMED;
+
+	dev = skb->dev;
+	r = rmnet_get_real_dev_info(dev);
+
+	if (r->ingress_data_format & RMNET_INGRESS_FORMAT_MAP) {
+		rc = rmnet_map_ingress_handler(skb, r);
+	} else {
+		switch (ntohs(skb->protocol)) {
+		case ETH_P_MAP:
+			if (r->local_ep.rmnet_mode ==
+				RMNET_EPMODE_BRIDGE) {
+				rc = rmnet_ingress_deliver_packet(skb, r);
+			} else {
+				kfree_skb(skb);
+				rc = RX_HANDLER_CONSUMED;
+			}
+			break;
+
+		case ETH_P_IP:
+		case ETH_P_IPV6:
+			rc = rmnet_ingress_deliver_packet(skb, r);
+			break;
+
+		default:
+			rc = RX_HANDLER_PASS;
+		}
+	}
+
+	return rc;
+}
+
+/* Modifies packet as per logical endpoint configuration and egress data format
+ * for egress device configured in logical endpoint. Packet is then transmitted
+ * on the egress device.
+ */
+void rmnet_egress_handler(struct sk_buff *skb,
+			  struct rmnet_endpoint *ep)
+{
+	struct rmnet_real_dev_info *r;
+	struct net_device *orig_dev;
+
+	orig_dev = skb->dev;
+	skb->dev = ep->egress_dev;
+
+	r = rmnet_get_real_dev_info(skb->dev);
+	if (!r) {
+		kfree_skb(skb);
+		return;
+	}
+
+	if (r->egress_data_format & RMNET_EGRESS_FORMAT_MAP) {
+		switch (rmnet_map_egress_handler(skb, r, ep, orig_dev)) {
+		case RMNET_MAP_CONSUMED:
+			return;
+
+		case RMNET_MAP_SUCCESS:
+			break;
+
+		default:
+			kfree_skb(skb);
+			return;
+		}
+	}
+
+	if (ep->rmnet_mode == RMNET_EPMODE_VND)
+		rmnet_vnd_tx_fixup(skb, orig_dev);
+
+	dev_queue_xmit(skb);
+}
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
new file mode 100644
index 0000000..f2638cf
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
@@ -0,0 +1,26 @@
+/* Copyright (c) 2013, 2016-2017 The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data ingress/egress handler
+ *
+ */
+
+#ifndef _RMNET_HANDLERS_H_
+#define _RMNET_HANDLERS_H_
+
+#include "rmnet_config.h"
+
+void rmnet_egress_handler(struct sk_buff *skb,
+			  struct rmnet_endpoint *ep);
+
+rx_handler_result_t rmnet_rx_handler(struct sk_buff **pskb);
+
+#endif /* _RMNET_HANDLERS_H_ */
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
new file mode 100644
index 0000000..2aabad2
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
@@ -0,0 +1,88 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _RMNET_MAP_H_
+#define _RMNET_MAP_H_
+
+struct rmnet_map_control_command {
+	u8  command_name;
+	u8  cmd_type:2;
+	u8  reserved:6;
+	u16 reserved2;
+	u32 transaction_id;
+	union {
+		struct {
+			u16 ip_family:2;
+			u16 reserved:14;
+			u16 flow_control_seq_num;
+			u32 qos_id;
+		} flow_control;
+		u8 data[0];
+	};
+}  __aligned(1);
+
+enum rmnet_map_results {
+	RMNET_MAP_SUCCESS,
+	RMNET_MAP_CONSUMED,
+	RMNET_MAP_GENERAL_FAILURE,
+	RMNET_MAP_NOT_ENABLED,
+	RMNET_MAP_FAILED_AGGREGATION,
+	RMNET_MAP_FAILED_MUX
+};
+
+enum rmnet_map_commands {
+	RMNET_MAP_COMMAND_NONE,
+	RMNET_MAP_COMMAND_FLOW_DISABLE,
+	RMNET_MAP_COMMAND_FLOW_ENABLE,
+	/* These should always be the last 2 elements */
+	RMNET_MAP_COMMAND_UNKNOWN,
+	RMNET_MAP_COMMAND_ENUM_LENGTH
+};
+
+struct rmnet_map_header {
+	u8  pad_len:6;
+	u8  reserved_bit:1;
+	u8  cd_bit:1;
+	u8  mux_id;
+	u16 pkt_len;
+}  __aligned(1);
+
+#define RMNET_MAP_GET_MUX_ID(Y) (((struct rmnet_map_header *) \
+				 (Y)->data)->mux_id)
+#define RMNET_MAP_GET_CD_BIT(Y) (((struct rmnet_map_header *) \
+				(Y)->data)->cd_bit)
+#define RMNET_MAP_GET_PAD(Y) (((struct rmnet_map_header *) \
+				(Y)->data)->pad_len)
+#define RMNET_MAP_GET_CMD_START(Y) ((struct rmnet_map_control_command *) \
+				    ((Y)->data + \
+				      sizeof(struct rmnet_map_header)))
+#define RMNET_MAP_GET_LENGTH(Y) (ntohs(((struct rmnet_map_header *) \
+					(Y)->data)->pkt_len))
+
+#define RMNET_MAP_COMMAND_REQUEST     0
+#define RMNET_MAP_COMMAND_ACK         1
+#define RMNET_MAP_COMMAND_UNSUPPORTED 2
+#define RMNET_MAP_COMMAND_INVALID     3
+
+#define RMNET_MAP_NO_PAD_BYTES        0
+#define RMNET_MAP_ADD_PAD_BYTES       1
+
+u8 rmnet_map_demultiplex(struct sk_buff *skb);
+struct sk_buff *rmnet_map_deaggregate(struct sk_buff *skb,
+				      struct rmnet_real_dev_info *rdinfo);
+
+struct rmnet_map_header *rmnet_map_add_map_header(struct sk_buff *skb,
+						  int hdrlen, int pad);
+rx_handler_result_t rmnet_map_command(struct sk_buff *skb,
+				      struct rmnet_real_dev_info *rdinfo);
+
+#endif /* _RMNET_MAP_H_ */
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
new file mode 100644
index 0000000..ccded40
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
@@ -0,0 +1,107 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/netdevice.h>
+#include "rmnet_config.h"
+#include "rmnet_map.h"
+#include "rmnet_private.h"
+#include "rmnet_vnd.h"
+
+static u8 rmnet_map_do_flow_control(struct sk_buff *skb,
+				    struct rmnet_real_dev_info *rdinfo,
+				    int enable)
+{
+	struct rmnet_map_control_command *cmd;
+	struct rmnet_endpoint *ep;
+	struct net_device *vnd;
+	u16 ip_family;
+	u16 fc_seq;
+	u32 qos_id;
+	u8 mux_id;
+	int r;
+
+	mux_id = RMNET_MAP_GET_MUX_ID(skb);
+	cmd = RMNET_MAP_GET_CMD_START(skb);
+
+	if (mux_id >= RMNET_MAX_LOGICAL_EP) {
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+
+	ep = &rdinfo->muxed_ep[mux_id];
+	vnd = ep->egress_dev;
+
+	ip_family = cmd->flow_control.ip_family;
+	fc_seq = ntohs(cmd->flow_control.flow_control_seq_num);
+	qos_id = ntohl(cmd->flow_control.qos_id);
+
+	/* Ignore the ip family and pass the sequence number for both v4 and v6
+	 * sequence. User space does not support creating dedicated flows for
+	 * the 2 protocols
+	 */
+	r = rmnet_vnd_do_flow_control(vnd, enable);
+	if (r) {
+		kfree_skb(skb);
+		return RMNET_MAP_COMMAND_UNSUPPORTED;
+	} else {
+		return RMNET_MAP_COMMAND_ACK;
+	}
+}
+
+static void rmnet_map_send_ack(struct sk_buff *skb,
+			       unsigned char type,
+			       struct rmnet_real_dev_info *rdinfo)
+{
+	struct rmnet_map_control_command *cmd;
+	int xmit_status;
+
+	skb->protocol = htons(ETH_P_MAP);
+
+	cmd = RMNET_MAP_GET_CMD_START(skb);
+	cmd->cmd_type = type & 0x03;
+
+	netif_tx_lock(skb->dev);
+	xmit_status = skb->dev->netdev_ops->ndo_start_xmit(skb, skb->dev);
+	netif_tx_unlock(skb->dev);
+}
+
+/* Process MAP command frame and send N/ACK message as appropriate. Message cmd
+ * name is decoded here and appropriate handler is called.
+ */
+rx_handler_result_t rmnet_map_command(struct sk_buff *skb,
+				      struct rmnet_real_dev_info *rdinfo)
+{
+	struct rmnet_map_control_command *cmd;
+	unsigned char command_name;
+	unsigned char rc = 0;
+
+	cmd = RMNET_MAP_GET_CMD_START(skb);
+	command_name = cmd->command_name;
+
+	switch (command_name) {
+	case RMNET_MAP_COMMAND_FLOW_ENABLE:
+		rc = rmnet_map_do_flow_control(skb, rdinfo, 1);
+		break;
+
+	case RMNET_MAP_COMMAND_FLOW_DISABLE:
+		rc = rmnet_map_do_flow_control(skb, rdinfo, 0);
+		break;
+
+	default:
+		rc = RMNET_MAP_COMMAND_UNSUPPORTED;
+		kfree_skb(skb);
+		break;
+	}
+	if (rc == RMNET_MAP_COMMAND_ACK)
+		rmnet_map_send_ack(skb, rc, rdinfo);
+	return RX_HANDLER_CONSUMED;
+}
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
new file mode 100644
index 0000000..a29c476
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
@@ -0,0 +1,105 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data MAP protocol
+ *
+ */
+
+#include <linux/netdevice.h>
+#include "rmnet_config.h"
+#include "rmnet_map.h"
+#include "rmnet_private.h"
+
+#define RMNET_MAP_DEAGGR_SPACING  64
+#define RMNET_MAP_DEAGGR_HEADROOM (RMNET_MAP_DEAGGR_SPACING / 2)
+
+/* Adds MAP header to front of skb->data
+ * Padding is calculated and set appropriately in MAP header. Mux ID is
+ * initialized to 0.
+ */
+struct rmnet_map_header *rmnet_map_add_map_header(struct sk_buff *skb,
+						  int hdrlen, int pad)
+{
+	struct rmnet_map_header *map_header;
+	u32 padding, map_datalen;
+	u8 *padbytes;
+
+	if (skb_headroom(skb) < sizeof(struct rmnet_map_header))
+		return NULL;
+
+	map_datalen = skb->len - hdrlen;
+	map_header = (struct rmnet_map_header *)
+			skb_push(skb, sizeof(struct rmnet_map_header));
+	memset(map_header, 0, sizeof(struct rmnet_map_header));
+
+	if (pad == RMNET_MAP_NO_PAD_BYTES) {
+		map_header->pkt_len = htons(map_datalen);
+		return map_header;
+	}
+
+	padding = ALIGN(map_datalen, 4) - map_datalen;
+
+	if (padding == 0)
+		goto done;
+
+	if (skb_tailroom(skb) < padding)
+		return NULL;
+
+	padbytes = (u8 *)skb_put(skb, padding);
+	memset(padbytes, 0, padding);
+
+done:
+	map_header->pkt_len = htons(map_datalen + padding);
+	map_header->pad_len = padding & 0x3F;
+
+	return map_header;
+}
+
+/* Deaggregates a single packet
+ * A whole new buffer is allocated for each portion of an aggregated frame.
+ * Caller should keep calling deaggregate() on the source skb until 0 is
+ * returned, indicating that there are no more packets to deaggregate. Caller
+ * is responsible for freeing the original skb.
+ */
+struct sk_buff *rmnet_map_deaggregate(struct sk_buff *skb,
+				      struct rmnet_real_dev_info *rdinfo)
+{
+	struct rmnet_map_header *maph;
+	struct sk_buff *skbn;
+	u32 packet_len;
+
+	if (skb->len == 0)
+		return NULL;
+
+	maph = (struct rmnet_map_header *)skb->data;
+	packet_len = ntohs(maph->pkt_len) + sizeof(struct rmnet_map_header);
+
+	if (((int)skb->len - (int)packet_len) < 0)
+		return NULL;
+
+	skbn = alloc_skb(packet_len + RMNET_MAP_DEAGGR_SPACING, GFP_ATOMIC);
+	if (!skbn)
+		return NULL;
+
+	skbn->dev = skb->dev;
+	skb_reserve(skbn, RMNET_MAP_DEAGGR_HEADROOM);
+	skb_put(skbn, packet_len);
+	memcpy(skbn->data, skb->data, packet_len);
+	skb_pull(skb, packet_len);
+
+	/* Some hardware can send us empty frames. Catch them */
+	if (ntohs(maph->pkt_len) == 0) {
+		kfree_skb(skb);
+		return NULL;
+	}
+
+	return skbn;
+}
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
new file mode 100644
index 0000000..ed820b5
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
@@ -0,0 +1,45 @@
+/* Copyright (c) 2013-2014, 2016-2017 The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _RMNET_PRIVATE_H_
+#define _RMNET_PRIVATE_H_
+
+#define RMNET_MAX_VND              32
+#define RMNET_MAX_PACKET_SIZE      16384
+#define RMNET_DFLT_PACKET_SIZE     1500
+#define RMNET_NEEDED_HEADROOM      16
+#define RMNET_TX_QUEUE_LEN         1000
+
+/* Constants */
+#define RMNET_EGRESS_FORMAT__RESERVED__         BIT(0)
+#define RMNET_EGRESS_FORMAT_MAP                 BIT(1)
+#define RMNET_EGRESS_FORMAT_AGGREGATION         BIT(2)
+#define RMNET_EGRESS_FORMAT_MUXING              BIT(3)
+#define RMNET_EGRESS_FORMAT_MAP_CKSUMV3         BIT(4)
+#define RMNET_EGRESS_FORMAT_MAP_CKSUMV4         BIT(5)
+
+#define RMNET_INGRESS_FIX_ETHERNET              BIT(0)
+#define RMNET_INGRESS_FORMAT_MAP                BIT(1)
+#define RMNET_INGRESS_FORMAT_DEAGGREGATION      BIT(2)
+#define RMNET_INGRESS_FORMAT_DEMUXING           BIT(3)
+#define RMNET_INGRESS_FORMAT_MAP_COMMANDS       BIT(4)
+#define RMNET_INGRESS_FORMAT_MAP_CKSUMV3        BIT(5)
+#define RMNET_INGRESS_FORMAT_MAP_CKSUMV4        BIT(6)
+
+/* Pass the frame up the stack with no modifications to skb->dev */
+#define RMNET_EPMODE_NONE (0)
+/* Replace skb->dev to a virtual rmnet device and pass up the stack */
+#define RMNET_EPMODE_VND (1)
+/* Pass the frame directly to another device with dev_queue_xmit() */
+#define RMNET_EPMODE_BRIDGE (2)
+
+#endif /* _RMNET_PRIVATE_H_ */
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
new file mode 100644
index 0000000..c8b573d
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
@@ -0,0 +1,170 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ *
+ * RMNET Data virtual network driver
+ *
+ */
+
+#include <linux/etherdevice.h>
+#include <linux/if_arp.h>
+#include <net/pkt_sched.h>
+#include "rmnet_config.h"
+#include "rmnet_handlers.h"
+#include "rmnet_private.h"
+#include "rmnet_map.h"
+#include "rmnet_vnd.h"
+
+/* RX/TX Fixup */
+
+void rmnet_vnd_rx_fixup(struct sk_buff *skb, struct net_device *dev)
+{
+	dev->stats.rx_packets++;
+	dev->stats.rx_bytes += skb->len;
+}
+
+void rmnet_vnd_tx_fixup(struct sk_buff *skb, struct net_device *dev)
+{
+	dev->stats.tx_packets++;
+	dev->stats.tx_bytes += skb->len;
+}
+
+/* Network Device Operations */
+
+static netdev_tx_t rmnet_vnd_start_xmit(struct sk_buff *skb,
+					struct net_device *dev)
+{
+	struct rmnet_priv *priv;
+
+	priv = netdev_priv(dev);
+	if (priv->local_ep.egress_dev) {
+		rmnet_egress_handler(skb, &priv->local_ep);
+	} else {
+		dev->stats.tx_dropped++;
+		kfree_skb(skb);
+	}
+	return NETDEV_TX_OK;
+}
+
+static int rmnet_vnd_change_mtu(struct net_device *rmnet_dev, int new_mtu)
+{
+	if (new_mtu < 0 || new_mtu > RMNET_MAX_PACKET_SIZE)
+		return -EINVAL;
+
+	rmnet_dev->mtu = new_mtu;
+	return 0;
+}
+
+static const struct net_device_ops rmnet_vnd_ops = {
+	.ndo_start_xmit = rmnet_vnd_start_xmit,
+	.ndo_change_mtu = rmnet_vnd_change_mtu,
+};
+
+/* Called by kernel whenever a new rmnet<n> device is created. Sets MTU,
+ * flags, ARP type, needed headroom, etc...
+ */
+void rmnet_vnd_setup(struct net_device *rmnet_dev)
+{
+	struct rmnet_priv *priv;
+
+	priv = netdev_priv(rmnet_dev);
+	netdev_dbg(rmnet_dev, "Setting up device %s\n", rmnet_dev->name);
+
+	rmnet_dev->netdev_ops = &rmnet_vnd_ops;
+	rmnet_dev->mtu = RMNET_DFLT_PACKET_SIZE;
+	rmnet_dev->needed_headroom = RMNET_NEEDED_HEADROOM;
+	random_ether_addr(rmnet_dev->dev_addr);
+	rmnet_dev->tx_queue_len = RMNET_TX_QUEUE_LEN;
+
+	/* Raw IP mode */
+	rmnet_dev->header_ops = NULL;  /* No header */
+	rmnet_dev->type = ARPHRD_RAWIP;
+	rmnet_dev->hard_header_len = 0;
+	rmnet_dev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST);
+
+	rmnet_dev->needs_free_netdev = true;
+}
+
+/* Exposed API */
+
+int rmnet_vnd_newlink(u8 id, struct net_device *rmnet_dev,
+		      struct rmnet_real_dev_info *r)
+{
+	int rc;
+
+	if (r->rmnet_devices[id])
+		return -EINVAL;
+
+	rc = register_netdevice(rmnet_dev);
+	if (!rc) {
+		r->rmnet_devices[id] = rmnet_dev;
+		r->nr_rmnet_devs++;
+		rmnet_dev->rtnl_link_ops = &rmnet_link_ops;
+	}
+
+	return rc;
+}
+
+int rmnet_vnd_dellink(u8 id, struct rmnet_real_dev_info *r)
+{
+	if (id >= RMNET_MAX_VND || !r->rmnet_devices[id])
+		return -EINVAL;
+
+	r->rmnet_devices[id] = NULL;
+	r->nr_rmnet_devs--;
+	return 0;
+}
+
+u8 rmnet_vnd_get_mux(struct net_device *rmnet_dev)
+{
+	struct rmnet_priv *priv;
+
+	priv = netdev_priv(rmnet_dev);
+	return priv->mux_id;
+}
+
+void rmnet_vnd_set_mux(struct net_device *rmnet_dev, u8 mux_id)
+{
+	struct rmnet_priv *priv;
+
+	priv = netdev_priv(rmnet_dev);
+	priv->mux_id = mux_id;
+}
+
+/* Gets the logical endpoint configuration for a RmNet virtual network device
+ * node. Caller should confirm that devices is a RmNet VND before calling.
+ */
+struct rmnet_endpoint *rmnet_vnd_get_endpoint(struct net_device *rmnet_dev)
+{
+	struct rmnet_priv *priv;
+
+	if (!rmnet_dev)
+		return NULL;
+
+	priv = netdev_priv(rmnet_dev);
+
+	return &priv->local_ep;
+}
+
+int rmnet_vnd_do_flow_control(struct net_device *rmnet_dev, int enable)
+{
+	netdev_dbg(rmnet_dev, "Setting VND TX queue state to %d\n", enable);
+	/* Although we expect similar number of enable/disable
+	 * commands, optimize for the disable. That is more
+	 * latency sensitive than enable
+	 */
+	if (unlikely(enable))
+		netif_wake_queue(rmnet_dev);
+	else
+		netif_stop_queue(rmnet_dev);
+
+	return 0;
+}
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h
new file mode 100644
index 0000000..b102b42
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h
@@ -0,0 +1,29 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data Virtual Network Device APIs
+ *
+ */
+
+#ifndef _RMNET_VND_H_
+#define _RMNET_VND_H_
+
+int rmnet_vnd_do_flow_control(struct net_device *dev, int enable);
+struct rmnet_endpoint *rmnet_vnd_get_endpoint(struct net_device *dev);
+int rmnet_vnd_newlink(u8 id, struct net_device *rmnet_dev,
+		      struct rmnet_real_dev_info *r);
+int rmnet_vnd_dellink(u8 id, struct rmnet_real_dev_info *r);
+void rmnet_vnd_rx_fixup(struct sk_buff *skb, struct net_device *dev);
+void rmnet_vnd_tx_fixup(struct sk_buff *skb, struct net_device *dev);
+u8 rmnet_vnd_get_mux(struct net_device *rmnet_dev);
+void rmnet_vnd_set_mux(struct net_device *rmnet_dev, u8 mux_id);
+void rmnet_vnd_setup(struct net_device *dev);
+#endif /* _RMNET_VND_H_ */
-- 
1.9.1

^ permalink raw reply related

* Re: [PATCH v2 net-next 2/6] udp: Constify skb argument in lookup functions
From: David Miller @ 2017-08-30  0:58 UTC (permalink / raw)
  To: tom; +Cc: netdev
In-Reply-To: <20170829232711.1465-3-tom@quantonium.net>

From: Tom Herbert <tom@quantonium.net>
Date: Tue, 29 Aug 2017 16:27:07 -0700

> For UDP socket lookup functions, and associateed functions that take an
> skbuf as argument, declare the skb argument as constant.
> 
> One caveat is that reuseport_select_sock can be called from the UDP
> lookup functions with an skb argument. This function temporarily
> modifies the skbuff data pointer (in bpf_run via a pull/push sequence).
> To resolve compiler warning I added a local skbuf declaration that is
> not const and assigned to the skb argument with an explicit cast.
> 
> Signed-off-by: Tom Herbert <tom@quantonium.net>

Please don't do this.

If reuseport_select_sock() modifies anything in the SKB, especially
skb->data, it infects the entire call chain.  So you can't mark it
const in this family of calls.

^ permalink raw reply

* Re: [PATCH v2 net-next 3/6] flow_dissector: Add protocol specific flow dissection offload
From: David Miller @ 2017-08-30  1:00 UTC (permalink / raw)
  To: tom; +Cc: netdev
In-Reply-To: <20170829232711.1465-4-tom@quantonium.net>

From: Tom Herbert <tom@quantonium.net>
Date: Tue, 29 Aug 2017 16:27:08 -0700

> +#define GOTO_BY_RESULT(ret) do {				\
> +	switch (ret) {						\
> +	case FLOW_DISSECT_RET_OUT_GOOD:				\
> +		goto out_good;					\
> +	case FLOW_DISSECT_RET_PROTO_AGAIN:			\
> +		goto proto_again;				\
> +	case FLOW_DISSECT_RET_IPPROTO_AGAIN:			\
> +		goto ip_proto_again;				\
> +	case FLOW_DISSECT_RET_OUT_BAD:				\
> +	default:						\
> +		goto out_bad;					\
> +	}							\
> +} while (0)
> +
> +#define GOTO_OR_CONT_BY_RESULT(ret) do {			\
> +	enum flow_dissect_ret __ret = (ret);			\
> +								\
> +	if (__ret != FLOW_DISSECT_RET_CONTINUE)			\
> +		GOTO_BY_RESULT(__ret);				\
> +} while (0)

Please don't hide major control flow changes inside of a macro.  This
means returns and gotos.

It makes code impossible to audit.

Yes, this applies even if the macro has the word "GOTO" in it :-)

^ permalink raw reply

* Re: [PATCH v2 net-next 1/8] bpf: Add support for recursively running cgroup sock filters
From: David Ahern @ 2017-08-30  1:03 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: netdev, daniel, ast, tj, davem, luto
In-Reply-To: <20170829041118.m6bsjvif2bxwtk6g@ast-mbp>

On 8/28/17 10:11 PM, Alexei Starovoitov wrote:
> 
> Agree on the above, but you're mixing semantics of the new recurse
> flag and implementation of it. Ex: we don't have to copy this flag
> from prog->attr into cgroup. So this reset or non-reset discussion
> only makes sense in the context of your current implementation.
> We can implement the logic differently. Like don't copy that flag
> at all and at attach time walk parent->parent->parent and see
> what programs are attached. All of them should have prog->attr & recurse_bit set
> In such implementation detach from 'b' is a nop from reset/non-reset
> point of view. When socket creation in 'c' is invoked the program
> 'c' is called first then the code keeps walking parents until root
> invoking 'a' along the way.

So you are suggesting there is no recursive flag per cgroup? How do you
know you need to walk cgroups? How do you know when to stop running
programs?

> I'm not saying it will be an efficient implementation. The point
> is to discuss UAPI independent of implementation.
> 
>> ###
>>
>> Also, let's agree on this intention. Based on the new ground rule, I
>> want to point out this example:
>>
>> If 'a' gets a program installed with no recurse flag set, ONLY processes
>> in 'a' have the 'a' program run. Processes in groups 'b', 'c' and 'd'
>> all stop at cgroup 'b' program.
> 
> I'm proposing that such situation should not be allowed to happen.
> In a->b->c->d cgroup scenario if override+recurse prog attached to 'b'
> then only the same override+recurse can be attached to c, d, a.
> So at detach time there can be gaps (like only 'b' and 'd' have
> override+recurse progs), but walking up until root from any point
> will guarantee that only override+recurse programs are seen.
> 

That seems very limiting to me. Seems like you are suggesting the entire
cgroup tree is recursive or non-recursive, but never a mix.

^ permalink raw reply

* Re: [PATCH net-next 0/3 v10] Add support for rmnet driver
From: David Miller @ 2017-08-30  1:05 UTC (permalink / raw)
  To: subashab
  Cc: netdev, fengguang.wu, dcbw, jiri, stephen, David.Laight, marcel,
	andrew
In-Reply-To: <1504054078-10173-1-git-send-email-subashab@codeaurora.org>

From: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Date: Tue, 29 Aug 2017 18:47:55 -0600

> I have updated the locking scheme as follows -

Series applied, but this is not how you write a header posting for a
patch set.

This posting is where you say at a high level what the patch series is
doing, how it is doing it, and why it is doing it that way.

You can explain what changes happened, and why, but that belongs
in the changelog at the end of this posting.  Here you've made
an explaination for one change the entire content of the text.

You not even saying what rmnet is, why we would want to add it to the
kernel, and what it's all about.  So now when someone tries to read
the merge commit that contains this text, they will have no context
about you and me talking about locking and they will thus ask
themselves "what is this person talking about here?  it's not
explaining the patch series at all"

^ permalink raw reply

* Re: [PATCH net-next 0/3 v10] Add support for rmnet driver
From: David Miller @ 2017-08-30  1:12 UTC (permalink / raw)
  To: subashab
  Cc: netdev, fengguang.wu, dcbw, jiri, stephen, David.Laight, marcel,
	andrew
In-Reply-To: <20170829.180557.1826943085634265902.davem@davemloft.net>


Sigh, I had to revert.

You only allow RMNET to take on the values 'y' and 'n'.

You must allow for it to be 'm' and modular as well.

^ permalink raw reply

* Re: [PATCH net-next 0/3 v10] Add support for rmnet driver
From: Subash Abhinov Kasiviswanathan @ 2017-08-30  1:18 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, fengguang.wu, dcbw, jiri, stephen, David.Laight, marcel,
	andrew
In-Reply-To: <20170829.181244.658164638453646342.davem@davemloft.net>

On 2017-08-29 19:12, David Miller wrote:
> Sigh, I had to revert.
> 
> You only allow RMNET to take on the values 'y' and 'n'.
> 
> You must allow for it to be 'm' and modular as well.

Hi David

I'll fix this now.

Sorry about the cover letter. I'll explain it better in subsequent 
submission.

^ permalink raw reply

* Re: [PATCH net] sch_hhf: fix null pointer dereference on init failure
From: Cong Wang @ 2017-08-30  1:24 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Linux Kernel Network Developers, Eric Dumazet, Jamal Hadi Salim,
	Jiri Pirko, Roopa Prabhu
In-Reply-To: <1504033335-19098-1-git-send-email-nikolay@cumulusnetworks.com>

On Tue, Aug 29, 2017 at 12:02 PM, Nikolay Aleksandrov
<nikolay@cumulusnetworks.com> wrote:
> First I did it with the check in the for () conditional, but this is more
> visible and explicit. Let me know if you'd like the shorter version. :-)

Or, if you want to make the patch size smaller, just check NULL
before for():

if (!q->hh_flows)
    return;

for (...)

Up to you, I have no strong opinion here, slightly prefer a smaller
one for backport.

^ permalink raw reply

* Re: [PATCH net-next] virtio-net: invoke zerocopy callback on xmit path if no tx napi
From: Jason Wang @ 2017-08-30  1:45 UTC (permalink / raw)
  To: Willem de Bruijn, Michael S. Tsirkin
  Cc: Koichiro Den, virtualization, Network Development
In-Reply-To: <CAF=yD-+Wk9sc9dXMUq1+x_hh=3ThTXa6BnZkygP3tgVpjbp93g@mail.gmail.com>



On 2017年08月30日 03:35, Willem de Bruijn wrote:
> On Fri, Aug 25, 2017 at 9:03 PM, Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>> On Fri, Aug 25, 2017 at 7:32 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>>> On Fri, Aug 25, 2017 at 06:44:36PM -0400, Willem de Bruijn wrote:
>>>>>>>>> We don't enable network watchdog on virtio but we could and maybe
>>>>>>>>> should.
>>>>>>>> Can you elaborate?
>>>>>>> The issue is that holding onto buffers for very long times makes guests
>>>>>>> think they are stuck. This is funamentally because from guest point of
>>>>>>> view this is a NIC, so it is supposed to transmit things out in
>>>>>>> a timely manner. If host backs the virtual NIC by something that is not
>>>>>>> a NIC, with traffic shaping etc introducing unbounded latencies,
>>>>>>> guest will be confused.
>>>>>> That assumes that guests are fragile in this regard. A linux guest
>>>>>> does not make such assumptions.
>>>>> Yes it does. Examples above:
>>>>>          > > - a single slow flow can occupy the whole ring, you will not
>>>>>          > >   be able to make any new buffers available for the fast flow
>>>> Oh, right. Though those are due to vring_desc pool exhaustion
>>>> rather than an upper bound on latency of any single packet.
>>>>
>>>> Limiting the number of zerocopy packets in flight to some fraction
>>>> of the ring ensures that fast flows can always grab a slot.
>>>> Running
>>>> out of ubuf_info slots reverts to copy, so indirectly does this. But
>>>> I read it correclty the zerocopy pool may be equal to or larger than
>>>> the descriptor pool. Should we refine the zcopy_used test
>>>>
>>>>      (nvq->upend_idx + 1) % UIO_MAXIOV != nvq->done_idx
>>>>
>>>> to also return false if the number of outstanding ubuf_info is greater
>>>> than, say, vq->num >> 1?
>>>
>>> We'll need to think about where to put the threshold, but I think it's
>>> a good idea.
>>>
>>> Maybe even a fixed number, e.g. max(vq->num >> 1, X) to limit host
>>> resources.
>>>
>>> In a sense it still means once you run out of slots zcopt gets disabled possibly permanently.
>>>
>>> Need to experiment with some numbers.
>> I can take a stab with two flows, one delayed in a deep host qdisc
>> queue. See how this change affects the other flow and also how
>> sensitive that is to the chosen threshold value.
> Incomplete results at this stage, but I do see this correlation between
> flows. It occurs even while not running out of zerocopy descriptors,
> which I cannot yet explain.
>
> Running two threads in a guest, each with a udp socket, each
> sending up to 100 datagrams, or until EAGAIN, every msec.
>
> Sender A sends 1B datagrams.
> Sender B sends VHOST_GOODCOPY_LEN, which is enough
> to trigger zcopy_used in vhost net.
>
> A local receive process on the host receives both flows. To avoid
> a deep copy when looping the packet onto the receive path,
> changed skb_orphan_frags_rx to always return false (gross hack).
>
> The flow with the larger packets is redirected through netem on ifb0:
>
>    modprobe ifb
>    ip link set dev ifb0 up
>    tc qdisc add dev ifb0 root netem limit $LIMIT rate 1MBit
>
>    tc qdisc add dev tap0 ingress
>    tc filter add dev tap0 parent ffff: protocol ip \
>        u32 match ip dport 8000 0xffff \
>        action mirred egress redirect dev ifb0
>
> For 10 second run, packet count with various ifb0 queue lengths $LIMIT:
>
> no filter
>    rx.A: ~840,000
>    rx.B: ~840,000

Just to make sure I understand the case here. What did rx.B mean here? I 
thought all traffic sent by Sender B has been redirected to ifb0?

>
> limit 1
>    rx.A: ~500,000
>    rx.B: ~3100
>    ifb0: 3273 sent, 371141 dropped
>
> limit 100
>    rx.A: ~9000
>    rx.B: ~4200
>    ifb0: 4630 sent, 1491 dropped
>
> limit 1000
>    rx.A: ~6800
>    rx.B: ~4200
>    ifb0: 4651 sent, 0 dropped
>
> Sender B is always correctly rate limited to 1 MBps or less. With a
> short queue, it ends up dropping a lot and sending even less.
>
> When a queue builds up for sender B, sender A throughput is strongly
> correlated with queue length. With queue length 1, it can send almost
> at unthrottled speed. But even at limit 100 its throughput is on the
> same order as sender B.
>
> What is surprising to me is that this happens even though the number
> of ubuf_info in use at limit 100 is around 100 at all times. In other words,
> it does not exhaust the pool.
>
> When forcing zcopy_used to be false for all packets, this effect of
> sender A throughput being correlated with sender B does not happen.
>
> no filter
>    rx.A: ~850,000
>    rx.B: ~850,000
>
> limit 100
>    rx.A: ~850,000
>    rx.B: ~4200
>    ifb0: 4518 sent, 876182 dropped
>
> Also relevant is that with zerocopy, the sender processes back off
> and report the same count as the receiver. Without zerocopy,
> both senders send at full speed, even if only 4200 packets from flow
> B arrive at the receiver.
>
> This is with the default virtio_net driver, so without napi-tx.

What kind of qdisc do you use in guest? I suspect we should use 
something which could do fair queueing (e.g sfq).

>
> It appears that the zerocopy notifications are pausing the guest.
> Will look at that now.

Another factor is the tx interrupt coalescing parameters of ifb0, maybe 
we should disable it during the test.

Thanks

>
> By the way, I have had an unrelated patch outstanding for a while
> to have virtio-net support the VIRTIO_CONFIG_S_NEEDS_RESET
> command. Will send that as RFC.

^ permalink raw reply

* RE: Question about ip_defrag
From: liujian (CE) @ 2017-08-30  1:52 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Jesper Dangaard Brouer, netdev@vger.kernel.org,
	Wangkefeng (Kevin), weiyongjun (A)
In-Reply-To: <20170829134635.GB9993@breakpoint.cc>




Best Regards,
liujian


> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On Behalf Of Florian Westphal
> Sent: Tuesday, August 29, 2017 9:47 PM
> To: liujian (CE)
> Cc: Florian Westphal; Jesper Dangaard Brouer; netdev@vger.kernel.org;
> Wangkefeng (Kevin); weiyongjun (A)
> Subject: Re: Question about ip_defrag
> 
> liujian (CE) <liujian56@huawei.com> wrote:
> 
> [ trimming cc list ]
> 
> > Now, I have not the real environment.
> > I use iperf generate fragment packets; and I always change NIC rx
> > irq's affinity cpu, to make sure frag_mem_limit reach to thresh.
> > my test machine, CPU num is 384.
> 
> Oh well, that explains it.
> 
> > > > +	if (frag_mem_limit(nf) > nf->low_thresh) {
> > > >  		inet_frag_schedule_worker(f);
> > > > +		update_frag_mem_limit(nf, SKB_TRUESIZE(1500) * 16);
> > > > +	}
> 
> You need to reduce this to a lower value.
> Your cpu count * batch_value needs to be less than low_thresh to avoid
> problems.
> 
> Wtih 384 cpus its close to 12 mbyte...
> 
> Perhaps do this:
> 
> update_frag_mem_limit(nf, 2 * 1024*1024 / NR_CPUS);
> 
> 
> However, I think its better to revert the percpu counter change and move back
> to a single atomic_t count.

Ok. 
Florian and Jesper, many thanks for this issue. 

^ permalink raw reply

* Re: [PATCH v2 net-next 1/8] bpf: Add support for recursively running cgroup sock filters
From: Alexei Starovoitov @ 2017-08-30  2:58 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, daniel, ast, tj, davem, luto
In-Reply-To: <ab9b2874-fadc-5ab6-7710-2ccb1d65bb2c@gmail.com>

On Tue, Aug 29, 2017 at 07:03:43PM -0600, David Ahern wrote:
> On 8/28/17 10:11 PM, Alexei Starovoitov wrote:
> > 
> > Agree on the above, but you're mixing semantics of the new recurse
> > flag and implementation of it. Ex: we don't have to copy this flag
> > from prog->attr into cgroup. So this reset or non-reset discussion
> > only makes sense in the context of your current implementation.
> > We can implement the logic differently. Like don't copy that flag
> > at all and at attach time walk parent->parent->parent and see
> > what programs are attached. All of them should have prog->attr & recurse_bit set
> > In such implementation detach from 'b' is a nop from reset/non-reset
> > point of view. When socket creation in 'c' is invoked the program
> > 'c' is called first then the code keeps walking parents until root
> > invoking 'a' along the way.
> 
> So you are suggesting there is no recursive flag per cgroup? How do you
> know you need to walk cgroups? How do you know when to stop running
> programs?

you're talking about implementation, right?
My 'proposed' implemenation of walking from cgroup all the way to the root
is just an example. It's not efficient. More below...

> > I'm not saying it will be an efficient implementation. The point
> > is to discuss UAPI independent of implementation.
> > 
> >> ###
> >>
> >> Also, let's agree on this intention. Based on the new ground rule, I
> >> want to point out this example:
> >>
> >> If 'a' gets a program installed with no recurse flag set, ONLY processes
> >> in 'a' have the 'a' program run. Processes in groups 'b', 'c' and 'd'
> >> all stop at cgroup 'b' program.
> > 
> > I'm proposing that such situation should not be allowed to happen.
> > In a->b->c->d cgroup scenario if override+recurse prog attached to 'b'
> > then only the same override+recurse can be attached to c, d, a.
> > So at detach time there can be gaps (like only 'b' and 'd' have
> > override+recurse progs), but walking up until root from any point
> > will guarantee that only override+recurse programs are seen.
> > 
> 
> That seems very limiting to me. Seems like you are suggesting the entire
> cgroup tree is recursive or non-recursive, but never a mix.

Entire cgroup subtree. Yes. It's the simplest uapi I could think of.
Easy to understand and argue about and I think it's solving your use case.
It's also easily extendable. New combination and features won't break
the users. It feels you're in rush to get this stuff for this merge
window, therefore I want to agree on something that is simple,
non-controversial and extensible.
If you're not in rush (I'm not), we can come up with more flexible uapi.
For example: another way of thinking about your 'recursive' requirement
is to think that all 'program to be run' should be present as a link list
in a given cgroup. So no walking a chain of parents.
Instead of 'recursive' let's call this new flag 'multiprog'.
Now in a->b->c->d scenario. We can install 'multiprog' prog in 'b'.
The kernel will automatically propage it (like it does right now
with css_for_each_descendant_pre() loop) to 'c' and to 'd'.
Now we allow users to attach another 'multiprog' program to 'c'.
The kernel will maintain a link list of programs in every cgroup,
so there will be a link list of two programs in 'c' and 'd'
and invocation of the programs will be faster than walking
cgroup->parent->parent and checking some flags at every step,
since there will be less pointer dereferences and no flags to check.
Just invoke all programs in the current cgroup. Kernel took care
of ordering at the time of attach/detach.
I believe Andy proposed something like this back in Jan/Feb.

^ permalink raw reply

* Re: [PATCH v2 net-next 2/6] udp: Constify skb argument in lookup functions
From: Tom Herbert @ 2017-08-30  3:09 UTC (permalink / raw)
  To: David Miller; +Cc: Linux Kernel Network Developers
In-Reply-To: <20170829.175832.437965683493864458.davem@davemloft.net>

On Tue, Aug 29, 2017 at 5:58 PM, David Miller <davem@davemloft.net> wrote:
> From: Tom Herbert <tom@quantonium.net>
> Date: Tue, 29 Aug 2017 16:27:07 -0700
>
>> For UDP socket lookup functions, and associateed functions that take an
>> skbuf as argument, declare the skb argument as constant.
>>
>> One caveat is that reuseport_select_sock can be called from the UDP
>> lookup functions with an skb argument. This function temporarily
>> modifies the skbuff data pointer (in bpf_run via a pull/push sequence).
>> To resolve compiler warning I added a local skbuf declaration that is
>> not const and assigned to the skb argument with an explicit cast.
>>
>> Signed-off-by: Tom Herbert <tom@quantonium.net>
>
> Please don't do this.
>
> If reuseport_select_sock() modifies anything in the SKB, especially
> skb->data, it infects the entire call chain.  So you can't mark it
> const in this family of calls.
>
reuseport_select_sock calls run_bpf that calls pskb_pull to
"temporarily advance data past protocol header" and it calls
bpf_prog_run_save_cb which takes non-constant skb argument. This is
the only instance in all the udp lookup functions where non-constant
is needed. It's logical that constant skbuf makes sense for socket
lookup-- I doubt any caller would expect the skbuf to be modified as a
side effect. It's also an implicit characteristic since
reuseport_select_sock may just clone the socket before calling BPF.

The problem is that all the flow dissector functions operate on const
skbs (again that's logical :-) ). So if we want to be able to call
lookup functions or even BPF to do flow dissection, then I think
something needs to change. I really don't want to unconsitify the flow
dissector functions. We could just always do the skb before calling
BPF, but I suppose that is a potential performance hit. Is there a
better way to resolve this?

Thanks,
Tom

^ permalink raw reply

* Re: [PATCH net-next] virtio-net: invoke zerocopy callback on xmit path if no tx napi
From: Willem de Bruijn @ 2017-08-30  3:11 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, Koichiro Den, virtualization,
	Network Development
In-Reply-To: <b8893b72-4d09-2492-0d31-5135286e6874@redhat.com>

On Tue, Aug 29, 2017 at 9:45 PM, Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2017年08月30日 03:35, Willem de Bruijn wrote:
>>
>> On Fri, Aug 25, 2017 at 9:03 PM, Willem de Bruijn
>> <willemdebruijn.kernel@gmail.com> wrote:
>>>
>>> On Fri, Aug 25, 2017 at 7:32 PM, Michael S. Tsirkin <mst@redhat.com>
>>> wrote:
>>>>
>>>> On Fri, Aug 25, 2017 at 06:44:36PM -0400, Willem de Bruijn wrote:
>>>>>>>>>>
>>>>>>>>>> We don't enable network watchdog on virtio but we could and maybe
>>>>>>>>>> should.
>>>>>>>>>
>>>>>>>>> Can you elaborate?
>>>>>>>>
>>>>>>>> The issue is that holding onto buffers for very long times makes
>>>>>>>> guests
>>>>>>>> think they are stuck. This is funamentally because from guest point
>>>>>>>> of
>>>>>>>> view this is a NIC, so it is supposed to transmit things out in
>>>>>>>> a timely manner. If host backs the virtual NIC by something that is
>>>>>>>> not
>>>>>>>> a NIC, with traffic shaping etc introducing unbounded latencies,
>>>>>>>> guest will be confused.
>>>>>>>
>>>>>>> That assumes that guests are fragile in this regard. A linux guest
>>>>>>> does not make such assumptions.
>>>>>>
>>>>>> Yes it does. Examples above:
>>>>>>          > > - a single slow flow can occupy the whole ring, you will
>>>>>> not
>>>>>>          > >   be able to make any new buffers available for the fast
>>>>>> flow
>>>>>
>>>>> Oh, right. Though those are due to vring_desc pool exhaustion
>>>>> rather than an upper bound on latency of any single packet.
>>>>>
>>>>> Limiting the number of zerocopy packets in flight to some fraction
>>>>> of the ring ensures that fast flows can always grab a slot.
>>>>> Running
>>>>> out of ubuf_info slots reverts to copy, so indirectly does this. But
>>>>> I read it correclty the zerocopy pool may be equal to or larger than
>>>>> the descriptor pool. Should we refine the zcopy_used test
>>>>>
>>>>>      (nvq->upend_idx + 1) % UIO_MAXIOV != nvq->done_idx
>>>>>
>>>>> to also return false if the number of outstanding ubuf_info is greater
>>>>> than, say, vq->num >> 1?
>>>>
>>>>
>>>> We'll need to think about where to put the threshold, but I think it's
>>>> a good idea.
>>>>
>>>> Maybe even a fixed number, e.g. max(vq->num >> 1, X) to limit host
>>>> resources.
>>>>
>>>> In a sense it still means once you run out of slots zcopt gets disabled
>>>> possibly permanently.
>>>>
>>>> Need to experiment with some numbers.
>>>
>>> I can take a stab with two flows, one delayed in a deep host qdisc
>>> queue. See how this change affects the other flow and also how
>>> sensitive that is to the chosen threshold value.
>>
>> Incomplete results at this stage, but I do see this correlation between
>> flows. It occurs even while not running out of zerocopy descriptors,
>> which I cannot yet explain.
>>
>> Running two threads in a guest, each with a udp socket, each
>> sending up to 100 datagrams, or until EAGAIN, every msec.
>>
>> Sender A sends 1B datagrams.
>> Sender B sends VHOST_GOODCOPY_LEN, which is enough
>> to trigger zcopy_used in vhost net.
>>
>> A local receive process on the host receives both flows. To avoid
>> a deep copy when looping the packet onto the receive path,
>> changed skb_orphan_frags_rx to always return false (gross hack).
>>
>> The flow with the larger packets is redirected through netem on ifb0:
>>
>>    modprobe ifb
>>    ip link set dev ifb0 up
>>    tc qdisc add dev ifb0 root netem limit $LIMIT rate 1MBit
>>
>>    tc qdisc add dev tap0 ingress
>>    tc filter add dev tap0 parent ffff: protocol ip \
>>        u32 match ip dport 8000 0xffff \
>>        action mirred egress redirect dev ifb0
>>
>> For 10 second run, packet count with various ifb0 queue lengths $LIMIT:
>>
>> no filter
>>    rx.A: ~840,000
>>    rx.B: ~840,000
>
>
> Just to make sure I understand the case here. What did rx.B mean here? I
> thought all traffic sent by Sender B has been redirected to ifb0?

It has been, but the packet still arrives at the destination socket.
IFB is a special virtual device that applies traffic shaping and
then reinjects it back at the point it was intercept by mirred.

rx.B is indeed arrival rate at the receiver, similar to rx.A.

>>
>> limit 1
>>    rx.A: ~500,000
>>    rx.B: ~3100
>>    ifb0: 3273 sent, 371141 dropped
>>
>> limit 100
>>    rx.A: ~9000
>>    rx.B: ~4200
>>    ifb0: 4630 sent, 1491 dropped
>>
>> limit 1000
>>    rx.A: ~6800
>>    rx.B: ~4200
>>    ifb0: 4651 sent, 0 dropped
>>
>> Sender B is always correctly rate limited to 1 MBps or less. With a
>> short queue, it ends up dropping a lot and sending even less.
>>
>> When a queue builds up for sender B, sender A throughput is strongly
>> correlated with queue length. With queue length 1, it can send almost
>> at unthrottled speed. But even at limit 100 its throughput is on the
>> same order as sender B.
>>
>> What is surprising to me is that this happens even though the number
>> of ubuf_info in use at limit 100 is around 100 at all times. In other
>> words,
>> it does not exhaust the pool.
>>
>> When forcing zcopy_used to be false for all packets, this effect of
>> sender A throughput being correlated with sender B does not happen.
>>
>> no filter
>>    rx.A: ~850,000
>>    rx.B: ~850,000
>>
>> limit 100
>>    rx.A: ~850,000
>>    rx.B: ~4200
>>    ifb0: 4518 sent, 876182 dropped
>>
>> Also relevant is that with zerocopy, the sender processes back off
>> and report the same count as the receiver. Without zerocopy,
>> both senders send at full speed, even if only 4200 packets from flow
>> B arrive at the receiver.
>>
>> This is with the default virtio_net driver, so without napi-tx.
>
>
> What kind of qdisc do you use in guest? I suspect we should use something
> which could do fair queueing (e.g sfq).

Or fq. The test was using the default, pfifo_fast.

This particular two flow test probably would not be affected,
as something else is delaying both flows equally once some
completions are delayed.

One obvious candidate would be hitting VHOST_MAX_PEND.
But I instrumented that and handle_tx is never throttled by
vhost_exceeds_maxpend.

Btw, vhost_exceeds_maxpend implements almost what I
suggested earlier and was planning to test here: ensure that
only part of the descriptor pool is filled with zerocopy requests.

Only, it currently breaks out of the loop when the max is
reached. I think that we should move it into the main
zcopy_used calculation, so that hitting this threshold reverts to
copy-based transmission, instead of delaying all tx until
zerocopy budget opens up.


>>
>> It appears that the zerocopy notifications are pausing the guest.
>> Will look at that now.
>
>
> Another factor is the tx interrupt coalescing parameters of ifb0, maybe we
> should disable it during the test.
>
> Thanks
>
>
>>
>> By the way, I have had an unrelated patch outstanding for a while
>> to have virtio-net support the VIRTIO_CONFIG_S_NEEDS_RESET
>> command. Will send that as RFC.
>
>

^ permalink raw reply

* Re: [PATCH net-next] ipv6: Use rt6i_idev index for echo replies to a local address
From: Eric Dumazet @ 2017-08-30  3:18 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, tariqt
In-Reply-To: <1503953614-32395-1-git-send-email-dsahern@gmail.com>

On Mon, 2017-08-28 at 13:53 -0700, David Ahern wrote:
> Tariq repored local pings to linklocal address is failing:
> $ ifconfig ens8
> ens8: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>         inet 11.141.16.6  netmask 255.255.0.0  broadcast 11.141.255.255
>         inet6 fe80::7efe:90ff:fecb:7502  prefixlen 64  scopeid 0x20<link>
>         ether 7c:fe:90:cb:75:02  txqueuelen 1000  (Ethernet)
>         RX packets 12  bytes 1164 (1.1 KiB)
>         RX errors 0  dropped 0  overruns 0  frame 0
>         TX packets 30  bytes 2484 (2.4 KiB)
>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> 
> $  /bin/ping6 -c 3 fe80::7efe:90ff:fecb:7502%ens8
> PING fe80::7efe:90ff:fecb:7502%ens8(fe80::7efe:90ff:fecb:7502) 56 data bytes
> 
> --- fe80::7efe:90ff:fecb:7502%ens8 ping statistics ---

Note that the presence of this leading --- had the effect of truncating
the merged patch from this point. (all tags were ignored)

> 3 packets transmitted, 0 received, 100% packet loss, time 2043ms
> 
> icmpv6_echo_reply needs to use the rt6i_idev dev index for local traffic
> similar to how icmp6_send does. Convert the change for icmp6_send into a
> helper that can be used in both places. Add the long over due
> skb_rt6_info helper to convert dst on an skb to rt6_info similar to
> skb_rtable for ipv4.
> 
> Fixes: 4832c30d5458 ("net: ipv6: put host and anycast routes on
>        device with address")
> Reported-by: Tariq Toukan <tariqt@mellanox.com>
> Signed-off-by: David Ahern <dsahern@gmail.com>
> ---

^ permalink raw reply

* Re: [PATCH v2 net-next 1/8] bpf: Add support for recursively running cgroup sock filters
From: David Ahern @ 2017-08-30  3:38 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: netdev, daniel, ast, tj, davem, luto
In-Reply-To: <20170830025850.wlp2hp7d4kreodif@ast-mbp>

On 8/29/17 8:58 PM, Alexei Starovoitov wrote:
> On Tue, Aug 29, 2017 at 07:03:43PM -0600, David Ahern wrote:
>> On 8/28/17 10:11 PM, Alexei Starovoitov wrote:
>>>
>>> Agree on the above, but you're mixing semantics of the new recurse
>>> flag and implementation of it. Ex: we don't have to copy this flag
>>> from prog->attr into cgroup. So this reset or non-reset discussion
>>> only makes sense in the context of your current implementation.
>>> We can implement the logic differently. Like don't copy that flag
>>> at all and at attach time walk parent->parent->parent and see
>>> what programs are attached. All of them should have prog->attr & recurse_bit set
>>> In such implementation detach from 'b' is a nop from reset/non-reset
>>> point of view. When socket creation in 'c' is invoked the program
>>> 'c' is called first then the code keeps walking parents until root
>>> invoking 'a' along the way.
>>
>> So you are suggesting there is no recursive flag per cgroup? How do you
>> know you need to walk cgroups? How do you know when to stop running
>> programs?
> 
> you're talking about implementation, right?
> My 'proposed' implemenation of walking from cgroup all the way to the root
> is just an example. It's not efficient. More below...
> 
>>> I'm not saying it will be an efficient implementation. The point
>>> is to discuss UAPI independent of implementation.
>>>
>>>> ###
>>>>
>>>> Also, let's agree on this intention. Based on the new ground rule, I
>>>> want to point out this example:
>>>>
>>>> If 'a' gets a program installed with no recurse flag set, ONLY processes
>>>> in 'a' have the 'a' program run. Processes in groups 'b', 'c' and 'd'
>>>> all stop at cgroup 'b' program.
>>>
>>> I'm proposing that such situation should not be allowed to happen.
>>> In a->b->c->d cgroup scenario if override+recurse prog attached to 'b'
>>> then only the same override+recurse can be attached to c, d, a.
>>> So at detach time there can be gaps (like only 'b' and 'd' have
>>> override+recurse progs), but walking up until root from any point
>>> will guarantee that only override+recurse programs are seen.
>>>
>>
>> That seems very limiting to me. Seems like you are suggesting the entire
>> cgroup tree is recursive or non-recursive, but never a mix.
> 
> Entire cgroup subtree. Yes. It's the simplest uapi I could think of.

So 10 email exchanges later you agree on the UAPI I implemented in this
patch: user opts in to recursive behavior via a new flag at attach time,
and once a recursive program is installed at some point in the cgroup
tree it applies to all descendant cgroups.

So all of these exchanges weren't about the UAPI, but your disagreement
in my implementation. The only user visible change here is only programs
marked recursive are run versus going back to the first cgroup marked
non-recursive.

> Easy to understand and argue about and I think it's solving your use case.
> It's also easily extendable. New combination and features won't break
> the users. It feels you're in rush to get this stuff for this merge
> window, therefore I want to agree on something that is simple,
> non-controversial and extensible.

I am in no-rush, but this does not to fall by the wayside like the net
namespace specification.

Given the pending release of 4.13 net-next will close which gives a 2+
week window to work on v3 before the next merge window. Plenty of time
for me to work it into the many other things on my plate.

^ permalink raw reply

* [net-next:master 427/429] drivers/net/ethernet/broadcom/genet/bcmgenet.h:687:10: note: in expansion of macro '__raw_writel'
From: kbuild test robot @ 2017-08-30  3:38 UTC (permalink / raw)
  To: Florian Fainelli; +Cc: kbuild-all, netdev

[-- Attachment #1: Type: text/plain, Size: 8749 bytes --]

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head:   eaa72dc47488d599439cd0fd0f8c4f1bcb3906bb
commit: 69d2ea9c798983c4a7157278ec84ff969d1cd8e8 [427/429] net: bcmgenet: Use correct I/O accessors
config: blackfin-allyesconfig (attached as .config)
compiler: bfin-uclinux-gcc (GCC) 6.2.0
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 69d2ea9c798983c4a7157278ec84ff969d1cd8e8
        # save the attached .config to linux build tree
        make.cross ARCH=blackfin 

All error/warnings (new ones prefixed by >>):

   In file included from arch/blackfin/mach-bf533/include/mach/blackfin.h:15:0,
                    from arch/blackfin/include/asm/irqflags.h:11,
                    from include/linux/irqflags.h:15,
                    from arch/blackfin/include/asm/bitops.h:33,
                    from include/linux/bitops.h:36,
                    from include/linux/kernel.h:10,
                    from drivers/net/ethernet/broadcom/genet/bcmgenet.c:13:
   drivers/net/ethernet/broadcom/genet/bcmgenet.h: In function 'bcmgenet_ext_writel':
>> arch/blackfin/include/asm/def_LPBlackfin.h:38:2: error: expected expression before '__asm__'
     __asm__ __volatile__( \
     ^
>> arch/blackfin/include/asm/def_LPBlackfin.h:51:33: note: in expansion of macro '_bfin_writeX'
    #define bfin_write32(addr, val) _bfin_writeX(addr, val, 32,  )
                                    ^~~~~~~~~~~~
>> arch/blackfin/include/asm/io.h:20:33: note: in expansion of macro 'bfin_write32'
    #define __raw_writel(val, addr) bfin_write32(addr, val)
                                    ^~~~~~~~~~~~
>> drivers/net/ethernet/broadcom/genet/bcmgenet.h:687:10: note: in expansion of macro '__raw_writel'
      return __raw_writel(val, priv->base + offset + off); \
             ^~~~~~~~~~~~
   drivers/net/ethernet/broadcom/genet/bcmgenet.h:692:1: note: in expansion of macro 'GENET_IO_MACRO'
    GENET_IO_MACRO(ext, GENET_EXT_OFF);
    ^~~~~~~~~~~~~~
>> arch/blackfin/include/asm/def_LPBlackfin.h:38:2: warning: 'return' with a value, in function returning void
     __asm__ __volatile__( \
     ^
>> arch/blackfin/include/asm/def_LPBlackfin.h:51:33: note: in expansion of macro '_bfin_writeX'
    #define bfin_write32(addr, val) _bfin_writeX(addr, val, 32,  )
                                    ^~~~~~~~~~~~
>> arch/blackfin/include/asm/io.h:20:33: note: in expansion of macro 'bfin_write32'
    #define __raw_writel(val, addr) bfin_write32(addr, val)
                                    ^~~~~~~~~~~~
>> drivers/net/ethernet/broadcom/genet/bcmgenet.h:687:10: note: in expansion of macro '__raw_writel'
      return __raw_writel(val, priv->base + offset + off); \
             ^~~~~~~~~~~~
   drivers/net/ethernet/broadcom/genet/bcmgenet.h:692:1: note: in expansion of macro 'GENET_IO_MACRO'
    GENET_IO_MACRO(ext, GENET_EXT_OFF);
    ^~~~~~~~~~~~~~
   In file included from drivers/net/ethernet/broadcom/genet/bcmgenet.c:49:0:
   drivers/net/ethernet/broadcom/genet/bcmgenet.h:683:20: note: declared here
    static inline void bcmgenet_##name##_writel(struct bcmgenet_priv *priv, \
                       ^
   drivers/net/ethernet/broadcom/genet/bcmgenet.h:692:1: note: in expansion of macro 'GENET_IO_MACRO'
    GENET_IO_MACRO(ext, GENET_EXT_OFF);
    ^~~~~~~~~~~~~~
   In file included from arch/blackfin/mach-bf533/include/mach/blackfin.h:15:0,
                    from arch/blackfin/include/asm/irqflags.h:11,
                    from include/linux/irqflags.h:15,
                    from arch/blackfin/include/asm/bitops.h:33,
                    from include/linux/bitops.h:36,
                    from include/linux/kernel.h:10,
                    from drivers/net/ethernet/broadcom/genet/bcmgenet.c:13:
   drivers/net/ethernet/broadcom/genet/bcmgenet.h: In function 'bcmgenet_umac_writel':
>> arch/blackfin/include/asm/def_LPBlackfin.h:38:2: error: expected expression before '__asm__'
     __asm__ __volatile__( \
     ^
>> arch/blackfin/include/asm/def_LPBlackfin.h:51:33: note: in expansion of macro '_bfin_writeX'
    #define bfin_write32(addr, val) _bfin_writeX(addr, val, 32,  )
                                    ^~~~~~~~~~~~
>> arch/blackfin/include/asm/io.h:20:33: note: in expansion of macro 'bfin_write32'
    #define __raw_writel(val, addr) bfin_write32(addr, val)
                                    ^~~~~~~~~~~~
>> drivers/net/ethernet/broadcom/genet/bcmgenet.h:687:10: note: in expansion of macro '__raw_writel'
      return __raw_writel(val, priv->base + offset + off); \
             ^~~~~~~~~~~~
   drivers/net/ethernet/broadcom/genet/bcmgenet.h:693:1: note: in expansion of macro 'GENET_IO_MACRO'
    GENET_IO_MACRO(umac, GENET_UMAC_OFF);
    ^~~~~~~~~~~~~~
>> arch/blackfin/include/asm/def_LPBlackfin.h:38:2: warning: 'return' with a value, in function returning void
     __asm__ __volatile__( \
     ^
>> arch/blackfin/include/asm/def_LPBlackfin.h:51:33: note: in expansion of macro '_bfin_writeX'
    #define bfin_write32(addr, val) _bfin_writeX(addr, val, 32,  )
                                    ^~~~~~~~~~~~
>> arch/blackfin/include/asm/io.h:20:33: note: in expansion of macro 'bfin_write32'
    #define __raw_writel(val, addr) bfin_write32(addr, val)
                                    ^~~~~~~~~~~~
>> drivers/net/ethernet/broadcom/genet/bcmgenet.h:687:10: note: in expansion of macro '__raw_writel'
      return __raw_writel(val, priv->base + offset + off); \
             ^~~~~~~~~~~~
   drivers/net/ethernet/broadcom/genet/bcmgenet.h:693:1: note: in expansion of macro 'GENET_IO_MACRO'
    GENET_IO_MACRO(umac, GENET_UMAC_OFF);
    ^~~~~~~~~~~~~~
   In file included from drivers/net/ethernet/broadcom/genet/bcmgenet.c:49:0:
   drivers/net/ethernet/broadcom/genet/bcmgenet.h:683:20: note: declared here
    static inline void bcmgenet_##name##_writel(struct bcmgenet_priv *priv, \
                       ^
   drivers/net/ethernet/broadcom/genet/bcmgenet.h:693:1: note: in expansion of macro 'GENET_IO_MACRO'
    GENET_IO_MACRO(umac, GENET_UMAC_OFF);
    ^~~~~~~~~~~~~~
   In file included from arch/blackfin/mach-bf533/include/mach/blackfin.h:15:0,
                    from arch/blackfin/include/asm/irqflags.h:11,
                    from include/linux/irqflags.h:15,
                    from arch/blackfin/include/asm/bitops.h:33,
                    from include/linux/bitops.h:36,
                    from include/linux/kernel.h:10,
                    from drivers/net/ethernet/broadcom/genet/bcmgenet.c:13:
   drivers/net/ethernet/broadcom/genet/bcmgenet.h: In function 'bcmgenet_sys_writel':
>> arch/blackfin/include/asm/def_LPBlackfin.h:38:2: error: expected expression before '__asm__'
     __asm__ __volatile__( \
     ^
>> arch/blackfin/include/asm/def_LPBlackfin.h:51:33: note: in expansion of macro '_bfin_writeX'
    #define bfin_write32(addr, val) _bfin_writeX(addr, val, 32,  )
                                    ^~~~~~~~~~~~
>> arch/blackfin/include/asm/io.h:20:33: note: in expansion of macro 'bfin_write32'
    #define __raw_writel(val, addr) bfin_write32(addr, val)
                                    ^~~~~~~~~~~~
>> drivers/net/ethernet/broadcom/genet/bcmgenet.h:687:10: note: in expansion of macro '__raw_writel'
      return __raw_writel(val, priv->base + offset + off); \
             ^~~~~~~~~~~~
   drivers/net/ethernet/broadcom/genet/bcmgenet.h:694:1: note: in expansion of macro 'GENET_IO_MACRO'
    GENET_IO_MACRO(sys, GENET_SYS_OFF);
    ^~~~~~~~~~~~~~

vim +/__raw_writel +687 drivers/net/ethernet/broadcom/genet/bcmgenet.h

   670	
   671	#define GENET_IO_MACRO(name, offset)					\
   672	static inline u32 bcmgenet_##name##_readl(struct bcmgenet_priv *priv,	\
   673						u32 off)			\
   674	{									\
   675		/* MIPS chips strapped for BE will automagically configure the	\
   676		 * peripheral registers for CPU-native byte order.		\
   677		 */								\
   678		if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) \
   679			return __raw_readl(priv->base + offset + off);		\
   680		else								\
   681			return readl_relaxed(priv->base + offset + off);	\
   682	}									\
   683	static inline void bcmgenet_##name##_writel(struct bcmgenet_priv *priv,	\
   684						u32 val, u32 off)		\
   685	{									\
   686		if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) \
 > 687			return __raw_writel(val, priv->base + offset + off);	\
   688		else								\
   689			writel_relaxed(val, priv->base + offset + off);		\
   690	}
   691	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 45711 bytes --]

^ permalink raw reply

* Re: multi-queue over IFF_NO_QUEUE "virtual" devices
From: Florian Fainelli @ 2017-08-30  3:49 UTC (permalink / raw)
  To: netdev, jiri, jhs, xiyou.wangcong, andrew; +Cc: davem, vivien.didelot
In-Reply-To: <21cdcfc2-0efc-c0df-6542-05bb8078d866@gmail.com>

Le 08/07/17 à 15:26, Florian Fainelli a écrit :
> Hi,
> 
> Most DSA supported Broadcom switches have multiple queues per ports
> (usually 8) and each of these queues can be configured with different
> pause, drop, hysteresis thresholds and so on in order to make use of the
> switch's internal buffering scheme and have some queues achieve some
> kind of lossless behavior (e.g: LAN to LAN traffic for Q7 has a higher
> priority than LAN to WAN for Q0).
> 
> This is obviously very workload specific, so I'd want maximum
> programmability as much as possible.
> 
> This brings me to a few questions:
> 
> 1) If we have the DSA slave network devices currently flagged with
> IFF_NO_QUEUE becoming multi-queue (on TX) aware such that an application
> can control exactly which switch egress queue is used on a per-flow
> basis, would that be a problem (this is the dynamic selection of the TX
> queue)?

So I have this part figured out, with a bunch of changes network devices
created by DSA are now multiqueue aware and the Broadcom tag layer is
capable of extracting the queue index, passing it in the tag where
expected and having the switch forward to the appropriate switch port
and queue within that port. It also sets the queue mapping in the SKB
for later consumption by the master network device driver: bcmsysport.c
because of 2).

> 
> 2) The conduit interface (CPU) port network interface has a congestion
> control scheme which requires each of its TX queues (32 or 16) to be
> statically mapped to each of the underlying switch port queues because
> the congestion/ HW needs to inspect the queue depths of the switch to
> accept/reject a packet at the CPU's TX ring level. Do we have a good way
> with tc to map a virtual/stacked device's queue(s) on-top of its
> physical/underlying device's queues (this is the static queue mapping
> necessary for congestion to work)?

That part I have not figured out yet, with some static mapping I can
obtain the results that I want and was even considering the possibility
of doing something like this:

- register a network device notifier with bcmsysport.c (master network
device) for this setup
- expose a helper function allowing me to obtain a given DSA network
device port index
- whenever DSA creates network devices reconfigure the ring and queue
mapping of the TX queues managed by bcmsysport.c with the DSA network
device port index that has just been registered and just do a 1-1
mapping of the 8 queues

You would end-up with something like:

gphy (port 0) queues 0-7 mapped to systemport queues 0-7
rgmii_1 (port 1) queues 0-7 mapped to systemport queues 8-15
rgmii_2 (port 2) queues 0-7 mapped to systemport queues 16 through 23
moca (port 7) queues 0-7 mapped to systemport queues 24-31

This should be working because bcmsysport's TX queues are not under
direct control by the user, they are used via DSA created network
devices which indicate the queue they want to use. When the DSA
interfaces are brought down, their respective systemport queues now
become unused. This also works because the number of physical ports of
the switch times the number of queues is matching the number of TX
queues from systemport (like if someone designed it with that exact
purpose in mind ;)).

The only problem with that approach of course is that it embeds a policy
within the systemport driver.

Ideally I would really like to configure this via tc by setting up a
mapping between queues of one network devices to queues of another
network device, is that a possible thing, Jamal, Cong, Jiri, do you know?
-- 
Florian

^ permalink raw reply

* Re: [PATCH v2 net-next 1/8] bpf: Add support for recursively running cgroup sock filters
From: Alexei Starovoitov @ 2017-08-30  4:11 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, daniel, ast, tj, davem, luto
In-Reply-To: <79498e9c-2b22-d4d7-4b51-8a08e62ba237@gmail.com>

On Tue, Aug 29, 2017 at 09:38:16PM -0600, David Ahern wrote:
> On 8/29/17 8:58 PM, Alexei Starovoitov wrote:
> > On Tue, Aug 29, 2017 at 07:03:43PM -0600, David Ahern wrote:
> >> On 8/28/17 10:11 PM, Alexei Starovoitov wrote:
> >>>
> >>> Agree on the above, but you're mixing semantics of the new recurse
> >>> flag and implementation of it. Ex: we don't have to copy this flag
> >>> from prog->attr into cgroup. So this reset or non-reset discussion
> >>> only makes sense in the context of your current implementation.
> >>> We can implement the logic differently. Like don't copy that flag
> >>> at all and at attach time walk parent->parent->parent and see
> >>> what programs are attached. All of them should have prog->attr & recurse_bit set
> >>> In such implementation detach from 'b' is a nop from reset/non-reset
> >>> point of view. When socket creation in 'c' is invoked the program
> >>> 'c' is called first then the code keeps walking parents until root
> >>> invoking 'a' along the way.
> >>
> >> So you are suggesting there is no recursive flag per cgroup? How do you
> >> know you need to walk cgroups? How do you know when to stop running
> >> programs?
> > 
> > you're talking about implementation, right?
> > My 'proposed' implemenation of walking from cgroup all the way to the root
> > is just an example. It's not efficient. More below...
> > 
> >>> I'm not saying it will be an efficient implementation. The point
> >>> is to discuss UAPI independent of implementation.
> >>>
> >>>> ###
> >>>>
> >>>> Also, let's agree on this intention. Based on the new ground rule, I
> >>>> want to point out this example:
> >>>>
> >>>> If 'a' gets a program installed with no recurse flag set, ONLY processes
> >>>> in 'a' have the 'a' program run. Processes in groups 'b', 'c' and 'd'
> >>>> all stop at cgroup 'b' program.
> >>>
> >>> I'm proposing that such situation should not be allowed to happen.
> >>> In a->b->c->d cgroup scenario if override+recurse prog attached to 'b'
> >>> then only the same override+recurse can be attached to c, d, a.
> >>> So at detach time there can be gaps (like only 'b' and 'd' have
> >>> override+recurse progs), but walking up until root from any point
> >>> will guarantee that only override+recurse programs are seen.
> >>>
> >>
> >> That seems very limiting to me. Seems like you are suggesting the entire
> >> cgroup tree is recursive or non-recursive, but never a mix.
> > 
> > Entire cgroup subtree. Yes. It's the simplest uapi I could think of.
> 
> So 10 email exchanges later you agree on the UAPI I implemented in this
> patch: user opts in to recursive behavior via a new flag at attach time,
> and once a recursive program is installed at some point in the cgroup
> tree it applies to all descendant cgroups.
> 
> So all of these exchanges weren't about the UAPI, but your disagreement
> in my implementation. The only user visible change here is only programs
> marked recursive are run versus going back to the first cgroup marked
> non-recursive.

you cannot be serious. Your implemention is not at all what i'm proposing
above as a simplest uapi. Should we all go back and re-read from the beginning?

> > Easy to understand and argue about and I think it's solving your use case.
> > It's also easily extendable. New combination and features won't break
> > the users. It feels you're in rush to get this stuff for this merge
> > window, therefore I want to agree on something that is simple,
> > non-controversial and extensible.
> 
> I am in no-rush, but this does not to fall by the wayside like the net
> namespace specification.

It's more than that! I think you only looking at it from vrf perspective
whereas cgroup-bpf became a corner stone feature and enabler for tcp-bpf
which in turn became a stepping stone for bpf_sk_redirect.
So no, I'm not going to let something half baked that touches
the core idea of cgroup-bpf slide in.
Tejun and Andy also need to take a look, so yes it will take long
time for everyone to agree on this core uapi.

> Given the pending release of 4.13 net-next will close which gives a 2+
> week window to work on v3 before the next merge window. Plenty of time
> for me to work it into the many other things on my plate.

As I proposed several emails ago, please repost patches 2 and 3 that
I already acked and we can continue discussing this patch without
the agitation.

^ permalink raw reply

* Re: [PATCH net-next v3 0/3] NCSI VLAN Filtering Support
From: Samuel Mendoza-Jonas @ 2017-08-30  4:37 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel, openbmc, joel, benh, gwshan, ratagupt
In-Reply-To: <20170828.165001.1442421762159241248.davem@davemloft.net>

On Mon, 2017-08-28 at 16:50 -0700, David Miller wrote:
> From: Samuel Mendoza-Jonas <sam@mendozajonas.com>
> Date: Mon, 28 Aug 2017 16:18:40 +1000
> 
> > This series (mainly patch 2) adds VLAN filtering to the NCSI implementation.
> > A fair amount of code already exists in the NCSI stack for VLAN filtering but
> > none of it is actually hooked up. This goes the final mile and fixes a few
> > bugs in the existing code found along the way (patch 1).
> > 
> > Patch 3 adds the appropriate flag and callbacks to the ftgmac100 driver to
> > enable filtering as it's a large consumer of NCSI (and what I've been
> > testing on).
> > 
> > v3:   - Add comment describing change to ncsi_find_filter()
> >       - Catch NULL in clear_one_vid() from ncsi_get_filter()
> >       - Simplify state changes when kicking updated channel
> 
> Series applied.

Thanks David,

The kbuild bot caught a build error where the add/kill callbacks aren't
defined without CONFIG_NET_NCSI:

>> ERROR: "ncsi_vlan_rx_kill_vid" [drivers/net/ethernet/faraday/ftgmac100.ko] undefined!
>> ERROR: "ncsi_vlan_rx_add_vid" [drivers/net/ethernet/faraday/ftgmac100.ko] undefined!

It's a quick fixup to patch 3 as below, would you like me to send it as a v4?


diff --git a/include/net/ncsi.h b/include/net/ncsi.h
index 1f96af46df49..2b13b6b91a4d 100644
--- a/include/net/ncsi.h
+++ b/include/net/ncsi.h
@@ -36,6 +36,14 @@ int ncsi_start_dev(struct ncsi_dev *nd);
 void ncsi_stop_dev(struct ncsi_dev *nd);
 void ncsi_unregister_dev(struct ncsi_dev *nd);
 #else /* !CONFIG_NET_NCSI */
+int ncsi_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid)
+{
+       return -ENOTTY;
+}
+int ncsi_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, u16 vid)
+{
+       return -ENOTTY;
+}
 static inline struct ncsi_dev *ncsi_register_dev(struct net_device *dev,
                                        void (*notifier)(struct ncsi_dev *nd))
 {

^ permalink raw reply related

* [PATCH net-next 0/3 v11] Add support for rmnet driver
From: Subash Abhinov Kasiviswanathan @ 2017-08-30  4:44 UTC (permalink / raw)
  To: netdev, davem, fengguang.wu, dcbw, jiri, stephen, David.Laight,
	marcel, andrew
  Cc: Subash Abhinov Kasiviswanathan

This patch series adds support for the rmnet driver which is required to
support recent chipsets using Qualcomm Technologies, Inc. modems. The data
from hardware follows the multiplexing and aggregation protocol (MAP).

This driver can be used to register onto any physical network device in
IP mode. Physical transports include USB, HSIC, PCIe and IP accelerator.

rmnet driver helps to decode these packets and queue them to network
stack (and encode and transmit it to the physical device).

v1: Same as the RFC patch with some minor fixes for issues reported by
kbuild test robot.

v1->v2: Change datatypes and remove config IOCTL as mentioned by David.
Also fix checkpatch issues and remove some unused code.

v2->v3: Move location to drivers/net and rename to rmnet. Change the
userspace - netlink communication from custom netlink to rtnl_link_ops.
Refactor some code. Use a fixed config for ingress and egress.

v3->v4: Move location to drivers/net/ethernet/qualcomm/.
Fix comments from Stephen and Jiri -
Split the ether and arp type changes into seperate patches.
Remove debug and custom logging and switch to standard netdevice log.
Remove module parameters. Refactor and change some code style issues.

v4->v5: Rename some structs and variables. Move the initializer
before the for loop start. Put the arp type in correct sequence.

v5->v6: Fix comments from Dan -
Use the upper link API. As a result, remove all the refcounting logic.
Device refcount is explicitly held on real_dev on rx_handler
registration only. Modifiy the flow control struct. Remove the unused
ethernet mode handling.

v6->v7: Fix comments from David - Add newline to end of Makefile. Remove
inline from .c files. Move the module init/exit to rmnet config. Fix an
error reported by kbuild test robot for an unused file.

v7->v8: Use a smaller value for ETH_P_MAP as mentioned by David. Change
netdev_info to netdev_dbg as mentioned by Andew. Fix comments from
Stephen regarding netdev_priv and sparse related errors of using 0 as NULL

v8->v9: Fix comments from David - Remove the CFLAG rule. Change the way
rmnet devices are freed. Instead of using a workqueue to unregister devices
individually, go through the list and free all devices within the rtnl_lock().

v9->v10: Actually fix the locking as mentioned by David. The locking scheme is
mentioned in a comment in rmnet_config.c. Change comment near MAP type
definition as mentioned by Dan. Refactor some code.

v10->v11: Allow RMNET to compile as a module as mentioned by David

Subash Abhinov Kasiviswanathan (3):
  net: ether: Add support for multiplexing and aggregation type
  net: arp: Add support for raw IP device
  drivers: net: ethernet: qualcomm: rmnet: Initial implementation

 Documentation/networking/rmnet.txt                 |  82 ++++
 drivers/net/ethernet/qualcomm/Kconfig              |   2 +
 drivers/net/ethernet/qualcomm/Makefile             |   2 +
 drivers/net/ethernet/qualcomm/rmnet/Kconfig        |  12 +
 drivers/net/ethernet/qualcomm/rmnet/Makefile       |  10 +
 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 419 +++++++++++++++++++++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h |  56 +++
 .../net/ethernet/qualcomm/rmnet/rmnet_handlers.c   | 271 +++++++++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_handlers.h   |  26 ++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h    |  88 +++++
 .../ethernet/qualcomm/rmnet/rmnet_map_command.c    | 107 ++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_map_data.c   | 105 ++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_private.h    |  45 +++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c    | 170 +++++++++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h    |  29 ++
 include/uapi/linux/if_arp.h                        |   1 +
 include/uapi/linux/if_ether.h                      |   3 +
 17 files changed, 1428 insertions(+)
 create mode 100644 Documentation/networking/rmnet.txt
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Kconfig
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Makefile
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h

-- 
1.9.1

^ permalink raw reply

* [PATCH net-next 1/3 v11] net: ether: Add support for multiplexing and aggregation type
From: Subash Abhinov Kasiviswanathan @ 2017-08-30  4:44 UTC (permalink / raw)
  To: netdev, davem, fengguang.wu, dcbw, jiri, stephen, David.Laight,
	marcel, andrew
  Cc: Subash Abhinov Kasiviswanathan
In-Reply-To: <1504068258-16982-1-git-send-email-subashab@codeaurora.org>

Define the Qualcomm multiplexing and aggregation (MAP) ether type 0x00F9.
This is needed for receiving data in the MAP protocol like RMNET. This is
not an officially registered ID.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
 include/uapi/linux/if_ether.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/uapi/linux/if_ether.h b/include/uapi/linux/if_ether.h
index 5bc9bfd..30526db 100644
--- a/include/uapi/linux/if_ether.h
+++ b/include/uapi/linux/if_ether.h
@@ -137,6 +137,9 @@
 #define ETH_P_IEEE802154 0x00F6		/* IEEE802.15.4 frame		*/
 #define ETH_P_CAIF	0x00F7		/* ST-Ericsson CAIF protocol	*/
 #define ETH_P_XDSA	0x00F8		/* Multiplexed DSA protocol	*/
+#define ETH_P_MAP	0x00F9		/* Qualcomm multiplexing and
+					 * aggregation protocol
+					 */
 
 /*
  *	This is an Ethernet frame header.
-- 
1.9.1

^ permalink raw reply related

* [PATCH net-next 2/3 v11] net: arp: Add support for raw IP device
From: Subash Abhinov Kasiviswanathan @ 2017-08-30  4:44 UTC (permalink / raw)
  To: netdev, davem, fengguang.wu, dcbw, jiri, stephen, David.Laight,
	marcel, andrew
  Cc: Subash Abhinov Kasiviswanathan
In-Reply-To: <1504068258-16982-1-git-send-email-subashab@codeaurora.org>

Define the raw IP type. This is needed for raw IP net devices
like rmnet.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
 include/uapi/linux/if_arp.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/if_arp.h b/include/uapi/linux/if_arp.h
index cf73510..a2a6356 100644
--- a/include/uapi/linux/if_arp.h
+++ b/include/uapi/linux/if_arp.h
@@ -59,6 +59,7 @@
 #define ARPHRD_LAPB	516		/* LAPB				*/
 #define ARPHRD_DDCMP    517		/* Digital's DDCMP protocol     */
 #define ARPHRD_RAWHDLC	518		/* Raw HDLC			*/
+#define ARPHRD_RAWIP    519		/* Raw IP                       */
 
 #define ARPHRD_TUNNEL	768		/* IPIP tunnel			*/
 #define ARPHRD_TUNNEL6	769		/* IP6IP6 tunnel       		*/
-- 
1.9.1

^ permalink raw reply related

* [PATCH net-next 3/3 v11] drivers: net: ethernet: qualcomm: rmnet: Initial implementation
From: Subash Abhinov Kasiviswanathan @ 2017-08-30  4:44 UTC (permalink / raw)
  To: netdev, davem, fengguang.wu, dcbw, jiri, stephen, David.Laight,
	marcel, andrew
  Cc: Subash Abhinov Kasiviswanathan
In-Reply-To: <1504068258-16982-1-git-send-email-subashab@codeaurora.org>

RmNet driver provides a transport agnostic MAP (multiplexing and
aggregation protocol) support in embedded module. Module provides
virtual network devices which can be attached to any IP-mode
physical device. This will be used to provide all MAP functionality
on future hardware in a single consistent location.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
 Documentation/networking/rmnet.txt                 |  82 ++++
 drivers/net/ethernet/qualcomm/Kconfig              |   2 +
 drivers/net/ethernet/qualcomm/Makefile             |   2 +
 drivers/net/ethernet/qualcomm/rmnet/Kconfig        |  12 +
 drivers/net/ethernet/qualcomm/rmnet/Makefile       |  10 +
 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 419 +++++++++++++++++++++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h |  56 +++
 .../net/ethernet/qualcomm/rmnet/rmnet_handlers.c   | 271 +++++++++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_handlers.h   |  26 ++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h    |  88 +++++
 .../ethernet/qualcomm/rmnet/rmnet_map_command.c    | 107 ++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_map_data.c   | 105 ++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_private.h    |  45 +++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c    | 170 +++++++++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h    |  29 ++
 15 files changed, 1424 insertions(+)
 create mode 100644 Documentation/networking/rmnet.txt
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Kconfig
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Makefile
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h

diff --git a/Documentation/networking/rmnet.txt b/Documentation/networking/rmnet.txt
new file mode 100644
index 0000000..6b341ea
--- /dev/null
+++ b/Documentation/networking/rmnet.txt
@@ -0,0 +1,82 @@
+1. Introduction
+
+rmnet driver is used for supporting the Multiplexing and aggregation
+Protocol (MAP). This protocol is used by all recent chipsets using Qualcomm
+Technologies, Inc. modems.
+
+This driver can be used to register onto any physical network device in
+IP mode. Physical transports include USB, HSIC, PCIe and IP accelerator.
+
+Multiplexing allows for creation of logical netdevices (rmnet devices) to
+handle multiple private data networks (PDN) like a default internet, tethering,
+multimedia messaging service (MMS) or IP media subsystem (IMS). Hardware sends
+packets with MAP headers to rmnet. Based on the multiplexer id, rmnet
+routes to the appropriate PDN after removing the MAP header.
+
+Aggregation is required to achieve high data rates. This involves hardware
+sending aggregated bunch of MAP frames. rmnet driver will de-aggregate
+these MAP frames and send them to appropriate PDN's.
+
+2. Packet format
+
+a. MAP packet (data / control)
+
+MAP header has the same endianness of the IP packet.
+
+Packet format -
+
+Bit             0             1           2-7      8 - 15           16 - 31
+Function   Command / Data   Reserved     Pad   Multiplexer ID    Payload length
+Bit            32 - x
+Function     Raw  Bytes
+
+Command (1)/ Data (0) bit value is to indicate if the packet is a MAP command
+or data packet. Control packet is used for transport level flow control. Data
+packets are standard IP packets.
+
+Reserved bits are usually zeroed out and to be ignored by receiver.
+
+Padding is number of bytes to be added for 4 byte alignment if required by
+hardware.
+
+Multiplexer ID is to indicate the PDN on which data has to be sent.
+
+Payload length includes the padding length but does not include MAP header
+length.
+
+b. MAP packet (command specific)
+
+Bit             0             1           2-7      8 - 15           16 - 31
+Function   Command         Reserved     Pad   Multiplexer ID    Payload length
+Bit          32 - 39        40 - 45    46 - 47       48 - 63
+Function   Command name    Reserved   Command Type   Reserved
+Bit          64 - 95
+Function   Transaction ID
+Bit          96 - 127
+Function   Command data
+
+Command 1 indicates disabling flow while 2 is enabling flow
+
+Command types -
+0 for MAP command request
+1 is to acknowledge the receipt of a command
+2 is for unsupported commands
+3 is for error during processing of commands
+
+c. Aggregation
+
+Aggregation is multiple MAP packets (can be data or command) delivered to
+rmnet in a single linear skb. rmnet will process the individual
+packets and either ACK the MAP command or deliver the IP packet to the
+network stack as needed
+
+MAP header|IP Packet|Optional padding|MAP header|IP Packet|Optional padding....
+MAP header|IP Packet|Optional padding|MAP header|Command Packet|Optional pad...
+
+3. Userspace configuration
+
+rmnet userspace configuration is done through netlink library librmnetctl
+and command line utility rmnetcli. Utility is hosted in codeaurora forum git.
+The driver uses rtnl_link_ops for communication.
+
+https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/dataservices/tree/rmnetctl
diff --git a/drivers/net/ethernet/qualcomm/Kconfig b/drivers/net/ethernet/qualcomm/Kconfig
index 877675a..f520071 100644
--- a/drivers/net/ethernet/qualcomm/Kconfig
+++ b/drivers/net/ethernet/qualcomm/Kconfig
@@ -59,4 +59,6 @@ config QCOM_EMAC
 	  low power, Receive-Side Scaling (RSS), and IEEE 1588-2008
 	  Precision Clock Synchronization Protocol.
 
+source "drivers/net/ethernet/qualcomm/rmnet/Kconfig"
+
 endif # NET_VENDOR_QUALCOMM
diff --git a/drivers/net/ethernet/qualcomm/Makefile b/drivers/net/ethernet/qualcomm/Makefile
index 92fa7c4..1847350 100644
--- a/drivers/net/ethernet/qualcomm/Makefile
+++ b/drivers/net/ethernet/qualcomm/Makefile
@@ -9,3 +9,5 @@ obj-$(CONFIG_QCA7000_UART) += qcauart.o
 qcauart-objs := qca_uart.o
 
 obj-y += emac/
+
+obj-$(CONFIG_RMNET) += rmnet/
diff --git a/drivers/net/ethernet/qualcomm/rmnet/Kconfig b/drivers/net/ethernet/qualcomm/rmnet/Kconfig
new file mode 100644
index 0000000..6e2587a
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/Kconfig
@@ -0,0 +1,12 @@
+#
+# RMNET MAP driver
+#
+
+menuconfig RMNET
+	tristate "RmNet MAP driver"
+	default n
+	---help---
+	  If you select this, you will enable the RMNET module which is used
+	  for handling data in the multiplexing and aggregation protocol (MAP)
+	  format in the embedded data path. RMNET devices can be attached to
+	  any IP mode physical device.
diff --git a/drivers/net/ethernet/qualcomm/rmnet/Makefile b/drivers/net/ethernet/qualcomm/rmnet/Makefile
new file mode 100644
index 0000000..01bddf2
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/Makefile
@@ -0,0 +1,10 @@
+#
+# Makefile for the RMNET module
+#
+
+rmnet-y		 := rmnet_config.o
+rmnet-y		 += rmnet_vnd.o
+rmnet-y		 += rmnet_handlers.o
+rmnet-y		 += rmnet_map_data.o
+rmnet-y		 += rmnet_map_command.o
+obj-$(CONFIG_RMNET) += rmnet.o
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
new file mode 100644
index 0000000..e836d26
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
@@ -0,0 +1,419 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET configuration engine
+ *
+ */
+
+#include <net/sock.h>
+#include <linux/module.h>
+#include <linux/netlink.h>
+#include <linux/netdevice.h>
+#include "rmnet_config.h"
+#include "rmnet_handlers.h"
+#include "rmnet_vnd.h"
+#include "rmnet_private.h"
+
+/* Locking scheme -
+ * The shared resource which needs to be protected is realdev->rx_handler_data.
+ * For the writer path, this is using rtnl_lock(). The writer paths are
+ * rmnet_newlink(), rmnet_dellink() and rmnet_force_unassociate_device(). These
+ * paths are already called with rtnl_lock() acquired in. There is also an
+ * ASSERT_RTNL() to ensure that we are calling with rtnl acquired. For
+ * dereference here, we will need to use rtnl_dereference(). Dev list writing
+ * needs to happen with rtnl_lock() acquired for netdev_master_upper_dev_link().
+ * For the reader path, the real_dev->rx_handler_data is called in the TX / RX
+ * path. We only need rcu_read_lock() for these scenarios. In these cases,
+ * the rcu_read_lock() is held in __dev_queue_xmit() and
+ * netif_receive_skb_internal(), so readers need to use rcu_dereference_rtnl()
+ * to get the relevant information. For dev list reading, we again acquire
+ * rcu_read_lock() in rmnet_dellink() for netdev_master_upper_dev_get_rcu().
+ * We also use unregister_netdevice_many() to free all rmnet devices in
+ * rmnet_force_unassociate_device() so we dont lose the rtnl_lock() and free in
+ * same context.
+ */
+
+/* Local Definitions and Declarations */
+#define RMNET_LOCAL_LOGICAL_ENDPOINT -1
+
+struct rmnet_walk_data {
+	struct net_device *real_dev;
+	struct list_head *head;
+	struct rmnet_real_dev_info *real_dev_info;
+};
+
+static int rmnet_is_real_dev_registered(const struct net_device *real_dev)
+{
+	rx_handler_func_t *rx_handler;
+
+	rx_handler = rcu_dereference(real_dev->rx_handler);
+	return (rx_handler == rmnet_rx_handler);
+}
+
+/* Needs either rcu_read_lock() or rtnl lock */
+static struct rmnet_real_dev_info*
+__rmnet_get_real_dev_info(const struct net_device *real_dev)
+{
+	if (rmnet_is_real_dev_registered(real_dev))
+		return rcu_dereference_rtnl(real_dev->rx_handler_data);
+	else
+		return NULL;
+}
+
+/* Needs rtnl lock */
+static struct rmnet_real_dev_info*
+rmnet_get_real_dev_info_rtnl(const struct net_device *real_dev)
+{
+	return rtnl_dereference(real_dev->rx_handler_data);
+}
+
+static struct rmnet_endpoint*
+rmnet_get_endpoint(struct net_device *dev, int config_id)
+{
+	struct rmnet_real_dev_info *r;
+	struct rmnet_endpoint *ep;
+
+	if (!rmnet_is_real_dev_registered(dev)) {
+		ep = rmnet_vnd_get_endpoint(dev);
+	} else {
+		r = __rmnet_get_real_dev_info(dev);
+
+		if (!r)
+			return NULL;
+
+		if (config_id == RMNET_LOCAL_LOGICAL_ENDPOINT)
+			ep = &r->local_ep;
+		else
+			ep = &r->muxed_ep[config_id];
+	}
+
+	return ep;
+}
+
+static int rmnet_unregister_real_device(struct net_device *real_dev,
+					struct rmnet_real_dev_info *r)
+{
+	if (r->nr_rmnet_devs)
+		return -EINVAL;
+
+	kfree(r);
+
+	netdev_rx_handler_unregister(real_dev);
+
+	/* release reference on real_dev */
+	dev_put(real_dev);
+
+	netdev_dbg(real_dev, "Removed from rmnet\n");
+	return 0;
+}
+
+static int rmnet_register_real_device(struct net_device *real_dev)
+{
+	struct rmnet_real_dev_info *r;
+	int rc;
+
+	ASSERT_RTNL();
+
+	if (rmnet_is_real_dev_registered(real_dev))
+		return 0;
+
+	r = kzalloc(sizeof(*r), GFP_ATOMIC);
+	if (!r)
+		return -ENOMEM;
+
+	r->dev = real_dev;
+	rc = netdev_rx_handler_register(real_dev, rmnet_rx_handler, r);
+	if (rc) {
+		kfree(r);
+		return -EBUSY;
+	}
+
+	/* hold on to real dev for MAP data */
+	dev_hold(real_dev);
+
+	netdev_dbg(real_dev, "registered with rmnet\n");
+	return 0;
+}
+
+static int rmnet_set_ingress_data_format(struct net_device *dev, u32 idf)
+{
+	struct rmnet_real_dev_info *r;
+
+	netdev_dbg(dev, "Ingress format 0x%08X\n", idf);
+
+	r = __rmnet_get_real_dev_info(dev);
+
+	r->ingress_data_format = idf;
+
+	return 0;
+}
+
+static int rmnet_set_egress_data_format(struct net_device *dev, u32 edf,
+					u16 agg_size, u16 agg_count)
+{
+	struct rmnet_real_dev_info *r;
+
+	netdev_dbg(dev, "Egress format 0x%08X agg size %d cnt %d\n",
+		   edf, agg_size, agg_count);
+
+	r = __rmnet_get_real_dev_info(dev);
+
+	r->egress_data_format = edf;
+
+	return 0;
+}
+
+static int __rmnet_set_endpoint_config(struct net_device *dev, int config_id,
+				       struct rmnet_endpoint *ep)
+{
+	struct rmnet_endpoint *dev_ep;
+
+	dev_ep = rmnet_get_endpoint(dev, config_id);
+
+	if (!dev_ep)
+		return -EINVAL;
+
+	memcpy(dev_ep, ep, sizeof(struct rmnet_endpoint));
+	if (config_id == RMNET_LOCAL_LOGICAL_ENDPOINT)
+		dev_ep->mux_id = 0;
+	else
+		dev_ep->mux_id = config_id;
+
+	return 0;
+}
+
+static int rmnet_set_endpoint_config(struct net_device *dev,
+				     int config_id, u8 rmnet_mode,
+				     struct net_device *egress_dev)
+{
+	struct rmnet_endpoint ep;
+
+	netdev_dbg(dev, "id %d mode %d dev %s\n",
+		   config_id, rmnet_mode, egress_dev->name);
+
+	if (config_id < RMNET_LOCAL_LOGICAL_ENDPOINT ||
+	    config_id >= RMNET_MAX_LOGICAL_EP)
+		return -EINVAL;
+
+	/* This config is cleared on every set, so its ok to not
+	 * clear it on a device delete.
+	 */
+	memset(&ep, 0, sizeof(struct rmnet_endpoint));
+	ep.rmnet_mode = rmnet_mode;
+	ep.egress_dev = egress_dev;
+
+	return __rmnet_set_endpoint_config(dev, config_id, &ep);
+}
+
+static int rmnet_newlink(struct net *src_net, struct net_device *dev,
+			 struct nlattr *tb[], struct nlattr *data[],
+			 struct netlink_ext_ack *extack)
+{
+	int ingress_format = RMNET_INGRESS_FORMAT_DEMUXING |
+			     RMNET_INGRESS_FORMAT_DEAGGREGATION |
+			     RMNET_INGRESS_FORMAT_MAP;
+	int egress_format = RMNET_EGRESS_FORMAT_MUXING |
+			    RMNET_EGRESS_FORMAT_MAP;
+	struct rmnet_real_dev_info *r;
+	struct net_device *real_dev;
+	int mode = RMNET_EPMODE_VND;
+	int err = 0;
+	u16 mux_id;
+
+	real_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK]));
+	if (!real_dev || !dev)
+		return -ENODEV;
+
+	if (!data[IFLA_VLAN_ID])
+		return -EINVAL;
+
+	mux_id = nla_get_u16(data[IFLA_VLAN_ID]);
+
+	err = rmnet_register_real_device(real_dev);
+	if (err)
+		goto err0;
+
+	r = rmnet_get_real_dev_info_rtnl(real_dev);
+	err = rmnet_vnd_newlink(mux_id, dev, r);
+	if (err)
+		goto err1;
+
+	err = netdev_master_upper_dev_link(dev, real_dev, NULL, NULL);
+	if (err)
+		goto err2;
+
+	rmnet_vnd_set_mux(dev, mux_id);
+	rmnet_set_egress_data_format(real_dev, egress_format, 0, 0);
+	rmnet_set_ingress_data_format(real_dev, ingress_format);
+	rmnet_set_endpoint_config(real_dev, mux_id, mode, dev);
+	rmnet_set_endpoint_config(dev, mux_id, mode, real_dev);
+	return 0;
+
+err2:
+	rmnet_vnd_dellink(mux_id, r);
+err1:
+	rmnet_unregister_real_device(real_dev, r);
+err0:
+	return err;
+}
+
+static void rmnet_dellink(struct net_device *dev, struct list_head *head)
+{
+	struct rmnet_real_dev_info *r;
+	struct net_device *real_dev;
+	u8 mux_id;
+
+	rcu_read_lock();
+	real_dev = netdev_master_upper_dev_get_rcu(dev);
+	rcu_read_unlock();
+
+	if (!real_dev || !rmnet_is_real_dev_registered(real_dev))
+		return;
+
+	r = rmnet_get_real_dev_info_rtnl(real_dev);
+
+	mux_id = rmnet_vnd_get_mux(dev);
+	rmnet_vnd_dellink(mux_id, r);
+	netdev_upper_dev_unlink(dev, real_dev);
+	rmnet_unregister_real_device(real_dev, r);
+
+	unregister_netdevice_queue(dev, head);
+}
+
+static int rmnet_dev_walk_unreg(struct net_device *rmnet_dev, void *data)
+{
+	struct rmnet_walk_data *d = data;
+	u8 mux_id;
+
+	mux_id = rmnet_vnd_get_mux(rmnet_dev);
+
+	rmnet_vnd_dellink(mux_id, d->real_dev_info);
+	netdev_upper_dev_unlink(rmnet_dev, d->real_dev);
+	unregister_netdevice_queue(rmnet_dev, d->head);
+
+	return 0;
+}
+
+static void rmnet_force_unassociate_device(struct net_device *dev)
+{
+	struct net_device *real_dev = dev;
+	struct rmnet_real_dev_info *r;
+	struct rmnet_walk_data d;
+	LIST_HEAD(list);
+
+	if (!rmnet_is_real_dev_registered(real_dev))
+		return;
+
+	ASSERT_RTNL();
+
+	d.real_dev = real_dev;
+	d.head = &list;
+
+	r = rmnet_get_real_dev_info_rtnl(dev);
+	d.real_dev_info = r;
+
+	rcu_read_lock();
+	netdev_walk_all_lower_dev_rcu(real_dev, rmnet_dev_walk_unreg, &d);
+	rcu_read_unlock();
+	unregister_netdevice_many(&list);
+
+	rmnet_unregister_real_device(real_dev, r);
+}
+
+static int rmnet_config_notify_cb(struct notifier_block *nb,
+				  unsigned long event, void *data)
+{
+	struct net_device *dev = netdev_notifier_info_to_dev(data);
+
+	if (!dev)
+		return NOTIFY_DONE;
+
+	switch (event) {
+	case NETDEV_UNREGISTER:
+		netdev_dbg(dev, "Kernel unregister\n");
+		rmnet_force_unassociate_device(dev);
+		break;
+
+	default:
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block rmnet_dev_notifier __read_mostly = {
+	.notifier_call = rmnet_config_notify_cb,
+};
+
+static int rmnet_rtnl_validate(struct nlattr *tb[], struct nlattr *data[],
+			       struct netlink_ext_ack *extack)
+{
+	u16 mux_id;
+
+	if (!data || !data[IFLA_VLAN_ID])
+		return -EINVAL;
+
+	mux_id = nla_get_u16(data[IFLA_VLAN_ID]);
+	if (mux_id > (RMNET_MAX_LOGICAL_EP - 1))
+		return -ERANGE;
+
+	return 0;
+}
+
+static size_t rmnet_get_size(const struct net_device *dev)
+{
+	return nla_total_size(2); /* IFLA_VLAN_ID */
+}
+
+struct rtnl_link_ops rmnet_link_ops __read_mostly = {
+	.kind		= "rmnet",
+	.maxtype	= __IFLA_VLAN_MAX,
+	.priv_size	= sizeof(struct rmnet_priv),
+	.setup		= rmnet_vnd_setup,
+	.validate	= rmnet_rtnl_validate,
+	.newlink	= rmnet_newlink,
+	.dellink	= rmnet_dellink,
+	.get_size	= rmnet_get_size,
+};
+
+struct rmnet_real_dev_info*
+rmnet_get_real_dev_info(struct net_device *real_dev)
+{
+	return __rmnet_get_real_dev_info(real_dev);
+}
+
+/* Startup/Shutdown */
+
+static int __init rmnet_init(void)
+{
+	int rc;
+
+	rc = register_netdevice_notifier(&rmnet_dev_notifier);
+	if (rc != 0)
+		return rc;
+
+	rc = rtnl_link_register(&rmnet_link_ops);
+	if (rc != 0) {
+		unregister_netdevice_notifier(&rmnet_dev_notifier);
+		return rc;
+	}
+	return rc;
+}
+
+static void __exit rmnet_exit(void)
+{
+	unregister_netdevice_notifier(&rmnet_dev_notifier);
+	rtnl_link_unregister(&rmnet_link_ops);
+}
+
+module_init(rmnet_init)
+module_exit(rmnet_exit)
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
new file mode 100644
index 0000000..985d372
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
@@ -0,0 +1,56 @@
+/* Copyright (c) 2013-2014, 2016-2017 The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data configuration engine
+ *
+ */
+
+#include <linux/skbuff.h>
+
+#ifndef _RMNET_CONFIG_H_
+#define _RMNET_CONFIG_H_
+
+#define RMNET_MAX_LOGICAL_EP 255
+#define RMNET_MAX_VND        32
+
+/* Information about the next device to deliver the packet to.
+ * Exact usage of this parameter depends on the rmnet_mode.
+ */
+struct rmnet_endpoint {
+	u8 rmnet_mode;
+	u8 mux_id;
+	struct net_device *egress_dev;
+};
+
+/* One instance of this structure is instantiated for each real_dev associated
+ * with rmnet.
+ */
+struct rmnet_real_dev_info {
+	struct net_device *dev;
+	struct rmnet_endpoint local_ep;
+	struct rmnet_endpoint muxed_ep[RMNET_MAX_LOGICAL_EP];
+	u32 ingress_data_format;
+	u32 egress_data_format;
+	struct net_device *rmnet_devices[RMNET_MAX_VND];
+	u8 nr_rmnet_devs;
+};
+
+extern struct rtnl_link_ops rmnet_link_ops;
+
+struct rmnet_priv {
+	struct rmnet_endpoint local_ep;
+	u8 mux_id;
+};
+
+struct rmnet_real_dev_info*
+rmnet_get_real_dev_info(struct net_device *real_dev);
+
+#endif /* _RMNET_CONFIG_H_ */
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
new file mode 100644
index 0000000..7dab3bb
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
@@ -0,0 +1,271 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data ingress/egress handler
+ *
+ */
+
+#include <linux/netdevice.h>
+#include <linux/netdev_features.h>
+#include "rmnet_private.h"
+#include "rmnet_config.h"
+#include "rmnet_vnd.h"
+#include "rmnet_map.h"
+#include "rmnet_handlers.h"
+
+#define RMNET_IP_VERSION_4 0x40
+#define RMNET_IP_VERSION_6 0x60
+
+/* Helper Functions */
+
+static void rmnet_set_skb_proto(struct sk_buff *skb)
+{
+	switch (skb->data[0] & 0xF0) {
+	case RMNET_IP_VERSION_4:
+		skb->protocol = htons(ETH_P_IP);
+		break;
+	case RMNET_IP_VERSION_6:
+		skb->protocol = htons(ETH_P_IPV6);
+		break;
+	default:
+		skb->protocol = htons(ETH_P_MAP);
+		break;
+	}
+}
+
+/* Generic handler */
+
+static rx_handler_result_t
+rmnet_bridge_handler(struct sk_buff *skb, struct rmnet_endpoint *ep)
+{
+	if (!ep->egress_dev)
+		kfree_skb(skb);
+	else
+		rmnet_egress_handler(skb, ep);
+
+	return RX_HANDLER_CONSUMED;
+}
+
+static rx_handler_result_t
+rmnet_deliver_skb(struct sk_buff *skb, struct rmnet_endpoint *ep)
+{
+	switch (ep->rmnet_mode) {
+	case RMNET_EPMODE_NONE:
+		return RX_HANDLER_PASS;
+
+	case RMNET_EPMODE_BRIDGE:
+		return rmnet_bridge_handler(skb, ep);
+
+	case RMNET_EPMODE_VND:
+		skb_reset_transport_header(skb);
+		skb_reset_network_header(skb);
+		rmnet_vnd_rx_fixup(skb, skb->dev);
+
+		skb->pkt_type = PACKET_HOST;
+		skb_set_mac_header(skb, 0);
+		netif_receive_skb(skb);
+		return RX_HANDLER_CONSUMED;
+
+	default:
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+}
+
+static rx_handler_result_t
+rmnet_ingress_deliver_packet(struct sk_buff *skb,
+			     struct rmnet_real_dev_info *r)
+{
+	if (!r) {
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+
+	skb->dev = r->local_ep.egress_dev;
+
+	return rmnet_deliver_skb(skb, &r->local_ep);
+}
+
+/* MAP handler */
+
+static rx_handler_result_t
+__rmnet_map_ingress_handler(struct sk_buff *skb,
+			    struct rmnet_real_dev_info *r)
+{
+	struct rmnet_endpoint *ep;
+	u8 mux_id;
+	u16 len;
+
+	if (RMNET_MAP_GET_CD_BIT(skb)) {
+		if (r->ingress_data_format
+		    & RMNET_INGRESS_FORMAT_MAP_COMMANDS)
+			return rmnet_map_command(skb, r);
+
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+
+	mux_id = RMNET_MAP_GET_MUX_ID(skb);
+	len = RMNET_MAP_GET_LENGTH(skb) - RMNET_MAP_GET_PAD(skb);
+
+	if (mux_id >= RMNET_MAX_LOGICAL_EP) {
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+
+	ep = &r->muxed_ep[mux_id];
+
+	if (r->ingress_data_format & RMNET_INGRESS_FORMAT_DEMUXING)
+		skb->dev = ep->egress_dev;
+
+	/* Subtract MAP header */
+	skb_pull(skb, sizeof(struct rmnet_map_header));
+	skb_trim(skb, len);
+	rmnet_set_skb_proto(skb);
+	return rmnet_deliver_skb(skb, ep);
+}
+
+static rx_handler_result_t
+rmnet_map_ingress_handler(struct sk_buff *skb,
+			  struct rmnet_real_dev_info *r)
+{
+	struct sk_buff *skbn;
+	int rc;
+
+	if (r->ingress_data_format & RMNET_INGRESS_FORMAT_DEAGGREGATION) {
+		while ((skbn = rmnet_map_deaggregate(skb, r)) != NULL)
+			__rmnet_map_ingress_handler(skbn, r);
+
+		consume_skb(skb);
+		rc = RX_HANDLER_CONSUMED;
+	} else {
+		rc = __rmnet_map_ingress_handler(skb, r);
+	}
+
+	return rc;
+}
+
+static int rmnet_map_egress_handler(struct sk_buff *skb,
+				    struct rmnet_real_dev_info *r,
+				    struct rmnet_endpoint *ep,
+				    struct net_device *orig_dev)
+{
+	int required_headroom, additional_header_len;
+	struct rmnet_map_header *map_header;
+
+	additional_header_len = 0;
+	required_headroom = sizeof(struct rmnet_map_header);
+
+	if (skb_headroom(skb) < required_headroom) {
+		if (pskb_expand_head(skb, required_headroom, 0, GFP_KERNEL))
+			return RMNET_MAP_CONSUMED;
+	}
+
+	map_header = rmnet_map_add_map_header(skb, additional_header_len, 0);
+	if (!map_header)
+		return RMNET_MAP_CONSUMED;
+
+	if (r->egress_data_format & RMNET_EGRESS_FORMAT_MUXING) {
+		if (ep->mux_id == 0xff)
+			map_header->mux_id = 0;
+		else
+			map_header->mux_id = ep->mux_id;
+	}
+
+	skb->protocol = htons(ETH_P_MAP);
+
+	return RMNET_MAP_SUCCESS;
+}
+
+/* Ingress / Egress Entry Points */
+
+/* Processes packet as per ingress data format for receiving device. Logical
+ * endpoint is determined from packet inspection. Packet is then sent to the
+ * egress device listed in the logical endpoint configuration.
+ */
+rx_handler_result_t rmnet_rx_handler(struct sk_buff **pskb)
+{
+	struct rmnet_real_dev_info *r;
+	struct sk_buff *skb = *pskb;
+	struct net_device *dev;
+	int rc;
+
+	if (!skb)
+		return RX_HANDLER_CONSUMED;
+
+	dev = skb->dev;
+	r = rmnet_get_real_dev_info(dev);
+
+	if (r->ingress_data_format & RMNET_INGRESS_FORMAT_MAP) {
+		rc = rmnet_map_ingress_handler(skb, r);
+	} else {
+		switch (ntohs(skb->protocol)) {
+		case ETH_P_MAP:
+			if (r->local_ep.rmnet_mode ==
+				RMNET_EPMODE_BRIDGE) {
+				rc = rmnet_ingress_deliver_packet(skb, r);
+			} else {
+				kfree_skb(skb);
+				rc = RX_HANDLER_CONSUMED;
+			}
+			break;
+
+		case ETH_P_IP:
+		case ETH_P_IPV6:
+			rc = rmnet_ingress_deliver_packet(skb, r);
+			break;
+
+		default:
+			rc = RX_HANDLER_PASS;
+		}
+	}
+
+	return rc;
+}
+
+/* Modifies packet as per logical endpoint configuration and egress data format
+ * for egress device configured in logical endpoint. Packet is then transmitted
+ * on the egress device.
+ */
+void rmnet_egress_handler(struct sk_buff *skb,
+			  struct rmnet_endpoint *ep)
+{
+	struct rmnet_real_dev_info *r;
+	struct net_device *orig_dev;
+
+	orig_dev = skb->dev;
+	skb->dev = ep->egress_dev;
+
+	r = rmnet_get_real_dev_info(skb->dev);
+	if (!r) {
+		kfree_skb(skb);
+		return;
+	}
+
+	if (r->egress_data_format & RMNET_EGRESS_FORMAT_MAP) {
+		switch (rmnet_map_egress_handler(skb, r, ep, orig_dev)) {
+		case RMNET_MAP_CONSUMED:
+			return;
+
+		case RMNET_MAP_SUCCESS:
+			break;
+
+		default:
+			kfree_skb(skb);
+			return;
+		}
+	}
+
+	if (ep->rmnet_mode == RMNET_EPMODE_VND)
+		rmnet_vnd_tx_fixup(skb, orig_dev);
+
+	dev_queue_xmit(skb);
+}
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
new file mode 100644
index 0000000..f2638cf
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
@@ -0,0 +1,26 @@
+/* Copyright (c) 2013, 2016-2017 The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data ingress/egress handler
+ *
+ */
+
+#ifndef _RMNET_HANDLERS_H_
+#define _RMNET_HANDLERS_H_
+
+#include "rmnet_config.h"
+
+void rmnet_egress_handler(struct sk_buff *skb,
+			  struct rmnet_endpoint *ep);
+
+rx_handler_result_t rmnet_rx_handler(struct sk_buff **pskb);
+
+#endif /* _RMNET_HANDLERS_H_ */
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
new file mode 100644
index 0000000..2aabad2
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
@@ -0,0 +1,88 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _RMNET_MAP_H_
+#define _RMNET_MAP_H_
+
+struct rmnet_map_control_command {
+	u8  command_name;
+	u8  cmd_type:2;
+	u8  reserved:6;
+	u16 reserved2;
+	u32 transaction_id;
+	union {
+		struct {
+			u16 ip_family:2;
+			u16 reserved:14;
+			u16 flow_control_seq_num;
+			u32 qos_id;
+		} flow_control;
+		u8 data[0];
+	};
+}  __aligned(1);
+
+enum rmnet_map_results {
+	RMNET_MAP_SUCCESS,
+	RMNET_MAP_CONSUMED,
+	RMNET_MAP_GENERAL_FAILURE,
+	RMNET_MAP_NOT_ENABLED,
+	RMNET_MAP_FAILED_AGGREGATION,
+	RMNET_MAP_FAILED_MUX
+};
+
+enum rmnet_map_commands {
+	RMNET_MAP_COMMAND_NONE,
+	RMNET_MAP_COMMAND_FLOW_DISABLE,
+	RMNET_MAP_COMMAND_FLOW_ENABLE,
+	/* These should always be the last 2 elements */
+	RMNET_MAP_COMMAND_UNKNOWN,
+	RMNET_MAP_COMMAND_ENUM_LENGTH
+};
+
+struct rmnet_map_header {
+	u8  pad_len:6;
+	u8  reserved_bit:1;
+	u8  cd_bit:1;
+	u8  mux_id;
+	u16 pkt_len;
+}  __aligned(1);
+
+#define RMNET_MAP_GET_MUX_ID(Y) (((struct rmnet_map_header *) \
+				 (Y)->data)->mux_id)
+#define RMNET_MAP_GET_CD_BIT(Y) (((struct rmnet_map_header *) \
+				(Y)->data)->cd_bit)
+#define RMNET_MAP_GET_PAD(Y) (((struct rmnet_map_header *) \
+				(Y)->data)->pad_len)
+#define RMNET_MAP_GET_CMD_START(Y) ((struct rmnet_map_control_command *) \
+				    ((Y)->data + \
+				      sizeof(struct rmnet_map_header)))
+#define RMNET_MAP_GET_LENGTH(Y) (ntohs(((struct rmnet_map_header *) \
+					(Y)->data)->pkt_len))
+
+#define RMNET_MAP_COMMAND_REQUEST     0
+#define RMNET_MAP_COMMAND_ACK         1
+#define RMNET_MAP_COMMAND_UNSUPPORTED 2
+#define RMNET_MAP_COMMAND_INVALID     3
+
+#define RMNET_MAP_NO_PAD_BYTES        0
+#define RMNET_MAP_ADD_PAD_BYTES       1
+
+u8 rmnet_map_demultiplex(struct sk_buff *skb);
+struct sk_buff *rmnet_map_deaggregate(struct sk_buff *skb,
+				      struct rmnet_real_dev_info *rdinfo);
+
+struct rmnet_map_header *rmnet_map_add_map_header(struct sk_buff *skb,
+						  int hdrlen, int pad);
+rx_handler_result_t rmnet_map_command(struct sk_buff *skb,
+				      struct rmnet_real_dev_info *rdinfo);
+
+#endif /* _RMNET_MAP_H_ */
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
new file mode 100644
index 0000000..ccded40
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
@@ -0,0 +1,107 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/netdevice.h>
+#include "rmnet_config.h"
+#include "rmnet_map.h"
+#include "rmnet_private.h"
+#include "rmnet_vnd.h"
+
+static u8 rmnet_map_do_flow_control(struct sk_buff *skb,
+				    struct rmnet_real_dev_info *rdinfo,
+				    int enable)
+{
+	struct rmnet_map_control_command *cmd;
+	struct rmnet_endpoint *ep;
+	struct net_device *vnd;
+	u16 ip_family;
+	u16 fc_seq;
+	u32 qos_id;
+	u8 mux_id;
+	int r;
+
+	mux_id = RMNET_MAP_GET_MUX_ID(skb);
+	cmd = RMNET_MAP_GET_CMD_START(skb);
+
+	if (mux_id >= RMNET_MAX_LOGICAL_EP) {
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+
+	ep = &rdinfo->muxed_ep[mux_id];
+	vnd = ep->egress_dev;
+
+	ip_family = cmd->flow_control.ip_family;
+	fc_seq = ntohs(cmd->flow_control.flow_control_seq_num);
+	qos_id = ntohl(cmd->flow_control.qos_id);
+
+	/* Ignore the ip family and pass the sequence number for both v4 and v6
+	 * sequence. User space does not support creating dedicated flows for
+	 * the 2 protocols
+	 */
+	r = rmnet_vnd_do_flow_control(vnd, enable);
+	if (r) {
+		kfree_skb(skb);
+		return RMNET_MAP_COMMAND_UNSUPPORTED;
+	} else {
+		return RMNET_MAP_COMMAND_ACK;
+	}
+}
+
+static void rmnet_map_send_ack(struct sk_buff *skb,
+			       unsigned char type,
+			       struct rmnet_real_dev_info *rdinfo)
+{
+	struct rmnet_map_control_command *cmd;
+	int xmit_status;
+
+	skb->protocol = htons(ETH_P_MAP);
+
+	cmd = RMNET_MAP_GET_CMD_START(skb);
+	cmd->cmd_type = type & 0x03;
+
+	netif_tx_lock(skb->dev);
+	xmit_status = skb->dev->netdev_ops->ndo_start_xmit(skb, skb->dev);
+	netif_tx_unlock(skb->dev);
+}
+
+/* Process MAP command frame and send N/ACK message as appropriate. Message cmd
+ * name is decoded here and appropriate handler is called.
+ */
+rx_handler_result_t rmnet_map_command(struct sk_buff *skb,
+				      struct rmnet_real_dev_info *rdinfo)
+{
+	struct rmnet_map_control_command *cmd;
+	unsigned char command_name;
+	unsigned char rc = 0;
+
+	cmd = RMNET_MAP_GET_CMD_START(skb);
+	command_name = cmd->command_name;
+
+	switch (command_name) {
+	case RMNET_MAP_COMMAND_FLOW_ENABLE:
+		rc = rmnet_map_do_flow_control(skb, rdinfo, 1);
+		break;
+
+	case RMNET_MAP_COMMAND_FLOW_DISABLE:
+		rc = rmnet_map_do_flow_control(skb, rdinfo, 0);
+		break;
+
+	default:
+		rc = RMNET_MAP_COMMAND_UNSUPPORTED;
+		kfree_skb(skb);
+		break;
+	}
+	if (rc == RMNET_MAP_COMMAND_ACK)
+		rmnet_map_send_ack(skb, rc, rdinfo);
+	return RX_HANDLER_CONSUMED;
+}
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
new file mode 100644
index 0000000..a29c476
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
@@ -0,0 +1,105 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data MAP protocol
+ *
+ */
+
+#include <linux/netdevice.h>
+#include "rmnet_config.h"
+#include "rmnet_map.h"
+#include "rmnet_private.h"
+
+#define RMNET_MAP_DEAGGR_SPACING  64
+#define RMNET_MAP_DEAGGR_HEADROOM (RMNET_MAP_DEAGGR_SPACING / 2)
+
+/* Adds MAP header to front of skb->data
+ * Padding is calculated and set appropriately in MAP header. Mux ID is
+ * initialized to 0.
+ */
+struct rmnet_map_header *rmnet_map_add_map_header(struct sk_buff *skb,
+						  int hdrlen, int pad)
+{
+	struct rmnet_map_header *map_header;
+	u32 padding, map_datalen;
+	u8 *padbytes;
+
+	if (skb_headroom(skb) < sizeof(struct rmnet_map_header))
+		return NULL;
+
+	map_datalen = skb->len - hdrlen;
+	map_header = (struct rmnet_map_header *)
+			skb_push(skb, sizeof(struct rmnet_map_header));
+	memset(map_header, 0, sizeof(struct rmnet_map_header));
+
+	if (pad == RMNET_MAP_NO_PAD_BYTES) {
+		map_header->pkt_len = htons(map_datalen);
+		return map_header;
+	}
+
+	padding = ALIGN(map_datalen, 4) - map_datalen;
+
+	if (padding == 0)
+		goto done;
+
+	if (skb_tailroom(skb) < padding)
+		return NULL;
+
+	padbytes = (u8 *)skb_put(skb, padding);
+	memset(padbytes, 0, padding);
+
+done:
+	map_header->pkt_len = htons(map_datalen + padding);
+	map_header->pad_len = padding & 0x3F;
+
+	return map_header;
+}
+
+/* Deaggregates a single packet
+ * A whole new buffer is allocated for each portion of an aggregated frame.
+ * Caller should keep calling deaggregate() on the source skb until 0 is
+ * returned, indicating that there are no more packets to deaggregate. Caller
+ * is responsible for freeing the original skb.
+ */
+struct sk_buff *rmnet_map_deaggregate(struct sk_buff *skb,
+				      struct rmnet_real_dev_info *rdinfo)
+{
+	struct rmnet_map_header *maph;
+	struct sk_buff *skbn;
+	u32 packet_len;
+
+	if (skb->len == 0)
+		return NULL;
+
+	maph = (struct rmnet_map_header *)skb->data;
+	packet_len = ntohs(maph->pkt_len) + sizeof(struct rmnet_map_header);
+
+	if (((int)skb->len - (int)packet_len) < 0)
+		return NULL;
+
+	skbn = alloc_skb(packet_len + RMNET_MAP_DEAGGR_SPACING, GFP_ATOMIC);
+	if (!skbn)
+		return NULL;
+
+	skbn->dev = skb->dev;
+	skb_reserve(skbn, RMNET_MAP_DEAGGR_HEADROOM);
+	skb_put(skbn, packet_len);
+	memcpy(skbn->data, skb->data, packet_len);
+	skb_pull(skb, packet_len);
+
+	/* Some hardware can send us empty frames. Catch them */
+	if (ntohs(maph->pkt_len) == 0) {
+		kfree_skb(skb);
+		return NULL;
+	}
+
+	return skbn;
+}
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
new file mode 100644
index 0000000..ed820b5
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
@@ -0,0 +1,45 @@
+/* Copyright (c) 2013-2014, 2016-2017 The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _RMNET_PRIVATE_H_
+#define _RMNET_PRIVATE_H_
+
+#define RMNET_MAX_VND              32
+#define RMNET_MAX_PACKET_SIZE      16384
+#define RMNET_DFLT_PACKET_SIZE     1500
+#define RMNET_NEEDED_HEADROOM      16
+#define RMNET_TX_QUEUE_LEN         1000
+
+/* Constants */
+#define RMNET_EGRESS_FORMAT__RESERVED__         BIT(0)
+#define RMNET_EGRESS_FORMAT_MAP                 BIT(1)
+#define RMNET_EGRESS_FORMAT_AGGREGATION         BIT(2)
+#define RMNET_EGRESS_FORMAT_MUXING              BIT(3)
+#define RMNET_EGRESS_FORMAT_MAP_CKSUMV3         BIT(4)
+#define RMNET_EGRESS_FORMAT_MAP_CKSUMV4         BIT(5)
+
+#define RMNET_INGRESS_FIX_ETHERNET              BIT(0)
+#define RMNET_INGRESS_FORMAT_MAP                BIT(1)
+#define RMNET_INGRESS_FORMAT_DEAGGREGATION      BIT(2)
+#define RMNET_INGRESS_FORMAT_DEMUXING           BIT(3)
+#define RMNET_INGRESS_FORMAT_MAP_COMMANDS       BIT(4)
+#define RMNET_INGRESS_FORMAT_MAP_CKSUMV3        BIT(5)
+#define RMNET_INGRESS_FORMAT_MAP_CKSUMV4        BIT(6)
+
+/* Pass the frame up the stack with no modifications to skb->dev */
+#define RMNET_EPMODE_NONE (0)
+/* Replace skb->dev to a virtual rmnet device and pass up the stack */
+#define RMNET_EPMODE_VND (1)
+/* Pass the frame directly to another device with dev_queue_xmit() */
+#define RMNET_EPMODE_BRIDGE (2)
+
+#endif /* _RMNET_PRIVATE_H_ */
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
new file mode 100644
index 0000000..c8b573d
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
@@ -0,0 +1,170 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ *
+ * RMNET Data virtual network driver
+ *
+ */
+
+#include <linux/etherdevice.h>
+#include <linux/if_arp.h>
+#include <net/pkt_sched.h>
+#include "rmnet_config.h"
+#include "rmnet_handlers.h"
+#include "rmnet_private.h"
+#include "rmnet_map.h"
+#include "rmnet_vnd.h"
+
+/* RX/TX Fixup */
+
+void rmnet_vnd_rx_fixup(struct sk_buff *skb, struct net_device *dev)
+{
+	dev->stats.rx_packets++;
+	dev->stats.rx_bytes += skb->len;
+}
+
+void rmnet_vnd_tx_fixup(struct sk_buff *skb, struct net_device *dev)
+{
+	dev->stats.tx_packets++;
+	dev->stats.tx_bytes += skb->len;
+}
+
+/* Network Device Operations */
+
+static netdev_tx_t rmnet_vnd_start_xmit(struct sk_buff *skb,
+					struct net_device *dev)
+{
+	struct rmnet_priv *priv;
+
+	priv = netdev_priv(dev);
+	if (priv->local_ep.egress_dev) {
+		rmnet_egress_handler(skb, &priv->local_ep);
+	} else {
+		dev->stats.tx_dropped++;
+		kfree_skb(skb);
+	}
+	return NETDEV_TX_OK;
+}
+
+static int rmnet_vnd_change_mtu(struct net_device *rmnet_dev, int new_mtu)
+{
+	if (new_mtu < 0 || new_mtu > RMNET_MAX_PACKET_SIZE)
+		return -EINVAL;
+
+	rmnet_dev->mtu = new_mtu;
+	return 0;
+}
+
+static const struct net_device_ops rmnet_vnd_ops = {
+	.ndo_start_xmit = rmnet_vnd_start_xmit,
+	.ndo_change_mtu = rmnet_vnd_change_mtu,
+};
+
+/* Called by kernel whenever a new rmnet<n> device is created. Sets MTU,
+ * flags, ARP type, needed headroom, etc...
+ */
+void rmnet_vnd_setup(struct net_device *rmnet_dev)
+{
+	struct rmnet_priv *priv;
+
+	priv = netdev_priv(rmnet_dev);
+	netdev_dbg(rmnet_dev, "Setting up device %s\n", rmnet_dev->name);
+
+	rmnet_dev->netdev_ops = &rmnet_vnd_ops;
+	rmnet_dev->mtu = RMNET_DFLT_PACKET_SIZE;
+	rmnet_dev->needed_headroom = RMNET_NEEDED_HEADROOM;
+	random_ether_addr(rmnet_dev->dev_addr);
+	rmnet_dev->tx_queue_len = RMNET_TX_QUEUE_LEN;
+
+	/* Raw IP mode */
+	rmnet_dev->header_ops = NULL;  /* No header */
+	rmnet_dev->type = ARPHRD_RAWIP;
+	rmnet_dev->hard_header_len = 0;
+	rmnet_dev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST);
+
+	rmnet_dev->needs_free_netdev = true;
+}
+
+/* Exposed API */
+
+int rmnet_vnd_newlink(u8 id, struct net_device *rmnet_dev,
+		      struct rmnet_real_dev_info *r)
+{
+	int rc;
+
+	if (r->rmnet_devices[id])
+		return -EINVAL;
+
+	rc = register_netdevice(rmnet_dev);
+	if (!rc) {
+		r->rmnet_devices[id] = rmnet_dev;
+		r->nr_rmnet_devs++;
+		rmnet_dev->rtnl_link_ops = &rmnet_link_ops;
+	}
+
+	return rc;
+}
+
+int rmnet_vnd_dellink(u8 id, struct rmnet_real_dev_info *r)
+{
+	if (id >= RMNET_MAX_VND || !r->rmnet_devices[id])
+		return -EINVAL;
+
+	r->rmnet_devices[id] = NULL;
+	r->nr_rmnet_devs--;
+	return 0;
+}
+
+u8 rmnet_vnd_get_mux(struct net_device *rmnet_dev)
+{
+	struct rmnet_priv *priv;
+
+	priv = netdev_priv(rmnet_dev);
+	return priv->mux_id;
+}
+
+void rmnet_vnd_set_mux(struct net_device *rmnet_dev, u8 mux_id)
+{
+	struct rmnet_priv *priv;
+
+	priv = netdev_priv(rmnet_dev);
+	priv->mux_id = mux_id;
+}
+
+/* Gets the logical endpoint configuration for a RmNet virtual network device
+ * node. Caller should confirm that devices is a RmNet VND before calling.
+ */
+struct rmnet_endpoint *rmnet_vnd_get_endpoint(struct net_device *rmnet_dev)
+{
+	struct rmnet_priv *priv;
+
+	if (!rmnet_dev)
+		return NULL;
+
+	priv = netdev_priv(rmnet_dev);
+
+	return &priv->local_ep;
+}
+
+int rmnet_vnd_do_flow_control(struct net_device *rmnet_dev, int enable)
+{
+	netdev_dbg(rmnet_dev, "Setting VND TX queue state to %d\n", enable);
+	/* Although we expect similar number of enable/disable
+	 * commands, optimize for the disable. That is more
+	 * latency sensitive than enable
+	 */
+	if (unlikely(enable))
+		netif_wake_queue(rmnet_dev);
+	else
+		netif_stop_queue(rmnet_dev);
+
+	return 0;
+}
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h
new file mode 100644
index 0000000..b102b42
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h
@@ -0,0 +1,29 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data Virtual Network Device APIs
+ *
+ */
+
+#ifndef _RMNET_VND_H_
+#define _RMNET_VND_H_
+
+int rmnet_vnd_do_flow_control(struct net_device *dev, int enable);
+struct rmnet_endpoint *rmnet_vnd_get_endpoint(struct net_device *dev);
+int rmnet_vnd_newlink(u8 id, struct net_device *rmnet_dev,
+		      struct rmnet_real_dev_info *r);
+int rmnet_vnd_dellink(u8 id, struct rmnet_real_dev_info *r);
+void rmnet_vnd_rx_fixup(struct sk_buff *skb, struct net_device *dev);
+void rmnet_vnd_tx_fixup(struct sk_buff *skb, struct net_device *dev);
+u8 rmnet_vnd_get_mux(struct net_device *rmnet_dev);
+void rmnet_vnd_set_mux(struct net_device *rmnet_dev, u8 mux_id);
+void rmnet_vnd_setup(struct net_device *dev);
+#endif /* _RMNET_VND_H_ */
-- 
1.9.1

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox