Netdev List
* Re: [PATCH net-next 03/11] net: dsa: debugfs: add tree
From: Florian Fainelli @ 2017-08-21 22:06 UTC (permalink / raw)
  To: Vivien Didelot, netdev
  Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn,
	Egil Hjelmeland, John Crispin, Woojung Huh, Sean Wang,
	Volodymyr Bendiuga, Nikita Yushchenko, Maxime Hadjinlian,
	Chris Healy, Maxim Uvarov, Stefan Eichenberger, Jason Cobham,
	Juergen Borleis, Tobias Waldekranz
In-Reply-To: <20170814222242.10643-4-vivien.didelot@savoirfairelinux.com>

On 08/14/2017 03:22 PM, Vivien Didelot wrote:
> This commit adds the boilerplate to create a DSA-related debug
> filesystem entry as well as a "tree" file, containing the tree index.
> 
>     # cat switch1/tree
>     0
> 
> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

* Re: [PATCH] mt7601u: check memory allocation failure
From: Christophe JAILLET @ 2017-08-21 22:08 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: kvalo, matthias.bgg, linux-wireless, netdev, linux-arm-kernel,
	linux-mediatek, linux-kernel, kernel-janitors
In-Reply-To: <20170821144136.477bf655@cakuba.netronome.com>

On 21/08/2017 at 23:41, Jakub Kicinski wrote:
> On Mon, 21 Aug 2017 14:34:30 -0700, Jakub Kicinski wrote:
>> On Mon, 21 Aug 2017 22:59:56 +0200, Christophe JAILLET wrote:
>>> Check memory allocation failure and return -ENOMEM in such a case, as
>>> already done a few lines below
>>>
>>> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
>> Acked-by: Jakub Kicinski <kubakici@wp.pl>
> Wait, I take that back.  This code is a bit weird.  We would return an
> error, then mt7601u_dma_init() will call mt7601u_free_tx_queue() which
> doesn't check for tx_q == NULL condition.
>
> Looks like mt7601u_free_tx() has to check for dev->tx_q == NULL and
> return early if that's the case.  Or mt7601u_alloc_tx() should really
> clean things up on its own on failure.  Ugh.
>
You are right. Thanks for the review.

I've sent a v2 which updates 'mt7601u_free_tx()'.
Doing so sounds more in line with the spirit of this code.

CJ

* RE: [PATCH net] net: dsa: skb_put_padto() already frees nskb
From: Woojung.Huh @ 2017-08-21 22:15 UTC (permalink / raw)
  To: f.fainelli, netdev; +Cc: davem, andrew, vivien.didelot
In-Reply-To: <20170821194143.27885-1-f.fainelli@gmail.com>

Florian,

> -----Original Message-----
> From: Florian Fainelli [mailto:f.fainelli@gmail.com]
> Sent: Monday, August 21, 2017 3:42 PM
> To: netdev@vger.kernel.org
> Cc: davem@davemloft.net; andrew@lunn.ch;
> vivien.didelot@savoirfairelinux.com; Woojung Huh - C21699; Florian Fainelli
> Subject: [PATCH net] net: dsa: skb_put_padto() already frees nskb
> 
> skb_put_padto() already frees the passed sk_buff reference upon error,
> so calling kfree_skb() on it again is not necessary.
> 
> Detected by CoverityScan, CID#1416687 ("USE_AFTER_FREE")
> 
> Fixes: e71cb9e00922 ("net: dsa: ksz: fix skb freeing")
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
>  net/dsa/tag_ksz.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c
> index de66ca8e6201..107172c82107 100644
> --- a/net/dsa/tag_ksz.c
> +++ b/net/dsa/tag_ksz.c
> @@ -60,10 +60,8 @@ static struct sk_buff *ksz_xmit(struct sk_buff *skb,
> struct net_device *dev)
>  					 skb_transport_header(skb) - skb-
> >head);
>  		skb_copy_and_csum_dev(skb, skb_put(nskb, skb->len));
> 
> -		if (skb_put_padto(nskb, nskb->len + padlen)) {
> -			kfree_skb(nskb);
> +		if (skb_put_padto(nskb, nskb->len + padlen))
>  			return NULL;
> -		}
> 
>  		kfree_skb(skb);
>  	}
> --

Because skb_put_padto() frees skb when it fails, the lines below in e71cb9e00922
("net: dsa: ksz: fix skb freeing") will be an issue too.

	if (skb_tailroom(skb) >= padlen + KSZ_INGRESS_TAG_LEN) {
+		if (skb_put_padto(skb, skb->len + padlen))
+			return NULL;
+

When it fails, skb will be freed twice: once in skb_put_padto() and once
by the caller of dsa_slave_xmit().

Woojung

* Re: [PATCH net] net: dsa: skb_put_padto() already frees nskb
From: Florian Fainelli @ 2017-08-21 22:24 UTC (permalink / raw)
  To: Woojung.Huh, netdev; +Cc: davem, andrew, vivien.didelot
In-Reply-To: <9235D6609DB808459E95D78E17F2E43D40B048CD@CHN-SV-EXMX02.mchp-main.com>

On 08/21/2017 03:15 PM, Woojung.Huh@microchip.com wrote:
> Florian,
> 
>> -----Original Message-----
>> From: Florian Fainelli [mailto:f.fainelli@gmail.com]
>> Sent: Monday, August 21, 2017 3:42 PM
>> To: netdev@vger.kernel.org
>> Cc: davem@davemloft.net; andrew@lunn.ch;
>> vivien.didelot@savoirfairelinux.com; Woojung Huh - C21699; Florian Fainelli
>> Subject: [PATCH net] net: dsa: skb_put_padto() already frees nskb
>>
>> skb_put_padto() already frees the passed sk_buff reference upon error,
>> so calling kfree_skb() on it again is not necessary.
>>
>> Detected by CoverityScan, CID#1416687 ("USE_AFTER_FREE")
>>
>> Fixes: e71cb9e00922 ("net: dsa: ksz: fix skb freeing")
>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>> ---
>>  net/dsa/tag_ksz.c | 4 +---
>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c
>> index de66ca8e6201..107172c82107 100644
>> --- a/net/dsa/tag_ksz.c
>> +++ b/net/dsa/tag_ksz.c
>> @@ -60,10 +60,8 @@ static struct sk_buff *ksz_xmit(struct sk_buff *skb,
>> struct net_device *dev)
>>  					 skb_transport_header(skb) - skb-
>>> head);
>>  		skb_copy_and_csum_dev(skb, skb_put(nskb, skb->len));
>>
>> -		if (skb_put_padto(nskb, nskb->len + padlen)) {
>> -			kfree_skb(nskb);
>> +		if (skb_put_padto(nskb, nskb->len + padlen))
>>  			return NULL;
>> -		}
>>
>>  		kfree_skb(skb);
>>  	}
>> --
> 
> Because skb_put_padto() frees skb when it fails, the lines below in e71cb9e00922
> ("net: dsa: ksz: fix skb freeing") will be an issue too.
> 
> 	if (skb_tailroom(skb) >= padlen + KSZ_INGRESS_TAG_LEN) {
> +		if (skb_put_padto(skb, skb->len + padlen))
> +			return NULL;
> +
> 
> When it fails, skb will be freed twice: once in skb_put_padto() and once
> by the caller of dsa_slave_xmit().

You are right. I am not sure what the best way to fix tag_ksz.c is, other
than somehow open-coding skb_put_padto() minus the freeing-on-error part.
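
A minimal sketch of that idea, written as a userspace model rather than
kernel code. Everything here is hypothetical and for illustration only:
struct buf stands in for struct sk_buff, buf_put_padto() for a padding
helper that takes an explicit free_on_error flag, and xmit_pad() for a
tagger xmit hook whose caller frees the buffer when NULL is returned (as
described in the discussion above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct sk_buff: a heap buffer with a
 * current length and a fixed capacity.
 */
struct buf {
	unsigned char *data;
	size_t len;
	size_t cap;
};

static struct buf *buf_alloc(size_t cap)
{
	struct buf *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;
	b->data = calloc(1, cap);
	if (!b->data) {
		free(b);
		return NULL;
	}
	b->cap = cap;
	return b;
}

static void buf_free(struct buf *b)
{
	if (!b)
		return;
	free(b->data);
	free(b);
}

/* Zero-pad b up to len bytes. On failure the buffer is released only
 * if free_on_error is set; otherwise the caller still owns it.
 */
static int buf_put_padto(struct buf *b, size_t len, bool free_on_error)
{
	assert(b != NULL);

	if (len <= b->len)
		return 0;
	if (len > b->cap) {
		if (free_on_error)
			buf_free(b);
		return -1;
	}
	memset(b->data + b->len, 0, len - b->len);
	b->len = len;
	return 0;
}

/* Models an xmit hook whose caller frees the buffer on a NULL return,
 * so the hook must not free it on the error path.
 */
static struct buf *xmit_pad(struct buf *b, size_t min_len)
{
	if (buf_put_padto(b, min_len, false))
		return NULL;	/* b is still valid and owned by the caller */
	return b;
}
```

With the flag set to false on the in-place path, exactly one party (the
caller) releases the buffer on failure, which is the property the
double-free report above is about.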
-- 
Florian

* Re: [PATCH v2] mt7601u: check memory allocation failure
From: Jakub Kicinski @ 2017-08-21 22:25 UTC (permalink / raw)
  To: Christophe JAILLET
  Cc: kvalo, matthias.bgg, linux-wireless, netdev, linux-arm-kernel,
	linux-mediatek, linux-kernel, kernel-janitors
In-Reply-To: <20170821220617.21513-1-christophe.jaillet@wanadoo.fr>

On Tue, 22 Aug 2017 00:06:17 +0200, Christophe JAILLET wrote:
> Check memory allocation failure and return -ENOMEM in such a case, as
> already done a few lines below.
> 
> As 'dev->tx_q' can be NULL, we also need to check for that in
> 'mt7601u_free_tx()', and return early.
> 
> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>

Acked-by: Jakub Kicinski <kubakici@wp.pl>
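
For readers following the v1/v2 discussion, a minimal userspace sketch of
the pattern being acked: the allocation reports -ENOMEM, and the teardown
path returns early when the queue array was never allocated. The names
(struct dev, dev_alloc_tx(), dev_free_tx(), N_QUEUES, simulate_oom) are
hypothetical and only model the shape of the fix, not the actual mt7601u
code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define N_QUEUES 4	/* hypothetical queue count for this sketch */

struct tx_queue {
	int entries;
};

struct dev {
	struct tx_queue *tx_q;
};

/* Allocate the queue array and report -ENOMEM on failure instead of
 * continuing with a NULL pointer. simulate_oom forces the failure path.
 */
static int dev_alloc_tx(struct dev *dev, int simulate_oom)
{
	assert(dev != NULL);

	dev->tx_q = simulate_oom ? NULL : calloc(N_QUEUES, sizeof(*dev->tx_q));
	if (!dev->tx_q)
		return -ENOMEM;
	return 0;
}

/* Safe to call after a failed allocation: return early if the array
 * does not exist, so the per-queue teardown never dereferences NULL.
 */
static void dev_free_tx(struct dev *dev)
{
	int i;

	if (!dev->tx_q)
		return;

	for (i = 0; i < N_QUEUES; i++)
		dev->tx_q[i].entries = 0;	/* per-queue teardown goes here */

	free(dev->tx_q);
	dev->tx_q = NULL;
}
```

The early return lets a common error path call the free routine
unconditionally, whether or not the allocation succeeded.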

* Re: XDP redirect measurements, gotchas and tracepoints
From: Alexei Starovoitov @ 2017-08-21 22:35 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: xdp-newbies@vger.kernel.org, John Fastabend, Daniel Borkmann,
	Andy Gospodarek, netdev@vger.kernel.org, Paweł Staszewski
In-Reply-To: <20170821212506.1cb0d5d6@redhat.com>

On Mon, Aug 21, 2017 at 09:25:06PM +0200, Jesper Dangaard Brouer wrote:
> 
> Third gotcha(3): You got this far, loaded xdp on both interfaces, and
> notice now that (with default setup) you can RX with 14Mpps but only
> TX with 6.9Mpps (and might have 5% idle cycles).  I debugged this via
> perf tracepoint event xdp:xdp_redirect, and found this was due to
> overrunning the xdp TX ring-queue size.

We should probably fix this somehow.
Once the tx-ing netdev is added to the devmap, could we enable xdp on it automatically?

* [PATCH net-next 0/3 v7] Add support for rmnet driver
From: Subash Abhinov Kasiviswanathan @ 2017-08-21 22:36 UTC (permalink / raw)
  To: netdev, davem, fengguang.wu, dcbw, jiri, stephen, David.Laight,
	marcel
  Cc: Subash Abhinov Kasiviswanathan

This patch adds support for the rmnet driver which is required to
support recent chipsets using Qualcomm Technologies, Inc. modems. The data
from hardware follows the multiplexing and aggregation protocol (MAP).

This driver can be used to register onto any physical network device in
IP mode. Physical transports include USB, HSIC, PCIe and IP accelerator.

The rmnet driver decodes these packets and queues them to the network
stack (and encodes outgoing packets and transmits them to the physical device).
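
As an illustration of the decode step, here is a minimal userspace sketch
that parses the 4-byte MAP header laid out in
Documentation/networking/rmnet.txt (patch 3/3). It assumes that "bit 0" in
that layout is the most significant bit of the first byte and that the
payload length is in network byte order; both are assumptions of this
sketch. The names map_hdr_fields and map_parse_header() are hypothetical
and are not the driver's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_HDR_LEN 4

/* Decoded view of the MAP header fields (hypothetical helper type). */
struct map_hdr_fields {
	unsigned int is_command;	/* bit 0: 1 = command, 0 = data */
	unsigned int pad_len;		/* bits 2-7: padding bytes */
	unsigned int mux_id;		/* bits 8-15: multiplexer ID */
	unsigned int payload_len;	/* bits 16-31: includes padding */
};

/* Parse the first MAP_HDR_LEN bytes of buf into out.
 * Returns 0 on success, -1 if the buffer is too short.
 */
static int map_parse_header(const uint8_t *buf, size_t len,
			    struct map_hdr_fields *out)
{
	assert(buf != NULL && out != NULL);

	if (len < MAP_HDR_LEN)
		return -1;

	out->is_command = (buf[0] >> 7) & 0x1;
	out->pad_len = buf[0] & 0x3f;	/* reserved bit 1 is ignored */
	out->mux_id = buf[1];
	out->payload_len = ((unsigned int)buf[2] << 8) | buf[3];
	return 0;
}
```

For example, the header bytes 03 05 00 2c would decode as a data packet
with 3 bytes of padding, multiplexer ID 5 and a payload length of 44.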

--
v1: Same as the RFC patch with some minor fixes for issues reported by
kbuild test robot.

v1->v2: Change datatypes and remove config IOCTL as mentioned by David.
Also fix checkpatch issues and remove some unused code.

v2->v3: Move location to drivers/net and rename to rmnet. Change the
userspace - netlink communication from custom netlink to rtnl_link_ops.
Refactor some code. Use a fixed config for ingress and egress.

v3->v4: Move location to drivers/net/ethernet/qualcomm/.
Fix comments from Stephen and Jiri -
Split the ether and arp type changes into separate patches.
Remove debug and custom logging and switch to standard netdevice log.
Remove module parameters. Refactor and change some code style issues.

v4->v5: Rename some structs and variables. Move the initializer
before the for loop start. Put the arp type in correct sequence.

v5->v6: Fix comments from Dan -
Use the upper link API. As a result, remove all the refcounting logic.
Device refcount is explicitly held on real_dev on rx_handler
registration only. Modify the flow control struct. Remove the unused
ethernet mode handling.

v6->v7: Fix comments from David - Add newline to end of Makefile. Remove
inline from .c files. Move the module init/exit to rmnet config. Fix an
error reported by kbuild test robot for an unused file.

Subash Abhinov Kasiviswanathan (3):
  net: ether: Add support for multiplexing and aggregation type
  net: arp: Add support for raw IP device
  drivers: net: ethernet: qualcomm: rmnet: Initial implementation

 Documentation/networking/rmnet.txt                 |  82 ++++
 drivers/net/ethernet/qualcomm/Kconfig              |   2 +
 drivers/net/ethernet/qualcomm/Makefile             |   2 +
 drivers/net/ethernet/qualcomm/rmnet/Kconfig        |  12 +
 drivers/net/ethernet/qualcomm/rmnet/Makefile       |  12 +
 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 418 +++++++++++++++++++++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h |  54 +++
 .../net/ethernet/qualcomm/rmnet/rmnet_handlers.c   | 276 ++++++++++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_handlers.h   |  26 ++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h    |  88 +++++
 .../ethernet/qualcomm/rmnet/rmnet_map_command.c    | 116 ++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_map_data.c   | 105 ++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_private.h    |  45 +++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c    | 263 +++++++++++++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h    |  32 ++
 include/uapi/linux/if_arp.h                        |   1 +
 include/uapi/linux/if_ether.h                      |   4 +-
 17 files changed, 1537 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/networking/rmnet.txt
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Kconfig
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Makefile
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h

-- 
1.9.1

* [PATCH net-next 1/3 v7] net: ether: Add support for multiplexing and aggregation type
From: Subash Abhinov Kasiviswanathan @ 2017-08-21 22:36 UTC (permalink / raw)
  To: netdev, davem, fengguang.wu, dcbw, jiri, stephen, David.Laight,
	marcel
  Cc: Subash Abhinov Kasiviswanathan
In-Reply-To: <1503355019-12236-1-git-send-email-subashab@codeaurora.org>

Define the multiplexing and aggregation (MAP) ether type 0xDA1A. This
is needed for receiving data in the MAP protocol like RMNET. This is
not an officially registered ID.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
 include/uapi/linux/if_ether.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/if_ether.h b/include/uapi/linux/if_ether.h
index 5bc9bfd..e80b03f 100644
--- a/include/uapi/linux/if_ether.h
+++ b/include/uapi/linux/if_ether.h
@@ -104,7 +104,9 @@
 #define ETH_P_QINQ3	0x9300		/* deprecated QinQ VLAN [ NOT AN OFFICIALLY REGISTERED ID ] */
 #define ETH_P_EDSA	0xDADA		/* Ethertype DSA [ NOT AN OFFICIALLY REGISTERED ID ] */
 #define ETH_P_AF_IUCV   0xFBFB		/* IBM af_iucv [ NOT AN OFFICIALLY REGISTERED ID ] */
-
+#define ETH_P_MAP       0xDA1A          /* Multiplexing and Aggregation Protocol
+					 * [ NOT AN OFFICIALLY REGISTERED ID ]
+					 */
 #define ETH_P_802_3_MIN	0x0600		/* If the value in the ethernet type is less than this value
 					 * then the frame is Ethernet II. Else it is 802.3 */
 
-- 
1.9.1

* [PATCH net-next 2/3 v7] net: arp: Add support for raw IP device
From: Subash Abhinov Kasiviswanathan @ 2017-08-21 22:36 UTC (permalink / raw)
  To: netdev, davem, fengguang.wu, dcbw, jiri, stephen, David.Laight,
	marcel
  Cc: Subash Abhinov Kasiviswanathan
In-Reply-To: <1503355019-12236-1-git-send-email-subashab@codeaurora.org>

Define the raw IP type. This is needed for raw IP net devices
like rmnet.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
 include/uapi/linux/if_arp.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/if_arp.h b/include/uapi/linux/if_arp.h
index cf73510..a2a6356 100644
--- a/include/uapi/linux/if_arp.h
+++ b/include/uapi/linux/if_arp.h
@@ -59,6 +59,7 @@
 #define ARPHRD_LAPB	516		/* LAPB				*/
 #define ARPHRD_DDCMP    517		/* Digital's DDCMP protocol     */
 #define ARPHRD_RAWHDLC	518		/* Raw HDLC			*/
+#define ARPHRD_RAWIP    519		/* Raw IP                       */
 
 #define ARPHRD_TUNNEL	768		/* IPIP tunnel			*/
 #define ARPHRD_TUNNEL6	769		/* IP6IP6 tunnel       		*/
-- 
1.9.1

* [PATCH net-next 3/3 v7] drivers: net: ethernet: qualcomm: rmnet: Initial implementation
From: Subash Abhinov Kasiviswanathan @ 2017-08-21 22:36 UTC (permalink / raw)
  To: netdev, davem, fengguang.wu, dcbw, jiri, stephen, David.Laight,
	marcel
  Cc: Subash Abhinov Kasiviswanathan
In-Reply-To: <1503355019-12236-1-git-send-email-subashab@codeaurora.org>

The RmNet driver provides transport-agnostic MAP (multiplexing and
aggregation protocol) support in an embedded module. The module provides
virtual network devices which can be attached to any IP-mode
physical device. This will be used to provide all MAP functionality
on future hardware in a single consistent location.
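
To make the aggregation part concrete, here is a minimal userspace sketch
of de-aggregation: walking one linear buffer that holds several MAP frames
back to back and handing each payload to a callback. It follows the layout
in Documentation/networking/rmnet.txt (payload length includes padding but
not the 4-byte header) and assumes network byte order for the length and
that a zero-length frame is treated as malformed. map_deaggregate(),
map_frame_cb and map_sum_payload() are hypothetical names for this sketch,
not functions from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_HDR_LEN 4

/* Called once per frame with the payload, padding already excluded. */
typedef void (*map_frame_cb)(uint8_t mux_id, const uint8_t *payload,
			     size_t len, void *ctx);

/* Walk an aggregated buffer of MAP frames.
 * Returns the number of frames, or -1 if the buffer is malformed
 * (truncated frame, padding larger than payload, or trailing bytes).
 */
static int map_deaggregate(const uint8_t *buf, size_t len,
			   map_frame_cb cb, void *ctx)
{
	size_t off = 0;
	int frames = 0;

	assert(buf != NULL);

	while (len - off >= MAP_HDR_LEN) {
		const uint8_t *hdr = buf + off;
		size_t pad = hdr[0] & 0x3f;
		size_t plen = ((size_t)hdr[2] << 8) | hdr[3];

		if (plen == 0 || pad > plen ||
		    plen > len - off - MAP_HDR_LEN)
			return -1;

		if (cb)
			cb(hdr[1], hdr + MAP_HDR_LEN, plen - pad, ctx);

		off += MAP_HDR_LEN + plen;
		frames++;
	}

	return off == len ? frames : -1;
}

/* Example callback: accumulate the payload byte count in *ctx. */
static void map_sum_payload(uint8_t mux_id, const uint8_t *payload,
			    size_t len, void *ctx)
{
	(void)mux_id;
	(void)payload;
	*(size_t *)ctx += len;
}
```

A real receive path would also branch on the command/data bit and route
by multiplexer ID; the sketch only shows the framing walk.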

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
 Documentation/networking/rmnet.txt                 |  82 ++++
 drivers/net/ethernet/qualcomm/Kconfig              |   2 +
 drivers/net/ethernet/qualcomm/Makefile             |   2 +
 drivers/net/ethernet/qualcomm/rmnet/Kconfig        |  12 +
 drivers/net/ethernet/qualcomm/rmnet/Makefile       |  12 +
 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 418 +++++++++++++++++++++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h |  54 +++
 .../net/ethernet/qualcomm/rmnet/rmnet_handlers.c   | 276 ++++++++++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_handlers.h   |  26 ++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h    |  88 +++++
 .../ethernet/qualcomm/rmnet/rmnet_map_command.c    | 116 ++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_map_data.c   | 105 ++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_private.h    |  45 +++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c    | 263 +++++++++++++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h    |  32 ++
 15 files changed, 1533 insertions(+)
 create mode 100644 Documentation/networking/rmnet.txt
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Kconfig
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Makefile
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h

diff --git a/Documentation/networking/rmnet.txt b/Documentation/networking/rmnet.txt
new file mode 100644
index 0000000..6b341ea
--- /dev/null
+++ b/Documentation/networking/rmnet.txt
@@ -0,0 +1,82 @@
+1. Introduction
+
+The rmnet driver supports the Multiplexing and Aggregation
+Protocol (MAP). This protocol is used by all recent chipsets using Qualcomm
+Technologies, Inc. modems.
+
+This driver can be used to register onto any physical network device in
+IP mode. Physical transports include USB, HSIC, PCIe and IP accelerator.
+
+Multiplexing allows for creation of logical netdevices (rmnet devices) to
+handle multiple private data networks (PDN) like a default internet, tethering,
+multimedia messaging service (MMS) or IP media subsystem (IMS). Hardware sends
+packets with MAP headers to rmnet. Based on the multiplexer id, rmnet
+routes to the appropriate PDN after removing the MAP header.
+
+Aggregation is required to achieve high data rates. This involves hardware
+sending an aggregated bunch of MAP frames. The rmnet driver de-aggregates
+these MAP frames and sends them to the appropriate PDNs.
+
+2. Packet format
+
+a. MAP packet (data / control)
+
+MAP header has the same endianness as the IP packet.
+
+Packet format -
+
+Bit             0             1           2-7      8 - 15           16 - 31
+Function   Command / Data   Reserved     Pad   Multiplexer ID    Payload length
+Bit            32 - x
+Function     Raw  Bytes
+
+The Command (1) / Data (0) bit indicates whether the packet is a MAP command
+or data packet. Control packets are used for transport-level flow control. Data
+packets are standard IP packets.
+
+Reserved bits are usually zeroed out and are to be ignored by the receiver.
+
+Padding is the number of bytes added for 4-byte alignment, if required by
+hardware.
+
+Multiplexer ID indicates the PDN on which data has to be sent.
+
+Payload length includes the padding length but does not include MAP header
+length.
+
+b. MAP packet (command specific)
+
+Bit             0             1           2-7      8 - 15           16 - 31
+Function   Command         Reserved     Pad   Multiplexer ID    Payload length
+Bit          32 - 39        40 - 45    46 - 47       48 - 63
+Function   Command name    Reserved   Command Type   Reserved
+Bit          64 - 95
+Function   Transaction ID
+Bit          96 - 127
+Function   Command data
+
+Command 1 indicates disabling flow, while 2 indicates enabling flow.
+
+Command types -
+0 for MAP command request
+1 is to acknowledge the receipt of a command
+2 is for unsupported commands
+3 is for error during processing of commands
+
+c. Aggregation
+
+Aggregation is multiple MAP packets (can be data or command) delivered to
+rmnet in a single linear skb. rmnet will process the individual
+packets and either ACK the MAP command or deliver the IP packet to the
+network stack as needed.
+
+MAP header|IP Packet|Optional padding|MAP header|IP Packet|Optional padding....
+MAP header|IP Packet|Optional padding|MAP header|Command Packet|Optional pad...
+
+3. Userspace configuration
+
+rmnet userspace configuration is done through the netlink library librmnetctl
+and the command line utility rmnetcli. The utility is hosted in codeaurora forum git.
+The driver uses rtnl_link_ops for communication.
+
+https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/dataservices/tree/rmnetctl
diff --git a/drivers/net/ethernet/qualcomm/Kconfig b/drivers/net/ethernet/qualcomm/Kconfig
index 877675a..f520071 100644
--- a/drivers/net/ethernet/qualcomm/Kconfig
+++ b/drivers/net/ethernet/qualcomm/Kconfig
@@ -59,4 +59,6 @@ config QCOM_EMAC
 	  low power, Receive-Side Scaling (RSS), and IEEE 1588-2008
 	  Precision Clock Synchronization Protocol.
 
+source "drivers/net/ethernet/qualcomm/rmnet/Kconfig"
+
 endif # NET_VENDOR_QUALCOMM
diff --git a/drivers/net/ethernet/qualcomm/Makefile b/drivers/net/ethernet/qualcomm/Makefile
index 92fa7c4..1847350 100644
--- a/drivers/net/ethernet/qualcomm/Makefile
+++ b/drivers/net/ethernet/qualcomm/Makefile
@@ -9,3 +9,5 @@ obj-$(CONFIG_QCA7000_UART) += qcauart.o
 qcauart-objs := qca_uart.o
 
 obj-y += emac/
+
+obj-$(CONFIG_RMNET) += rmnet/
diff --git a/drivers/net/ethernet/qualcomm/rmnet/Kconfig b/drivers/net/ethernet/qualcomm/rmnet/Kconfig
new file mode 100644
index 0000000..4948f14
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/Kconfig
@@ -0,0 +1,12 @@
+#
+# RMNET MAP driver
+#
+
+menuconfig RMNET
+	depends on NETDEVICES
+	bool "RmNet MAP driver"
+	default n
+	---help---
+	  If you say Y here, then the rmnet module will be statically
+	  compiled into the kernel. The rmnet module provides MAP
+	  functionality for embedded and bridged traffic.
diff --git a/drivers/net/ethernet/qualcomm/rmnet/Makefile b/drivers/net/ethernet/qualcomm/rmnet/Makefile
new file mode 100644
index 0000000..1c43e2f
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/Makefile
@@ -0,0 +1,12 @@
+#
+# Makefile for the RMNET module
+#
+
+rmnet-y		 := rmnet_config.o
+rmnet-y		 += rmnet_vnd.o
+rmnet-y		 += rmnet_handlers.o
+rmnet-y		 += rmnet_map_data.o
+rmnet-y		 += rmnet_map_command.o
+obj-$(CONFIG_RMNET) += rmnet.o
+
+CFLAGS_rmnet.o := -I$(src)
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
new file mode 100644
index 0000000..94fe73a
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
@@ -0,0 +1,418 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET configuration engine
+ *
+ */
+
+#include <net/sock.h>
+#include <linux/module.h>
+#include <linux/netlink.h>
+#include <linux/netdevice.h>
+#include "rmnet_config.h"
+#include "rmnet_handlers.h"
+#include "rmnet_vnd.h"
+#include "rmnet_private.h"
+
+/* Local Definitions and Declarations */
+#define RMNET_LOCAL_LOGICAL_ENDPOINT -1
+
+struct rmnet_free_work {
+	struct work_struct work;
+	struct net_device *rmnet_dev;
+};
+
+static int rmnet_is_real_dev_registered(const struct net_device *real_dev)
+{
+	rx_handler_func_t *rx_handler;
+
+	rx_handler = rcu_dereference(real_dev->rx_handler);
+	return (rx_handler == rmnet_rx_handler);
+}
+
+static struct rmnet_real_dev_info*
+__rmnet_get_real_dev_info(const struct net_device *real_dev)
+{
+	if (rmnet_is_real_dev_registered(real_dev))
+		return (struct rmnet_real_dev_info *)
+			rcu_dereference(real_dev->rx_handler_data);
+	else
+		return NULL;
+}
+
+static struct rmnet_endpoint*
+rmnet_get_endpoint(struct net_device *dev, int config_id)
+{
+	struct rmnet_real_dev_info *rdinfo;
+	struct rmnet_endpoint *ep;
+
+	if (!rmnet_is_real_dev_registered(dev)) {
+		ep = rmnet_vnd_get_endpoint(dev);
+	} else {
+		rdinfo = __rmnet_get_real_dev_info(dev);
+
+		if (!rdinfo)
+			return NULL;
+
+		if (config_id == RMNET_LOCAL_LOGICAL_ENDPOINT)
+			ep = &rdinfo->local_ep;
+		else
+			ep = &rdinfo->muxed_ep[config_id];
+	}
+
+	return ep;
+}
+
+static int rmnet_unregister_real_device(struct net_device *real_dev)
+{
+	struct rmnet_real_dev_info *rdinfo;
+	struct list_head *iter;
+
+	ASSERT_RTNL();
+
+	if (!rmnet_is_real_dev_registered(real_dev) ||
+	    netdev_lower_get_next(real_dev, &iter))
+		return -EINVAL;
+
+	rdinfo = __rmnet_get_real_dev_info(real_dev);
+	kfree(rdinfo);
+
+	netdev_rx_handler_unregister(real_dev);
+
+	/* release reference on real_dev */
+	dev_put(real_dev);
+
+	netdev_info(real_dev, "Removed from rmnet\n");
+	return 0;
+}
+
+static int rmnet_register_real_device(struct net_device *real_dev)
+{
+	struct rmnet_real_dev_info *rdinfo;
+	int rc;
+
+	ASSERT_RTNL();
+
+	if (rmnet_is_real_dev_registered(real_dev))
+		return -EINVAL;
+
+	rdinfo = kzalloc(sizeof(*rdinfo), GFP_ATOMIC);
+	if (!rdinfo)
+		return -ENOMEM;
+
+	rdinfo->dev = real_dev;
+	rc = netdev_rx_handler_register(real_dev, rmnet_rx_handler, rdinfo);
+
+	if (rc) {
+		kfree(rdinfo);
+		return -EBUSY;
+	}
+
+	/* hold on to real dev for MAP data */
+	dev_hold(real_dev);
+
+	netdev_info(real_dev, "registered with rmnet\n");
+	return 0;
+}
+
+static int rmnet_set_ingress_data_format(struct net_device *dev, u32 idf)
+{
+	struct rmnet_real_dev_info *rdinfo;
+
+	ASSERT_RTNL();
+
+	netdev_info(dev, "Ingress format 0x%08X\n", idf);
+
+	rdinfo = __rmnet_get_real_dev_info(dev);
+	if (!rdinfo)
+		return -EINVAL;
+
+	rdinfo->ingress_data_format = idf;
+
+	return 0;
+}
+
+static int rmnet_set_egress_data_format(struct net_device *dev, u32 edf,
+					u16 agg_size, u16 agg_count)
+{
+	struct rmnet_real_dev_info *rdinfo;
+
+	ASSERT_RTNL();
+
+	netdev_info(dev, "Egress format 0x%08X agg size %d cnt %d\n",
+		    edf, agg_size, agg_count);
+
+	rdinfo = __rmnet_get_real_dev_info(dev);
+	if (!rdinfo)
+		return -EINVAL;
+
+	rdinfo->egress_data_format = edf;
+
+	return 0;
+}
+
+static int __rmnet_set_endpoint_config(struct net_device *dev, int config_id,
+				       struct rmnet_endpoint *ep)
+{
+	struct rmnet_endpoint *dev_ep;
+
+	ASSERT_RTNL();
+
+	dev_ep = rmnet_get_endpoint(dev, config_id);
+
+	if (!dev_ep)
+		return -EINVAL;
+
+	memcpy(dev_ep, ep, sizeof(struct rmnet_endpoint));
+	if (config_id == RMNET_LOCAL_LOGICAL_ENDPOINT)
+		dev_ep->mux_id = 0;
+	else
+		dev_ep->mux_id = config_id;
+
+	return 0;
+}
+
+static int __rmnet_unset_endpoint_config(struct net_device *dev, int config_id)
+{
+	struct rmnet_endpoint *ep = NULL;
+
+	ASSERT_RTNL();
+
+	ep = rmnet_get_endpoint(dev, config_id);
+	if (!ep)
+		return -EINVAL;
+
+	memset(ep, 0, sizeof(struct rmnet_endpoint));
+
+	return 0;
+}
+
+static int rmnet_set_endpoint_config(struct net_device *dev,
+				     int config_id, u8 rmnet_mode,
+				     struct net_device *egress_dev)
+{
+	struct rmnet_endpoint ep;
+
+	netdev_info(dev, "id %d mode %d dev %s\n",
+		    config_id, rmnet_mode, egress_dev->name);
+
+	if (config_id < RMNET_LOCAL_LOGICAL_ENDPOINT ||
+	    config_id >= RMNET_MAX_LOGICAL_EP)
+		return -EINVAL;
+
+	memset(&ep, 0, sizeof(struct rmnet_endpoint));
+	ep.rmnet_mode = rmnet_mode;
+	ep.egress_dev = egress_dev;
+
+	return __rmnet_set_endpoint_config(dev, config_id, &ep);
+}
+
+static int rmnet_unset_endpoint_config(struct net_device *dev, int config_id)
+{
+	netdev_info(dev, "id %d\n", config_id);
+
+	if (config_id < RMNET_LOCAL_LOGICAL_ENDPOINT ||
+	    config_id >= RMNET_MAX_LOGICAL_EP)
+		return -EINVAL;
+
+	return __rmnet_unset_endpoint_config(dev, config_id);
+}
+
+static int rmnet_newlink(struct net *src_net, struct net_device *dev,
+			 struct nlattr *tb[], struct nlattr *data[],
+			 struct netlink_ext_ack *extack)
+{
+	int ingress_format = RMNET_INGRESS_FORMAT_DEMUXING |
+			     RMNET_INGRESS_FORMAT_DEAGGREGATION |
+			     RMNET_INGRESS_FORMAT_MAP;
+	int egress_format = RMNET_EGRESS_FORMAT_MUXING |
+			    RMNET_EGRESS_FORMAT_MAP;
+	struct net_device *real_dev;
+	int mode = RMNET_EPMODE_VND;
+	u16 mux_id;
+
+	real_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK]));
+	if (!real_dev || !dev)
+		return -ENODEV;
+
+	if (!data[IFLA_VLAN_ID])
+		return -EINVAL;
+
+	mux_id = nla_get_u16(data[IFLA_VLAN_ID]);
+
+	rmnet_register_real_device(real_dev);
+
+	if (rmnet_vnd_newlink(real_dev, mux_id, dev))
+		return -EINVAL;
+
+	rmnet_set_egress_data_format(real_dev, egress_format, 0, 0);
+	rmnet_set_ingress_data_format(real_dev, ingress_format);
+	rmnet_set_endpoint_config(real_dev, mux_id, mode, dev);
+	rmnet_set_endpoint_config(dev, mux_id, mode, real_dev);
+	netdev_master_upper_dev_link(dev, real_dev, NULL, NULL);
+	return 0;
+}
+
+static void rmnet_delink(struct net_device *dev, struct list_head *head)
+{
+	struct net_device *real_dev;
+	int mux_id;
+
+	real_dev = netdev_master_upper_dev_get_rcu(dev);
+	if (real_dev) {
+		mux_id = rmnet_vnd_get_mux(real_dev, dev);
+
+		/* rmnet_vnd_get_mux() gives mux_id + 1,
+		 * so subtract 1 to get the correct mux_id
+		 */
+		mux_id--;
+		rmnet_unset_endpoint_config(real_dev, mux_id);
+		rmnet_unset_endpoint_config(dev, mux_id);
+		rmnet_vnd_remove_ref_dev(real_dev, mux_id);
+		netdev_upper_dev_unlink(dev, real_dev);
+		rmnet_unregister_real_device(real_dev);
+	}
+
+	unregister_netdevice_queue(dev, head);
+}
+
+static void rmnet_free_later(struct work_struct *work)
+{
+	struct rmnet_free_work *fwork;
+
+	fwork = container_of(work, struct rmnet_free_work, work);
+
+	rtnl_lock();
+	rmnet_delink(fwork->rmnet_dev, NULL);
+	rtnl_unlock();
+
+	kfree(fwork);
+}
+
+static int rmnet_dev_walk(struct net_device *lower_dev, void *data)
+{
+	struct net_device *real_dev = data;
+	struct rmnet_free_work *vnd_work;
+	int rc = 0;
+
+	netdev_upper_dev_unlink(lower_dev, real_dev);
+
+	vnd_work = kzalloc(sizeof(*vnd_work), GFP_KERNEL);
+	if (!vnd_work)
+		return -ENOMEM;
+
+	INIT_WORK(&vnd_work->work, rmnet_free_later);
+	vnd_work->rmnet_dev = lower_dev;
+	schedule_work(&vnd_work->work);
+
+	return rc;
+}
+
+static void rmnet_force_unassociate_device(struct net_device *dev)
+{
+	struct net_device *real_dev = dev;
+
+	if (!rmnet_is_real_dev_registered(real_dev))
+		return;
+
+	netdev_walk_all_lower_dev(real_dev, rmnet_dev_walk, real_dev);
+	rmnet_unregister_real_device(real_dev);
+}
+
+static int rmnet_config_notify_cb(struct notifier_block *nb,
+				  unsigned long event, void *data)
+{
+	struct net_device *dev = netdev_notifier_info_to_dev(data);
+
+	if (!dev)
+		return NOTIFY_DONE;
+
+	switch (event) {
+	case NETDEV_UNREGISTER_FINAL:
+	case NETDEV_UNREGISTER:
+		netdev_info(dev, "Kernel unregister\n");
+		rmnet_force_unassociate_device(dev);
+		break;
+
+	default:
+		break;
+	}
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block rmnet_dev_notifier __read_mostly = {
+	.notifier_call = rmnet_config_notify_cb,
+};
+
+static int rmnet_rtnl_validate(struct nlattr *tb[], struct nlattr *data[],
+			       struct netlink_ext_ack *extack)
+{
+	u16 mux_id;
+
+	if (!data || !data[IFLA_VLAN_ID])
+		return -EINVAL;
+
+	mux_id = nla_get_u16(data[IFLA_VLAN_ID]);
+	if (!mux_id || mux_id > (RMNET_MAX_LOGICAL_EP - 1))
+		return -ERANGE;
+
+	return 0;
+}
+
+static size_t rmnet_get_size(const struct net_device *dev)
+{
+	return nla_total_size(2); /* IFLA_VLAN_ID */
+}
+
+struct rtnl_link_ops rmnet_link_ops __read_mostly = {
+	.kind		= "rmnet",
+	.maxtype	= __IFLA_VLAN_MAX,
+	.priv_size	= sizeof(struct rmnet_priv),
+	.setup		= rmnet_vnd_setup,
+	.validate	= rmnet_rtnl_validate,
+	.newlink	= rmnet_newlink,
+	.dellink	= rmnet_delink,
+	.get_size	= rmnet_get_size,
+};
+
+struct rmnet_real_dev_info*
+rmnet_get_real_dev_info(struct net_device *real_dev)
+{
+	return __rmnet_get_real_dev_info(real_dev);
+}
+
+/* Startup/Shutdown */
+
+static int __init rmnet_init(void)
+{
+	int rc;
+
+	rc = register_netdevice_notifier(&rmnet_dev_notifier);
+	if (rc != 0)
+		return rc;
+
+	rc = rtnl_link_register(&rmnet_link_ops);
+	if (rc != 0) {
+		unregister_netdevice_notifier(&rmnet_dev_notifier);
+		return rc;
+	}
+	return rc;
+}
+
+static void __exit rmnet_exit(void)
+{
+	unregister_netdevice_notifier(&rmnet_dev_notifier);
+	rtnl_link_unregister(&rmnet_link_ops);
+}
+
+module_init(rmnet_init)
+module_exit(rmnet_exit)
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
new file mode 100644
index 0000000..8f5a073
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
@@ -0,0 +1,54 @@
+/* Copyright (c) 2013-2014, 2016-2017 The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data configuration engine
+ *
+ */
+
+#include <linux/skbuff.h>
+
+#ifndef _RMNET_CONFIG_H_
+#define _RMNET_CONFIG_H_
+
+#define RMNET_MAX_LOGICAL_EP 255
+#define RMNET_MAX_VND        32
+
+/* Information about the next device to deliver the packet to.
+ * Exact usage of this parameter depends on the rmnet_mode.
+ */
+struct rmnet_endpoint {
+	u8 rmnet_mode;
+	u8 mux_id;
+	struct net_device *egress_dev;
+};
+
+/* One instance of this structure is instantiated for each real_dev associated
+ * with rmnet.
+ */
+struct rmnet_real_dev_info {
+	struct net_device *dev;
+	struct rmnet_endpoint local_ep;
+	struct rmnet_endpoint muxed_ep[RMNET_MAX_LOGICAL_EP];
+	u32 ingress_data_format;
+	u32 egress_data_format;
+	struct net_device *rmnet_devices[RMNET_MAX_VND];
+};
+
+extern struct rtnl_link_ops rmnet_link_ops;
+
+struct rmnet_priv {
+	struct rmnet_endpoint local_ep;
+};
+
+struct rmnet_real_dev_info*
+rmnet_get_real_dev_info(struct net_device *real_dev);
+
+#endif /* _RMNET_CONFIG_H_ */
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
new file mode 100644
index 0000000..be2bd69
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
@@ -0,0 +1,276 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data ingress/egress handler
+ *
+ */
+
+#include <linux/netdevice.h>
+#include <linux/netdev_features.h>
+#include "rmnet_private.h"
+#include "rmnet_config.h"
+#include "rmnet_vnd.h"
+#include "rmnet_map.h"
+#include "rmnet_handlers.h"
+
+#define RMNET_IP_VERSION_4 0x40
+#define RMNET_IP_VERSION_6 0x60
+
+/* Helper Functions */
+
+static void rmnet_set_skb_proto(struct sk_buff *skb)
+{
+	switch (skb->data[0] & 0xF0) {
+	case RMNET_IP_VERSION_4:
+		skb->protocol = htons(ETH_P_IP);
+		break;
+	case RMNET_IP_VERSION_6:
+		skb->protocol = htons(ETH_P_IPV6);
+		break;
+	default:
+		skb->protocol = htons(ETH_P_MAP);
+		break;
+	}
+}
+
+/* Generic handler */
+
+static rx_handler_result_t
+rmnet_bridge_handler(struct sk_buff *skb, struct rmnet_endpoint *ep)
+{
+	if (!ep->egress_dev)
+		kfree_skb(skb);
+	else
+		rmnet_egress_handler(skb, ep);
+
+	return RX_HANDLER_CONSUMED;
+}
+
+static rx_handler_result_t
+rmnet_deliver_skb(struct sk_buff *skb, struct rmnet_endpoint *ep)
+{
+	switch (ep->rmnet_mode) {
+	case RMNET_EPMODE_NONE:
+		return RX_HANDLER_PASS;
+
+	case RMNET_EPMODE_BRIDGE:
+		return rmnet_bridge_handler(skb, ep);
+
+	case RMNET_EPMODE_VND:
+		skb_reset_transport_header(skb);
+		skb_reset_network_header(skb);
+		switch (rmnet_vnd_rx_fixup(skb, skb->dev)) {
+		case RX_HANDLER_CONSUMED:
+			return RX_HANDLER_CONSUMED;
+
+		case RX_HANDLER_PASS:
+			skb->pkt_type = PACKET_HOST;
+			skb_set_mac_header(skb, 0);
+			netif_receive_skb(skb);
+			return RX_HANDLER_CONSUMED;
+		}
+		return RX_HANDLER_PASS;
+
+	default:
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+}
+
+static rx_handler_result_t
+rmnet_ingress_deliver_packet(struct sk_buff *skb,
+			     struct rmnet_real_dev_info *rdinfo)
+{
+	if (!rdinfo) {
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+
+	skb->dev = rdinfo->local_ep.egress_dev;
+
+	return rmnet_deliver_skb(skb, &rdinfo->local_ep);
+}
+
+/* MAP handler */
+
+static rx_handler_result_t
+__rmnet_map_ingress_handler(struct sk_buff *skb,
+			    struct rmnet_real_dev_info *rdinfo)
+{
+	struct rmnet_endpoint *ep;
+	u8 mux_id;
+	u16 len;
+
+	if (RMNET_MAP_GET_CD_BIT(skb)) {
+		if (rdinfo->ingress_data_format
+		    & RMNET_INGRESS_FORMAT_MAP_COMMANDS)
+			return rmnet_map_command(skb, rdinfo);
+
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+
+	mux_id = RMNET_MAP_GET_MUX_ID(skb);
+	len = RMNET_MAP_GET_LENGTH(skb) - RMNET_MAP_GET_PAD(skb);
+
+	if (mux_id >= RMNET_MAX_LOGICAL_EP) {
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+
+	ep = &rdinfo->muxed_ep[mux_id];
+
+	if (rdinfo->ingress_data_format & RMNET_INGRESS_FORMAT_DEMUXING)
+		skb->dev = ep->egress_dev;
+
+	/* Subtract MAP header */
+	skb_pull(skb, sizeof(struct rmnet_map_header));
+	skb_trim(skb, len);
+	rmnet_set_skb_proto(skb);
+	return rmnet_deliver_skb(skb, ep);
+}
+
+static rx_handler_result_t
+rmnet_map_ingress_handler(struct sk_buff *skb,
+			  struct rmnet_real_dev_info *rdinfo)
+{
+	struct sk_buff *skbn;
+	int rc;
+
+	if (rdinfo->ingress_data_format & RMNET_INGRESS_FORMAT_DEAGGREGATION) {
+		while ((skbn = rmnet_map_deaggregate(skb, rdinfo)) != NULL)
+			__rmnet_map_ingress_handler(skbn, rdinfo);
+
+		consume_skb(skb);
+		rc = RX_HANDLER_CONSUMED;
+	} else {
+		rc = __rmnet_map_ingress_handler(skb, rdinfo);
+	}
+
+	return rc;
+}
+
+static int rmnet_map_egress_handler(struct sk_buff *skb,
+				    struct rmnet_real_dev_info *rdinfo,
+				    struct rmnet_endpoint *ep,
+				    struct net_device *orig_dev)
+{
+	int required_headroom, additional_header_len;
+	struct rmnet_map_header *map_header;
+
+	additional_header_len = 0;
+	required_headroom = sizeof(struct rmnet_map_header);
+
+	if (skb_headroom(skb) < required_headroom) {
+		if (pskb_expand_head(skb, required_headroom, 0, GFP_ATOMIC))
+			return RMNET_MAP_GENERAL_FAILURE;
+	}
+
+	map_header = rmnet_map_add_map_header(skb, additional_header_len, 0);
+	if (!map_header)
+		return RMNET_MAP_GENERAL_FAILURE;
+
+	if (rdinfo->egress_data_format & RMNET_EGRESS_FORMAT_MUXING) {
+		if (ep->mux_id == 0xff)
+			map_header->mux_id = 0;
+		else
+			map_header->mux_id = ep->mux_id;
+	}
+
+	skb->protocol = htons(ETH_P_MAP);
+
+	return RMNET_MAP_SUCCESS;
+}
+
+/* Ingress / Egress Entry Points */
+
+/* Processes packet as per ingress data format for receiving device. Logical
+ * endpoint is determined from packet inspection. Packet is then sent to the
+ * egress device listed in the logical endpoint configuration.
+ */
+rx_handler_result_t rmnet_rx_handler(struct sk_buff **pskb)
+{
+	struct rmnet_real_dev_info *rdinfo;
+	struct sk_buff *skb = *pskb;
+	struct net_device *dev;
+	int rc;
+
+	if (!skb)
+		return RX_HANDLER_CONSUMED;
+
+	dev = skb->dev;
+	rdinfo = rmnet_get_real_dev_info(dev);
+
+	if (rdinfo->ingress_data_format & RMNET_INGRESS_FORMAT_MAP) {
+		rc = rmnet_map_ingress_handler(skb, rdinfo);
+	} else {
+		switch (ntohs(skb->protocol)) {
+		case ETH_P_MAP:
+			if (rdinfo->local_ep.rmnet_mode ==
+				RMNET_EPMODE_BRIDGE) {
+				rc = rmnet_ingress_deliver_packet(skb, rdinfo);
+			} else {
+				kfree_skb(skb);
+				rc = RX_HANDLER_CONSUMED;
+			}
+			break;
+
+		case ETH_P_IP:
+		case ETH_P_IPV6:
+			rc = rmnet_ingress_deliver_packet(skb, rdinfo);
+			break;
+
+		default:
+			rc = RX_HANDLER_PASS;
+		}
+	}
+
+	return rc;
+}
+
+/* Modifies packet as per logical endpoint configuration and egress data format
+ * for egress device configured in logical endpoint. Packet is then transmitted
+ * on the egress device.
+ */
+void rmnet_egress_handler(struct sk_buff *skb,
+			  struct rmnet_endpoint *ep)
+{
+	struct rmnet_real_dev_info *rdinfo;
+	struct net_device *orig_dev;
+
+	orig_dev = skb->dev;
+	skb->dev = ep->egress_dev;
+
+	rdinfo = rmnet_get_real_dev_info(skb->dev);
+	if (!rdinfo) {
+		kfree_skb(skb);
+		return;
+	}
+
+	if (rdinfo->egress_data_format & RMNET_EGRESS_FORMAT_MAP) {
+		switch (rmnet_map_egress_handler(skb, rdinfo, ep, orig_dev)) {
+		case RMNET_MAP_CONSUMED:
+			return;
+
+		case RMNET_MAP_SUCCESS:
+			break;
+
+		default:
+			kfree_skb(skb);
+			return;
+		}
+	}
+
+	if (ep->rmnet_mode == RMNET_EPMODE_VND)
+		rmnet_vnd_tx_fixup(skb, orig_dev);
+
+	dev_queue_xmit(skb);
+}
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
new file mode 100644
index 0000000..f2638cf
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
@@ -0,0 +1,26 @@
+/* Copyright (c) 2013, 2016-2017 The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data ingress/egress handler
+ *
+ */
+
+#ifndef _RMNET_HANDLERS_H_
+#define _RMNET_HANDLERS_H_
+
+#include "rmnet_config.h"
+
+void rmnet_egress_handler(struct sk_buff *skb,
+			  struct rmnet_endpoint *ep);
+
+rx_handler_result_t rmnet_rx_handler(struct sk_buff **pskb);
+
+#endif /* _RMNET_HANDLERS_H_ */
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
new file mode 100644
index 0000000..2aabad2
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
@@ -0,0 +1,88 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _RMNET_MAP_H_
+#define _RMNET_MAP_H_
+
+struct rmnet_map_control_command {
+	u8  command_name;
+	u8  cmd_type:2;
+	u8  reserved:6;
+	u16 reserved2;
+	u32 transaction_id;
+	union {
+		struct {
+			u16 ip_family:2;
+			u16 reserved:14;
+			u16 flow_control_seq_num;
+			u32 qos_id;
+		} flow_control;
+		u8 data[0];
+	};
+}  __aligned(1);
+
+enum rmnet_map_results {
+	RMNET_MAP_SUCCESS,
+	RMNET_MAP_CONSUMED,
+	RMNET_MAP_GENERAL_FAILURE,
+	RMNET_MAP_NOT_ENABLED,
+	RMNET_MAP_FAILED_AGGREGATION,
+	RMNET_MAP_FAILED_MUX
+};
+
+enum rmnet_map_commands {
+	RMNET_MAP_COMMAND_NONE,
+	RMNET_MAP_COMMAND_FLOW_DISABLE,
+	RMNET_MAP_COMMAND_FLOW_ENABLE,
+	/* These should always be the last 2 elements */
+	RMNET_MAP_COMMAND_UNKNOWN,
+	RMNET_MAP_COMMAND_ENUM_LENGTH
+};
+
+struct rmnet_map_header {
+	u8  pad_len:6;
+	u8  reserved_bit:1;
+	u8  cd_bit:1;
+	u8  mux_id;
+	u16 pkt_len;
+}  __aligned(1);
+
+#define RMNET_MAP_GET_MUX_ID(Y) (((struct rmnet_map_header *) \
+				 (Y)->data)->mux_id)
+#define RMNET_MAP_GET_CD_BIT(Y) (((struct rmnet_map_header *) \
+				(Y)->data)->cd_bit)
+#define RMNET_MAP_GET_PAD(Y) (((struct rmnet_map_header *) \
+				(Y)->data)->pad_len)
+#define RMNET_MAP_GET_CMD_START(Y) ((struct rmnet_map_control_command *) \
+				    ((Y)->data + \
+				      sizeof(struct rmnet_map_header)))
+#define RMNET_MAP_GET_LENGTH(Y) (ntohs(((struct rmnet_map_header *) \
+					(Y)->data)->pkt_len))
+
+#define RMNET_MAP_COMMAND_REQUEST     0
+#define RMNET_MAP_COMMAND_ACK         1
+#define RMNET_MAP_COMMAND_UNSUPPORTED 2
+#define RMNET_MAP_COMMAND_INVALID     3
+
+#define RMNET_MAP_NO_PAD_BYTES        0
+#define RMNET_MAP_ADD_PAD_BYTES       1
+
+u8 rmnet_map_demultiplex(struct sk_buff *skb);
+struct sk_buff *rmnet_map_deaggregate(struct sk_buff *skb,
+				      struct rmnet_real_dev_info *rdinfo);
+
+struct rmnet_map_header *rmnet_map_add_map_header(struct sk_buff *skb,
+						  int hdrlen, int pad);
+rx_handler_result_t rmnet_map_command(struct sk_buff *skb,
+				      struct rmnet_real_dev_info *rdinfo);
+
+#endif /* _RMNET_MAP_H_ */
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
new file mode 100644
index 0000000..2de93e5
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
@@ -0,0 +1,116 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/netdevice.h>
+#include "rmnet_config.h"
+#include "rmnet_map.h"
+#include "rmnet_private.h"
+#include "rmnet_vnd.h"
+
+static u8 rmnet_map_do_flow_control(struct sk_buff *skb,
+				    struct rmnet_real_dev_info *rdinfo,
+				    int enable)
+{
+	struct rmnet_map_control_command *cmd;
+	struct rmnet_endpoint *ep;
+	struct net_device *vnd;
+	u16 ip_family;
+	u16 fc_seq;
+	u32 qos_id;
+	u8 mux_id;
+	int r;
+
+	if (unlikely(!skb || !rdinfo))
+		return RX_HANDLER_CONSUMED;
+
+	mux_id = RMNET_MAP_GET_MUX_ID(skb);
+	cmd = RMNET_MAP_GET_CMD_START(skb);
+
+	if (mux_id >= RMNET_MAX_LOGICAL_EP) {
+		kfree_skb(skb);
+		return RX_HANDLER_CONSUMED;
+	}
+
+	ep = &rdinfo->muxed_ep[mux_id];
+	vnd = ep->egress_dev;
+
+	ip_family = cmd->flow_control.ip_family;
+	fc_seq = ntohs(cmd->flow_control.flow_control_seq_num);
+	qos_id = ntohl(cmd->flow_control.qos_id);
+
+	/* Ignore the ip family and pass the sequence number for both v4 and v6
+	 * sequence. User space does not support creating dedicated flows for
+	 * the 2 protocols
+	 */
+	r = rmnet_vnd_do_flow_control(rdinfo, vnd, enable);
+	if (r) {
+		kfree_skb(skb);
+		return RMNET_MAP_COMMAND_UNSUPPORTED;
+	} else {
+		return RMNET_MAP_COMMAND_ACK;
+	}
+}
+
+static void rmnet_map_send_ack(struct sk_buff *skb,
+			       unsigned char type,
+			       struct rmnet_real_dev_info *rdinfo)
+{
+	struct rmnet_map_control_command *cmd;
+	int xmit_status;
+
+	if (unlikely(!skb))
+		return;
+
+	skb->protocol = htons(ETH_P_MAP);
+
+	cmd = RMNET_MAP_GET_CMD_START(skb);
+	cmd->cmd_type = type & 0x03;
+
+	netif_tx_lock(skb->dev);
+	xmit_status = skb->dev->netdev_ops->ndo_start_xmit(skb, skb->dev);
+	netif_tx_unlock(skb->dev);
+}
+
+/* Process MAP command frame and send N/ACK message as appropriate. Message cmd
+ * name is decoded here and appropriate handler is called.
+ */
+rx_handler_result_t rmnet_map_command(struct sk_buff *skb,
+				      struct rmnet_real_dev_info *rdinfo)
+{
+	struct rmnet_map_control_command *cmd;
+	unsigned char command_name;
+	unsigned char rc = 0;
+
+	if (unlikely(!skb))
+		return RX_HANDLER_CONSUMED;
+
+	cmd = RMNET_MAP_GET_CMD_START(skb);
+	command_name = cmd->command_name;
+
+	switch (command_name) {
+	case RMNET_MAP_COMMAND_FLOW_ENABLE:
+		rc = rmnet_map_do_flow_control(skb, rdinfo, 1);
+		break;
+
+	case RMNET_MAP_COMMAND_FLOW_DISABLE:
+		rc = rmnet_map_do_flow_control(skb, rdinfo, 0);
+		break;
+
+	default:
+		rc = RMNET_MAP_COMMAND_UNSUPPORTED;
+		kfree_skb(skb);
+		break;
+	}
+	if (rc == RMNET_MAP_COMMAND_ACK)
+		rmnet_map_send_ack(skb, rc, rdinfo);
+	return RX_HANDLER_CONSUMED;
+}
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
new file mode 100644
index 0000000..6d16c6ac
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
@@ -0,0 +1,105 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data MAP protocol
+ *
+ */
+
+#include <linux/netdevice.h>
+#include "rmnet_config.h"
+#include "rmnet_map.h"
+#include "rmnet_private.h"
+
+#define RMNET_MAP_DEAGGR_SPACING  64
+#define RMNET_MAP_DEAGGR_HEADROOM (RMNET_MAP_DEAGGR_SPACING / 2)
+
+/* Adds MAP header to front of skb->data
+ * Padding is calculated and set appropriately in MAP header. Mux ID is
+ * initialized to 0.
+ */
+struct rmnet_map_header *rmnet_map_add_map_header(struct sk_buff *skb,
+						  int hdrlen, int pad)
+{
+	struct rmnet_map_header *map_header;
+	u32 padding, map_datalen;
+	u8 *padbytes;
+
+	if (skb_headroom(skb) < sizeof(struct rmnet_map_header))
+		return NULL;
+
+	map_datalen = skb->len - hdrlen;
+	map_header = (struct rmnet_map_header *)
+			skb_push(skb, sizeof(struct rmnet_map_header));
+	memset(map_header, 0, sizeof(struct rmnet_map_header));
+
+	if (pad == RMNET_MAP_NO_PAD_BYTES) {
+		map_header->pkt_len = htons(map_datalen);
+		return map_header;
+	}
+
+	padding = ALIGN(map_datalen, 4) - map_datalen;
+
+	if (padding == 0)
+		goto done;
+
+	if (skb_tailroom(skb) < padding)
+		return NULL;
+
+	padbytes = (u8 *)skb_put(skb, padding);
+	memset(padbytes, 0, padding);
+
+done:
+	map_header->pkt_len = htons(map_datalen + padding);
+	map_header->pad_len = padding & 0x3F;
+
+	return map_header;
+}
+
+/* Deaggregates a single packet
+ * A whole new buffer is allocated for each portion of an aggregated frame.
+ * Caller should keep calling deaggregate() on the source skb until NULL is
+ * returned, indicating that there are no more packets to deaggregate. Caller
+ * is responsible for freeing the original skb.
+ */
+struct sk_buff *rmnet_map_deaggregate(struct sk_buff *skb,
+				      struct rmnet_real_dev_info *rdinfo)
+{
+	struct rmnet_map_header *maph;
+	struct sk_buff *skbn;
+	u32 packet_len;
+
+	if (skb->len == 0)
+		return NULL;
+
+	maph = (struct rmnet_map_header *)skb->data;
+	packet_len = ntohs(maph->pkt_len) + sizeof(struct rmnet_map_header);
+
+	if (((int)skb->len - (int)packet_len) < 0)
+		return NULL;
+
+	skbn = alloc_skb(packet_len + RMNET_MAP_DEAGGR_SPACING, GFP_ATOMIC);
+	if (!skbn)
+		return NULL;
+
+	skbn->dev = skb->dev;
+	skb_reserve(skbn, RMNET_MAP_DEAGGR_HEADROOM);
+	skb_put(skbn, packet_len);
+	memcpy(skbn->data, skb->data, packet_len);
+	skb_pull(skb, packet_len);
+
+	/* Some hardware can send us empty frames. Catch them */
+	if (ntohs(maph->pkt_len) == 0) {
+		kfree_skb(skbn);
+		return NULL;
+	}
+
+	return skbn;
+}
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
new file mode 100644
index 0000000..ed820b5
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
@@ -0,0 +1,45 @@
+/* Copyright (c) 2013-2014, 2016-2017 The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _RMNET_PRIVATE_H_
+#define _RMNET_PRIVATE_H_
+
+#define RMNET_MAX_VND              32
+#define RMNET_MAX_PACKET_SIZE      16384
+#define RMNET_DFLT_PACKET_SIZE     1500
+#define RMNET_NEEDED_HEADROOM      16
+#define RMNET_TX_QUEUE_LEN         1000
+
+/* Constants */
+#define RMNET_EGRESS_FORMAT__RESERVED__         BIT(0)
+#define RMNET_EGRESS_FORMAT_MAP                 BIT(1)
+#define RMNET_EGRESS_FORMAT_AGGREGATION         BIT(2)
+#define RMNET_EGRESS_FORMAT_MUXING              BIT(3)
+#define RMNET_EGRESS_FORMAT_MAP_CKSUMV3         BIT(4)
+#define RMNET_EGRESS_FORMAT_MAP_CKSUMV4         BIT(5)
+
+#define RMNET_INGRESS_FIX_ETHERNET              BIT(0)
+#define RMNET_INGRESS_FORMAT_MAP                BIT(1)
+#define RMNET_INGRESS_FORMAT_DEAGGREGATION      BIT(2)
+#define RMNET_INGRESS_FORMAT_DEMUXING           BIT(3)
+#define RMNET_INGRESS_FORMAT_MAP_COMMANDS       BIT(4)
+#define RMNET_INGRESS_FORMAT_MAP_CKSUMV3        BIT(5)
+#define RMNET_INGRESS_FORMAT_MAP_CKSUMV4        BIT(6)
+
+/* Pass the frame up the stack with no modifications to skb->dev */
+#define RMNET_EPMODE_NONE (0)
+/* Replace skb->dev with a virtual rmnet device and pass up the stack */
+#define RMNET_EPMODE_VND (1)
+/* Pass the frame directly to another device with dev_queue_xmit() */
+#define RMNET_EPMODE_BRIDGE (2)
+
+#endif /* _RMNET_PRIVATE_H_ */
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
new file mode 100644
index 0000000..2813d84
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
@@ -0,0 +1,263 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ *
+ * RMNET Data virtual network driver
+ *
+ */
+
+#include <linux/etherdevice.h>
+#include <linux/if_arp.h>
+#include <net/pkt_sched.h>
+#include "rmnet_config.h"
+#include "rmnet_handlers.h"
+#include "rmnet_private.h"
+#include "rmnet_map.h"
+#include "rmnet_vnd.h"
+
+/* RX/TX Fixup */
+
+int rmnet_vnd_rx_fixup(struct sk_buff *skb, struct net_device *dev)
+{
+	if (unlikely(!dev || !skb))
+		return RX_HANDLER_CONSUMED;
+
+	dev->stats.rx_packets++;
+	dev->stats.rx_bytes += skb->len;
+
+	return RX_HANDLER_PASS;
+}
+
+int rmnet_vnd_tx_fixup(struct sk_buff *skb, struct net_device *dev)
+{
+	struct rmnet_priv *priv;
+
+	priv = netdev_priv(dev);
+
+	if (unlikely(!dev || !skb))
+		return RX_HANDLER_CONSUMED;
+
+	dev->stats.tx_packets++;
+	dev->stats.tx_bytes += skb->len;
+
+	return RX_HANDLER_PASS;
+}
+
+/* Network Device Operations */
+
+static netdev_tx_t rmnet_vnd_start_xmit(struct sk_buff *skb,
+					struct net_device *dev)
+{
+	struct rmnet_priv *priv;
+
+	priv = netdev_priv(dev);
+	if (priv->local_ep.egress_dev) {
+		rmnet_egress_handler(skb, &priv->local_ep);
+	} else {
+		dev->stats.tx_dropped++;
+		kfree_skb(skb);
+	}
+	return NETDEV_TX_OK;
+}
+
+static int rmnet_vnd_change_mtu(struct net_device *rmnet_dev, int new_mtu)
+{
+	if (new_mtu < 0 || new_mtu > RMNET_MAX_PACKET_SIZE)
+		return -EINVAL;
+
+	rmnet_dev->mtu = new_mtu;
+	return 0;
+}
+
+static const struct net_device_ops rmnet_vnd_ops = {
+	.ndo_start_xmit = rmnet_vnd_start_xmit,
+	.ndo_change_mtu = rmnet_vnd_change_mtu,
+};
+
+/* Called by kernel whenever a new rmnet<n> device is created. Sets MTU,
+ * flags, ARP type, needed headroom, etc...
+ */
+void rmnet_vnd_setup(struct net_device *rmnet_dev)
+{
+	struct rmnet_priv *priv;
+
+	/* Clear out private data */
+	priv = netdev_priv(rmnet_dev);
+	memset(priv, 0, sizeof(struct rmnet_priv));
+
+	netdev_info(rmnet_dev, "Setting up device %s\n", rmnet_dev->name);
+
+	rmnet_dev->netdev_ops = &rmnet_vnd_ops;
+	rmnet_dev->mtu = RMNET_DFLT_PACKET_SIZE;
+	rmnet_dev->needed_headroom = RMNET_NEEDED_HEADROOM;
+	random_ether_addr(rmnet_dev->dev_addr);
+	rmnet_dev->tx_queue_len = RMNET_TX_QUEUE_LEN;
+
+	/* Raw IP mode */
+	rmnet_dev->header_ops = NULL;  /* No header */
+	rmnet_dev->type = ARPHRD_RAWIP;
+	rmnet_dev->hard_header_len = 0;
+	rmnet_dev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST);
+
+	rmnet_dev->needs_free_netdev = true;
+}
+
+/* Exposed API */
+
+int rmnet_vnd_newlink(struct net_device *real_dev, int id,
+		      struct net_device *rmnet_dev)
+{
+	struct rmnet_real_dev_info *rdinfo;
+	int rc;
+
+	rdinfo = rmnet_get_real_dev_info(real_dev);
+
+	if (rdinfo->rmnet_devices[id])
+		return -EINVAL;
+
+	rc = register_netdevice(rmnet_dev);
+	if (!rc) {
+		rdinfo->rmnet_devices[id] = rmnet_dev;
+		rmnet_dev->rtnl_link_ops = &rmnet_link_ops;
+	}
+	return rc;
+}
+
+/* Unregisters the virtual network device node and frees it.
+ * unregister_netdev locks the rtnl mutex, so the mutex must not be locked
+ * by the caller of the function. unregister_netdev enqueues the request to
+ * unregister the device into a TODO queue. The requests in the TODO queue
+ * are only done after rtnl mutex is unlocked, therefore free_netdev has to
+ * be called after unlocking rtnl mutex.
+ */
+int rmnet_vnd_free_dev(struct net_device *real_dev, int id)
+{
+	struct rmnet_real_dev_info *rdinfo;
+	struct net_device *rmnet_dev;
+	struct rmnet_endpoint *ep;
+
+	rdinfo = rmnet_get_real_dev_info(real_dev);
+
+	rtnl_lock();
+	if (id < 0 || id >= RMNET_MAX_VND || !rdinfo->rmnet_devices[id]) {
+		rtnl_unlock();
+		return -EINVAL;
+	}
+
+	ep = rmnet_vnd_get_endpoint(rdinfo->rmnet_devices[id]);
+	if (ep) {
+		rtnl_unlock();
+		return -EINVAL;
+	}
+
+	rmnet_dev = rdinfo->rmnet_devices[id];
+	rdinfo->rmnet_devices[id] = NULL;
+	rtnl_unlock();
+
+	if (rmnet_dev) {
+		unregister_netdev(rmnet_dev);
+		free_netdev(rmnet_dev);
+		return 0;
+	} else {
+		return -EINVAL;
+	}
+}
+
+int rmnet_vnd_remove_ref_dev(struct net_device *real_dev, int id)
+{
+	struct rmnet_real_dev_info *rdinfo;
+	struct rmnet_endpoint *ep;
+
+	rdinfo = rmnet_get_real_dev_info(real_dev);
+	if (id < 0 || id >= RMNET_MAX_VND || !rdinfo->rmnet_devices[id])
+		return -EINVAL;
+
+	ep = rmnet_vnd_get_endpoint(rdinfo->rmnet_devices[id]);
+	rdinfo->rmnet_devices[id] = NULL;
+	return 0;
+}
+
+/* Searches through list of known RmNet virtual devices. This function is O(n)
+ * and should not be used in the data path.
+ *
+ * To get the real id, subtract 1 from this result.
+ */
+int rmnet_vnd_get_mux(struct net_device *real_dev,
+		      struct net_device *rmnet_dev)
+{
+	/* This is not an efficient search, but, this will only be called in
+	 * a configuration context, and the list is small.
+	 */
+	struct rmnet_real_dev_info *rdinfo;
+	int i;
+
+	rdinfo = rmnet_get_real_dev_info(real_dev);
+
+	if (!rmnet_dev)
+		return 0;
+
+	for (i = 0; i < RMNET_MAX_VND; i++)
+		if (rmnet_dev == rdinfo->rmnet_devices[i])
+			return i + 1;
+
+	return 0;
+}
+
+/* Gets the logical endpoint configuration for a RmNet virtual network device
+ * node. Caller should confirm that the device is an RmNet VND before calling.
+ */
+struct rmnet_endpoint *rmnet_vnd_get_endpoint(struct net_device *rmnet_dev)
+{
+	struct rmnet_priv *priv;
+
+	if (!rmnet_dev)
+		return NULL;
+
+	priv = netdev_priv(rmnet_dev);
+	if (!priv)
+		return NULL;
+
+	return &priv->local_ep;
+}
+
+int rmnet_vnd_do_flow_control(struct rmnet_real_dev_info *rdinfo,
+			      struct net_device *rmnet_dev, int enable)
+{
+	struct rmnet_priv *priv;
+
+	priv = netdev_priv(rmnet_dev);
+	if (unlikely(!priv))
+		return -EINVAL;
+
+	netdev_info(rmnet_dev, "Setting VND TX queue state to %d\n", enable);
+	/* Although we expect similar number of enable/disable
+	 * commands, optimize for the disable. That is more
+	 * latency sensitive than enable
+	 */
+	if (unlikely(enable))
+		netif_wake_queue(rmnet_dev);
+	else
+		netif_stop_queue(rmnet_dev);
+
+	return 0;
+}
+
+struct net_device *rmnet_vnd_get_by_id(struct net_device *real_dev, int id)
+{
+	struct rmnet_real_dev_info *rdinfo;
+
+	rdinfo = rmnet_get_real_dev_info(real_dev);
+
+	if (id < 0 || id >= RMNET_MAX_VND)
+		return NULL;
+
+	return rdinfo->rmnet_devices[id];
+}
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h
new file mode 100644
index 0000000..3020646
--- /dev/null
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h
@@ -0,0 +1,32 @@
+/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * RMNET Data Virtual Network Device APIs
+ *
+ */
+
+#ifndef _RMNET_VND_H_
+#define _RMNET_VND_H_
+
+int rmnet_vnd_do_flow_control(struct rmnet_real_dev_info *rdinfo,
+			      struct net_device *dev, int enable);
+struct rmnet_endpoint *rmnet_vnd_get_endpoint(struct net_device *dev);
+int rmnet_vnd_free_dev(struct net_device *real_dev, int id);
+int rmnet_vnd_remove_ref_dev(struct net_device *real_dev, int id);
+int rmnet_vnd_rx_fixup(struct sk_buff *skb, struct net_device *dev);
+int rmnet_vnd_tx_fixup(struct sk_buff *skb, struct net_device *dev);
+int rmnet_vnd_get_mux(struct net_device *real_dev,
+		      struct net_device *rmnet_dev);
+struct net_device *rmnet_vnd_get_by_id(struct net_device *real_dev, int id);
+void rmnet_vnd_setup(struct net_device *dev);
+int rmnet_vnd_newlink(struct net_device *real_dev, int id,
+		      struct net_device *new_device);
+#endif /* _RMNET_VND_H_ */
-- 
1.9.1
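
For readers following the MAP framing in rmnet_map_data.c, here is a rough
userspace sketch of the header layout and the deaggregation walk. It is not
part of the patch. The names map_hdr, map_frame_write, map_hdr_parse and
map_count_frames are hypothetical, and the bit positions assume the
little-endian bitfield layout of struct rmnet_map_header (pad_len in the low
6 bits of byte 0, cd_bit in the top bit). The writer always pads to a
multiple of 4, which corresponds to the RMNET_MAP_ADD_PAD_BYTES path only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAP_HDR_LEN 4u

/* Parsed view of one MAP header (hypothetical userspace type). */
struct map_hdr {
	uint8_t  pad_len;	/* low 6 bits of byte 0 */
	uint8_t  cd_bit;	/* top bit of byte 0: 1 = command frame */
	uint8_t  mux_id;	/* byte 1 */
	uint16_t pkt_len;	/* bytes 2-3, big-endian: payload + padding */
};

/* Write a MAP data header followed by the payload, zero-padded to a multiple
 * of 4. Returns the number of bytes written, or 0 if the output buffer is too
 * small or the padded payload does not fit the 16-bit length field.
 */
static size_t map_frame_write(uint8_t *out, size_t cap, uint8_t mux_id,
			      const uint8_t *payload, size_t len)
{
	size_t pad = (4 - (len & 3)) & 3;	/* same as ALIGN(len, 4) - len */
	size_t total = MAP_HDR_LEN + len + pad;

	if (len + pad > 0xFFFF || total > cap)
		return 0;

	out[0] = (uint8_t)(pad & 0x3F);		/* cd_bit = 0: data frame */
	out[1] = mux_id;
	out[2] = (uint8_t)((len + pad) >> 8);
	out[3] = (uint8_t)((len + pad) & 0xFF);
	memcpy(out + MAP_HDR_LEN, payload, len);
	memset(out + MAP_HDR_LEN + len, 0, pad);
	return total;
}

/* Decode the header at buf; the caller guarantees 4 readable bytes. */
static struct map_hdr map_hdr_parse(const uint8_t *buf)
{
	struct map_hdr h;

	h.pad_len = buf[0] & 0x3F;
	h.cd_bit  = buf[0] >> 7;
	h.mux_id  = buf[1];
	h.pkt_len = (uint16_t)((buf[2] << 8) | buf[3]);
	return h;
}

/* Walk an aggregated buffer and count complete, non-empty frames. The walk
 * stops at the first truncated or zero-length frame.
 */
static size_t map_count_frames(const uint8_t *buf, size_t len)
{
	size_t off = 0, n = 0;

	while (len - off >= MAP_HDR_LEN) {
		struct map_hdr h = map_hdr_parse(buf + off);
		size_t frame = MAP_HDR_LEN + h.pkt_len;

		if (h.pkt_len == 0 || frame > len - off)
			break;
		off += frame;
		n++;
	}
	return n;
}
```

The payload length seen by the stack is pkt_len minus pad_len, which is what
__rmnet_map_ingress_handler() computes before skb_trim(). Unlike the sketch,
rmnet_map_deaggregate() does not check that a full header is present before
reading it.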

^ permalink raw reply related

* Re: [PATCH net] udp: on peeking bad csum, drop packets even if not at head
From: Willem de Bruijn @ 2017-08-21 22:37 UTC (permalink / raw)
  To: Network Development
  Cc: David Miller, Paolo Abeni, Willem de Bruijn, Eric Dumazet
In-Reply-To: <20170821213912.93333-1-willemdebruijn.kernel@gmail.com>

On Mon, Aug 21, 2017 at 5:39 PM, Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
> From: Willem de Bruijn <willemb@google.com>
>
> When peeking, if a bad csum is discovered, the skb is unlinked from
> the queue with __sk_queue_drop_skb and the peek operation restarted.
>
> __sk_queue_drop_skb only drops packets that match the queue head. With
> sk_peek_off, the skb need not be at head, causing the call to fail and
> the same skb to be found again on restart.
>
> Walk the queue to find the correct skb. Limit the walk to sk_peek_off,
> to bound cycle cost to at most twice the original skb_queue_walk in
> __skb_try_recv_from_queue.
>
> The operation may race with updates to sk_peek_off. As the operation
> is retried, it will eventually succeed.
>
> Signed-off-by: Willem de Bruijn <willemb@google.com>

Eric just suggested an alternative that does not require looping, which
is much nicer.

^ permalink raw reply

* Re: [PATCH net] udp: on peeking bad csum, drop packets even if not at head
From: Eric Dumazet @ 2017-08-21 22:40 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: netdev, davem, pabeni, Willem de Bruijn
In-Reply-To: <20170821213912.93333-1-willemdebruijn.kernel@gmail.com>

On Mon, 2017-08-21 at 17:39 -0400, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> When peeking, if a bad csum is discovered, the skb is unlinked from
> the queue with __sk_queue_drop_skb and the peek operation restarted.
> 
> __sk_queue_drop_skb only drops packets that match the queue head. With
> sk_peek_off, the skb need not be at head, causing the call to fail and
> the same skb to be found again on restart.
> 
> Walk the queue to find the correct skb. Limit the walk to sk_peek_off,
> to bound cycle cost to at most twice the original skb_queue_walk in
> __skb_try_recv_from_queue.
> 
> The operation may race with updates to sk_peek_off. As the operation
> is retried, it will eventually succeed.
> 
> Signed-off-by: Willem de Bruijn <willemb@google.com>

You forgot the Fixes: tag, that such a bug fix deserves.

I am not a big fan of your patch and would prefer a solution without the
loop.

An skb already has ->next and ->prev pointers telling us its position in
the receive queue.

We only need to make sure we are the last owner of this skb before doing
the check (i.e. cancel the kfree_skb() that usually follows the call to
__sk_queue_drop_skb()) of skb->next being NULL or not.

Something like :

 include/net/sock.h  |    2 +-
 net/core/datagram.c |   22 ++++++++++++++--------
 net/ipv4/udp.c      |    2 +-
 net/ipv6/udp.c      |    2 +-
 4 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index aeeec62992ca..6e43bab92d95 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2030,7 +2030,7 @@ void sk_reset_timer(struct sock *sk, struct timer_list *timer,
 void sk_stop_timer(struct sock *sk, struct timer_list *timer);
 
 int __sk_queue_drop_skb(struct sock *sk, struct sk_buff_head *sk_queue,
-			struct sk_buff *skb, unsigned int flags,
+			struct sk_buff **pskb, unsigned int flags,
 			void (*destructor)(struct sock *sk,
 					   struct sk_buff *skb));
 int __sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb);
diff --git a/net/core/datagram.c b/net/core/datagram.c
index a21ca8dee5ea..7e129f91af89 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -353,21 +353,27 @@ void __skb_free_datagram_locked(struct sock *sk, struct sk_buff *skb, int len)
 EXPORT_SYMBOL(__skb_free_datagram_locked);
 
 int __sk_queue_drop_skb(struct sock *sk, struct sk_buff_head *sk_queue,
-			struct sk_buff *skb, unsigned int flags,
+			struct sk_buff **pskb, unsigned int flags,
 			void (*destructor)(struct sock *sk,
 					   struct sk_buff *skb))
 {
 	int err = 0;
 
 	if (flags & MSG_PEEK) {
+		struct sk_buff *skb = *pskb;
+
 		err = -ENOENT;
 		spin_lock_bh(&sk_queue->lock);
-		if (skb == skb_peek(sk_queue)) {
-			__skb_unlink(skb, sk_queue);
-			refcount_dec(&skb->users);
-			if (destructor)
-				destructor(sk, skb);
-			err = 0;
+		refcount_dec(&skb->users);
+		*pskb = NULL;
+		if (refcount_dec_if_one(&skb->users)) {
+			if (skb->next) {
+				__skb_unlink(skb, sk_queue);
+				if (destructor)
+					destructor(sk, skb);
+				err = 0;
+			}
+			__kfree_skb(skb);
 		}
 		spin_unlock_bh(&sk_queue->lock);
 	}
@@ -400,7 +406,7 @@ EXPORT_SYMBOL(__sk_queue_drop_skb);
 
 int skb_kill_datagram(struct sock *sk, struct sk_buff *skb, unsigned int flags)
 {
-	int err = __sk_queue_drop_skb(sk, &sk->sk_receive_queue, skb, flags,
+	int err = __sk_queue_drop_skb(sk, &sk->sk_receive_queue, &skb, flags,
 				      NULL);
 
 	kfree_skb(skb);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index cd1d044a7fa5..b5f90b845a6f 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1648,7 +1648,7 @@ int udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int noblock,
 	return err;
 
 csum_copy_err:
-	if (!__sk_queue_drop_skb(sk, &udp_sk(sk)->reader_queue, skb, flags,
+	if (!__sk_queue_drop_skb(sk, &udp_sk(sk)->reader_queue, &skb, flags,
 				 udp_skb_destructor)) {
 		UDP_INC_STATS(sock_net(sk), UDP_MIB_CSUMERRORS, is_udplite);
 		UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 20039c8501eb..214a973571fd 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -465,7 +465,7 @@ int udpv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 	return err;
 
 csum_copy_err:
-	if (!__sk_queue_drop_skb(sk, &udp_sk(sk)->reader_queue, skb, flags,
+	if (!__sk_queue_drop_skb(sk, &udp_sk(sk)->reader_queue, &skb, flags,
 				 udp_skb_destructor)) {
 		if (is_udp4) {
 			UDP_INC_STATS(sock_net(sk),
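
The idea of using ->next to tell whether an skb is still queued can be
illustrated outside the kernel. The following is only a userspace sketch
under stated assumptions: struct node, struct queue and the queue_*()
helpers are hypothetical stand-ins for struct sk_buff, struct sk_buff_head
and the skb queue helpers, and locking and reference counting are left out
entirely. It relies on the unlink helper clearing ->next and ->prev, as
__skb_unlink() does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sk_buff and struct sk_buff_head. */
struct node {
	struct node *next;
	struct node *prev;
};

struct queue {
	struct node head;	/* sentinel; the list is circular */
	unsigned int qlen;
};

static void queue_init(struct queue *q)
{
	q->head.next = &q->head;
	q->head.prev = &q->head;
	q->qlen = 0;
}

/* Append n at the tail; n must not already be linked. */
static void queue_tail(struct queue *q, struct node *n)
{
	assert(n->next == NULL && n->prev == NULL);
	n->prev = q->head.prev;
	n->next = &q->head;
	q->head.prev->next = n;
	q->head.prev = n;
	q->qlen++;
}

/* Unlink n and clear its pointers, so "unlinked" is observable later. */
static void queue_unlink(struct queue *q, struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = NULL;
	n->prev = NULL;
	q->qlen--;
}

/*
 * Drop n wherever it sits in the queue. No walk is needed: a non-NULL
 * ->next means n is still linked. Returns 0 on success, or -ENOENT if
 * n was already removed by someone else.
 */
static int queue_drop(struct queue *q, struct node *n)
{
	if (!n->next)
		return -ENOENT;
	queue_unlink(q, n);
	return 0;
}
```

The drop costs O(1) regardless of the element's position, which is the
property the loop-free approach is after. In the kernel the same check is
only safe under the queue lock and once the caller is the last owner.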

^ permalink raw reply related

* Re: [PATCH net-next,1/4] hv_netvsc: Clean up unused parameter from netvsc_get_hash()
From: David Miller @ 2017-08-21 23:12 UTC (permalink / raw)
  To: haiyangz, haiyangz; +Cc: netdev, kys, olaf, vkuznets, linux-kernel
In-Reply-To: <1503352555-9256-1-git-send-email-haiyangz@exchange.microsoft.com>


All proper patch series must have a header "[PATCH xxx 0/N]" posting
which explains at a high level what the patch series does, how it does
it, and why it is doing it that way.

Therefore, please resubmit this patch series with a proper header
posting.

Thank you.

^ permalink raw reply

* Re: [PATCH net-next] net: dsa: User per-cpu 64-bit statistics
From: Florian Fainelli @ 2017-08-21 23:23 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, davem, Andrew Lunn, Vivien Didelot, David S. Miller,
	open list
In-Reply-To: <1501866711.25002.46.camel@edumazet-glaptop3.roam.corp.google.com>

On 08/04/2017 10:11 AM, Eric Dumazet wrote:
> On Fri, 2017-08-04 at 08:51 -0700, Florian Fainelli wrote:
>> On 08/03/2017 10:36 PM, Eric Dumazet wrote:
>>> On Thu, 2017-08-03 at 21:33 -0700, Florian Fainelli wrote:
>>>> During testing with a background iperf pushing 1Gbit/sec worth of
>>>> traffic and having both ifconfig and ethtool collect statistics, we
>>>> could see quite frequent deadlocks. Convert the often accessed DSA slave
>>>> network devices statistics to per-cpu 64-bit statistics to remove these
>>>> deadlocks and provide fast efficient statistics updates.
>>>>
>>>
>>> This seems to be a bug fix, it would be nice to get a proper tag like :
>>>
>>> Fixes: f613ed665bb3 ("net: dsa: Add support for 64-bit statistics")
>>
>> Right, should have been added, thanks!
>>
>>>
>>> Problem here is that if multiple cpus can call dsa_switch_rcv() at the
>>> same time, then u64_stats_update_begin() contract is not respected.
>>
>> This is really where I struggled understanding what is wrong in the
>> non-per CPU version, my understanding is that we have:
>>
>> - writers for xmit executes in process context
>> - writers for receive executes from NAPI (from the DSA's master network
>> device through it's own NAPI doing netif_receive_skb -> netdev_uses_dsa
>> -> netif_receive_skb)
>>
>> readers should all execute in process context. The test scenario that
>> led to a deadlock involved running iperf in the background, having a
>> while loop with both ifconfig and ethtool reading stats, and somehow
>> when iperf exited, either reader would just be locked. So I guess this
>> leaves us with the two writers not being mutually excluded then, right?
> 
> You could add a debug version of u64_stats_update_begin()
> 
> doing 
> 
> int ret = atomic_inc((atomic_t *)syncp);
> 
> BUG_ON(ret & 1);
> 
> And u64_stats_update_end()
> 
> int ret = atomic_inc((atomic_t *)syncp);

so with your revised suggested patch:

static inline void u64_stats_update_begin(struct u64_stats_sync *syncp)
{
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
        int ret = atomic_inc_return((atomic_t *)syncp);
        BUG_ON(ret & 1);
#endif
#if 0
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
        write_seqcount_begin(&syncp->seq);
#endif
#endif
}

static inline void u64_stats_update_end(struct u64_stats_sync *syncp)
{
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
        int ret = atomic_inc_return((atomic_t *)syncp);
        BUG_ON(!(ret & 1));
#endif
#if 0
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
        write_seqcount_end(&syncp->seq);
#endif
#endif
}

and this makes us choke pretty early in IRQ accounting, did I get your
suggestion right?

[    0.015149] ------------[ cut here ]------------
[    0.020051] kernel BUG at ./include/linux/u64_stats_sync.h:82!
[    0.026221] Internal error: Oops - BUG: 0 [#1] SMP ARM
[    0.031661] Modules linked in:
[    0.034970] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
4.13.0-rc5-01297-g7d3f0cd43fee-dirty #33
[    0.043990] Hardware name: Broadcom STB (Flattened Device Tree)
[    0.050237] task: c180a500 task.stack: c1800000
[    0.055065] PC is at irqtime_account_delta+0xa4/0xa8
[    0.060322] LR is at 0x1
[    0.063057] pc : [<c0250504>]    lr : [<00000001>]    psr: 000001d3
[    0.069652] sp : c1801eec  ip : ee78b458  fp : c0e5ea48
[    0.075212] r10: c18b4b40  r9 : f0803000  r8 : ee00a800
[    0.080781] r7 : 00000001  r6 : c180a500  r5 : c1800000  r4 : 00000000
[    0.087680] r3 : 00000000  r2 : 0000ec8c  r1 : ee78b3c0  r0 : ee78b440
[    0.094546] Flags: nzcv  IRQs off  FIQs off  Mode SVC_32  ISA ARM
Segment user
[    0.102314] Control: 30c5387d  Table: 00003000  DAC: fffffffd
[    0.108414] Process swapper/0 (pid: 0, stack limit = 0xc1800210)
[    0.114791] Stack: (0xc1801eec to 0xc1802000)
[    0.119431] 1ee0:                            ee78b440 c1800000
c180a500 00000001 c02505c8
[    0.128079] 1f00: 00000004 ee00a800 ffffe000 00000000 00000000
c0227890 c17e6f20 c0278910
[    0.136665] 1f20: c185724c c18079a0 f080200c c1801f58 f0802000
c0201494 c0e00c18 20000053
[    0.145303] 1f40: ffffffff c1801f8c ffffffff c1800000 c18b4b40
c020d238 00000000 0000001f
[    0.153915] 1f60: 00040d00 00000000 efffc940 00000000 c18b4b40
c1807440 ffffffff 00000000
[    0.162571] 1f80: c18b4b40 c0e5ea48 00000004 c1801fa8 c0322fb0
c0e00c18 20000053 ffffffff
[    0.171226] 1fa0: c18b4b40 00000000 ffffffff ffffffff 00000000
c0e006c0 ffffffff 00000000
[    0.179890] 1fc0: 00000000 c1807448 c0e5ea48 00000000 00000000
c18b4dd4 c180745c c0e5ea44
[    0.188546] 1fe0: c180c0d0 00007000 420f00f3 00000000 00000000
00008090 00000000 00000000
[    0.197165] [<c0250504>] (irqtime_account_delta) from [<c02505c8>]
(irqtime_account_irq+0xc0/0xc4)
[    0.206664] [<c02505c8>] (irqtime_account_irq) from [<c0227890>]
(irq_exit+0x28/0x154)
[    0.215012] [<c0227890>] (irq_exit) from [<c0278910>]
(__handle_domain_irq+0x60/0xb4)
[    0.223245] [<c0278910>] (__handle_domain_irq) from [<c0201494>]
(gic_handle_irq+0x48/0x8c)
[    0.232035] [<c0201494>] (gic_handle_irq) from [<c020d238>]
(__irq_svc+0x58/0x74)
[    0.239941] Exception stack(0xc1801f58 to 0xc1801fa0)
[    0.245327] 1f40:
  00000000 0000001f
[    0.253948] 1f60: 00040d00 00000000 efffc940 00000000 c18b4b40
c1807440 ffffffff 00000000
[    0.262534] 1f80: c18b4b40 c0e5ea48 00000004 c1801fa8 c0322fb0
c0e00c18 20000053 ffffffff
[    0.271144] [<c020d238>] (__irq_svc) from [<c0e00c18>]
(start_kernel+0x300/0x410)
[    0.279028] [<c0e00c18>] (start_kernel) from [<00008090>] (0x8090)
[    0.285547] Code: f57ff05b e3130001 18bd80f0 e7f001f2 (e7f001f2)
[    0.291978] ---[ end trace f68728a0d3053b52 ]---
[    0.296871] Kernel panic - not syncing: Fatal exception in interrupt
[    0.303622] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt
-- 
Florian

^ permalink raw reply

* [PATCH] once: switch to new jump label API
From: Eric Biggers @ 2017-08-21 23:42 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Hannes Frederic Sowa, Jason Baron, Peter Zijlstra,
	Eric Biggers

From: Eric Biggers <ebiggers@google.com>

Switch the DO_ONCE() macro from the deprecated jump label API to the new
one.  The new one is more readable, and for DO_ONCE() it also makes the
generated code more icache-friendly: now the one-time initialization
code is placed out-of-line at the jump target, rather than at the inline
fallthrough case.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 include/linux/once.h | 6 +++---
 lib/once.c           | 8 ++++----
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/once.h b/include/linux/once.h
index 9c98aaa87cbc..724724918e8b 100644
--- a/include/linux/once.h
+++ b/include/linux/once.h
@@ -5,7 +5,7 @@
 #include <linux/jump_label.h>
 
 bool __do_once_start(bool *done, unsigned long *flags);
-void __do_once_done(bool *done, struct static_key *once_key,
+void __do_once_done(bool *done, struct static_key_true *once_key,
 		    unsigned long *flags);
 
 /* Call a function exactly once. The idea of DO_ONCE() is to perform
@@ -38,8 +38,8 @@ void __do_once_done(bool *done, struct static_key *once_key,
 	({								     \
 		bool ___ret = false;					     \
 		static bool ___done = false;				     \
-		static struct static_key ___once_key = STATIC_KEY_INIT_TRUE; \
-		if (static_key_true(&___once_key)) {			     \
+		static DEFINE_STATIC_KEY_TRUE(___once_key);		     \
+		if (static_branch_unlikely(&___once_key)) {		     \
 			unsigned long ___flags;				     \
 			___ret = __do_once_start(&___done, &___flags);	     \
 			if (unlikely(___ret)) {				     \
diff --git a/lib/once.c b/lib/once.c
index 05c8604627eb..831c5a6b0bb2 100644
--- a/lib/once.c
+++ b/lib/once.c
@@ -5,7 +5,7 @@
 
 struct once_work {
 	struct work_struct work;
-	struct static_key *key;
+	struct static_key_true *key;
 };
 
 static void once_deferred(struct work_struct *w)
@@ -14,11 +14,11 @@ static void once_deferred(struct work_struct *w)
 
 	work = container_of(w, struct once_work, work);
 	BUG_ON(!static_key_enabled(work->key));
-	static_key_slow_dec(work->key);
+	static_branch_disable(work->key);
 	kfree(work);
 }
 
-static void once_disable_jump(struct static_key *key)
+static void once_disable_jump(struct static_key_true *key)
 {
 	struct once_work *w;
 
@@ -51,7 +51,7 @@ bool __do_once_start(bool *done, unsigned long *flags)
 }
 EXPORT_SYMBOL(__do_once_start);
 
-void __do_once_done(bool *done, struct static_key *once_key,
+void __do_once_done(bool *done, struct static_key_true *once_key,
 		    unsigned long *flags)
 	__releases(once_lock)
 {
-- 
2.14.1.480.gb18f417b89-goog
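
For readers unfamiliar with the DO_ONCE() pattern, the control flow can be
sketched in userspace. This is an analogue only, not the kernel mechanism:
a plain atomic flag stands in for the static key (which the kernel patches
into the instruction stream), and struct once_sketch, once_sketch_init(),
once_sketch_run() and bump_counter() are hypothetical names invented for
the illustration. Unlike the kernel code, which serializes callers with a
spinlock, a caller that loses the race here returns without waiting for
the winner to finish.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace analogue of the state behind DO_ONCE(). */
struct once_sketch {
	atomic_bool key_enabled;	/* plays the role of ___once_key */
	atomic_flag done;		/* plays the role of ___done */
};

static void once_sketch_init(struct once_sketch *o)
{
	atomic_init(&o->key_enabled, true);
	atomic_flag_clear(&o->done);
}

/* Runs fn(arg) at most once; returns true for the call that ran it. */
static bool once_sketch_run(struct once_sketch *o,
			    void (*fn)(void *), void *arg)
{
	assert(fn != NULL);

	/* Fast path: once the "key" is disabled, nothing else is touched. */
	if (!atomic_load_explicit(&o->key_enabled, memory_order_acquire))
		return false;

	/* Slow path: only the first caller to set `done` may run fn. */
	if (atomic_flag_test_and_set(&o->done))
		return false;

	fn(arg);
	atomic_store_explicit(&o->key_enabled, false, memory_order_release);
	return true;
}

/* Example callback: counts how many times it has been invoked. */
static void bump_counter(void *arg)
{
	++*(int *)arg;
}
```

After the first successful call every later call takes the early return,
which corresponds to the out-of-line initialization path described in the
commit message no longer being reached.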

^ permalink raw reply related

* Re: [RFC net-next v2] bridge lwtunnel, VPLS & NVGRE
From: Stephen Hemminger @ 2017-08-22  0:01 UTC (permalink / raw)
  To: David Lamparter; +Cc: netdev, bridge, amine.kherbouche, roopa
In-Reply-To: <20170821171523.951260-1-equinox@diac24.net>

On Mon, 21 Aug 2017 19:15:17 +0200
David Lamparter <equinox@diac24.net> wrote:

> Hi all,
> 
> 
> this is an update on the earlier "[RFC net-next] VPLS support".  Note
> I've changed the subject lines on some of the patches to better reflect
> what they really do (tbh the earlier subject lines were crap.)
> 
> As previously, iproute2 / FRR patches are at:
> - https://github.com/eqvinox/vpls-iproute2
> - https://github.com/opensourcerouting/frr/commits/vpls
> while this patchset is also available at:
> - https://github.com/eqvinox/vpls-linux-kernel
> (but please be aware that I'm amending and rebasing commits)
> 
> The NVGRE implementation in the 3rd patch in this series is actually an
> accident - I was just wiring up gretap as a reference;  only after I was
> done I noticed that that sums up to NVGRE, more or less.  IMHO, it does
> serve well to demonstrate the bridge changes are not VPLS-specific.
> 
> To refer some notes from the first announce mail:
> > I've tested some basic setups, the chain from LDP down into the kernel
> > works at least in these.  FRR has some testcases around from OpenBSD
> > VPLS support, I haven't wired that up to run against Linux / this
> > patchset yet.  
> 
> Same as before (API didn't change).
> 
> > The patchset needs a lot of polishing (yes I left my TODO notes in the
> > commit messages), for now my primary concern is overall design
> > feedback.  Roopa has already provided a lot of input (Thanks!);  the
> > major topic I'm expecting to get discussion on is the bridge FDB
> > changes.  
> 
> Got some useful input;  but still need feedback on the bridge FDB
> changes (first 2 patches).  I don't believe it to have a significant
> impact on existing bridge operation, and I believe a multipoint tunnel
> driver without its own FDB (e.g. NVGRE in this set) should perform
> better than one with its own FDB (e.g. existing VXLAN).
> 
> > P.S.: For a little context on the bridge FDB changes - I'm hoping to
> > find some time to extend this to the MDB to allow aggregating dst
> > metadata and handing down a list of dst metas on TX.  This isn't
> > specifically for VPLS but rather to give sufficient information to the
> > 802.11 stack to allow it to optimize selecting rates (or unicasting)
> > for multicast traffic by having the multicast subscriber list known.
> > This is done by major commercial wifi solutions (e.g. google "dynamic
> > multicast optimization".)  
> 
> You can find hacks at this on:
> https://github.com/eqvinox/vpls-linux-kernel/tree/mdb-hack
> Please note that the patches in that branch are not at an acceptable
> quality level, but you can see the semantic relation to 802.11.
> 
> I would, however, like to point out that this branch has pseudo-working
> IGMP/MLD snooping for VPLS, and it'd be 20-ish lines to add it to NVGRE
> (I'll do that as soon as I get to it, it'll pop up on that branch too.)
> 
> This is relevant to the discussion because it's a feature which is
> non-obvious (to me) on how to do with the VXLAN model of having an
> entirely separate FDB.  Meanwhile, with this architecture, the proof of
> concept / hack is coming in at a measly cost of:
> 8 files changed, 176 insertions(+), 15 deletions(-)
> 
> 
> Cheers,
> 
> -David
> 
> 
> --- diffstat:
> include/linux/netdevice.h      |  18 ++++++
> include/net/dst_metadata.h     |  51 ++++++++++++++---
> include/net/ip_tunnels.h       |   5 ++
> include/uapi/linux/lwtunnel.h  |   8 +++
> include/uapi/linux/neighbour.h |   2 +
> include/uapi/linux/rtnetlink.h |   5 ++
> net/bridge/br.c                |   2 +-
> net/bridge/br_device.c         |   4 ++
> net/bridge/br_fdb.c            | 119 ++++++++++++++++++++++++++++++++--------
> net/bridge/br_input.c          |   6 +-
> net/bridge/br_private.h        |   6 +-
> net/core/lwtunnel.c            |   1 +
> net/ipv4/ip_gre.c              |  40 ++++++++++++--
> net/ipv4/ip_tunnel.c           |   1 +
> net/ipv4/ip_tunnel_core.c      |  87 +++++++++++++++++++++++------
> net/mpls/Kconfig               |  11 ++++
> net/mpls/Makefile              |   1 +
> net/mpls/af_mpls.c             | 113 ++++++++++++++++++++++++++++++++------
> net/mpls/internal.h            |  44 +++++++++++++--
> net/mpls/vpls.c                | 550 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 20 files changed, 990 insertions(+), 84 deletions(-)

I know the bridge is an easy target to extend L2 forwarding, but it is not
the only option. Have you considered building a new driver (like VXLAN does)
which does the forwarding you want? Having all features in one driver
makes for worse performance and increased complexity.

^ permalink raw reply

* Re: [PATCH net-next] net: dsa: User per-cpu 64-bit statistics
From: Florian Fainelli @ 2017-08-22  0:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, davem, Andrew Lunn, Vivien Didelot, David S. Miller,
	open list
In-Reply-To: <bd260f17-3996-adb5-5c69-353f2a483f84@gmail.com>

On 08/21/2017 04:23 PM, Florian Fainelli wrote:
> On 08/04/2017 10:11 AM, Eric Dumazet wrote:
>> On Fri, 2017-08-04 at 08:51 -0700, Florian Fainelli wrote:
>>> On 08/03/2017 10:36 PM, Eric Dumazet wrote:
>>>> On Thu, 2017-08-03 at 21:33 -0700, Florian Fainelli wrote:
>>>>> During testing with a background iperf pushing 1Gbit/sec worth of
>>>>> traffic and having both ifconfig and ethtool collect statistics, we
>>>>> could see quite frequent deadlocks. Convert the often accessed DSA slave
>>>>> network devices statistics to per-cpu 64-bit statistics to remove these
>>>>> deadlocks and provide fast efficient statistics updates.
>>>>>
>>>>
>>>> This seems to be a bug fix, it would be nice to get a proper tag like :
>>>>
>>>> Fixes: f613ed665bb3 ("net: dsa: Add support for 64-bit statistics")
>>>
>>> Right, should have been added, thanks!
>>>
>>>>
>>>> Problem here is that if multiple cpus can call dsa_switch_rcv() at the
>>>> same time, then u64_stats_update_begin() contract is not respected.
>>>
>>> This is really where I struggled understanding what is wrong in the
>>> non-per CPU version, my understanding is that we have:
>>>
>>> - writers for xmit executes in process context
>>> - writers for receive executes from NAPI (from the DSA's master network
>>> device through it's own NAPI doing netif_receive_skb -> netdev_uses_dsa
>>> -> netif_receive_skb)
>>>
>>> readers should all execute in process context. The test scenario that
>>> led to a deadlock involved running iperf in the background, having a
>>> while loop with both ifconfig and ethtool reading stats, and somehow
>>> when iperf exited, either reader would just be locked. So I guess this
>>> leaves us with the two writers not being mutually excluded then, right?
>>
>> You could add a debug version of u64_stats_update_begin()
>>
>> doing 
>>
>> int ret = atomic_inc((atomic_t *)syncp);
>>
>> BUG_ON(ret & 1);
>>
>> And u64_stats_update_end()
>>
>> int ret = atomic_inc((atomic_t *)syncp);
> 
> so with your revised suggested patch:
> 
> static inline void u64_stats_update_begin(struct u64_stats_sync *syncp)
> {
> #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
>         int ret = atomic_inc_return((atomic_t *)syncp);
>         BUG_ON(ret & 1);
> #endif
> #if 0
> #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
>         write_seqcount_begin(&syncp->seq);
> #endif
> #endif
> }
> 
> static inline void u64_stats_update_end(struct u64_stats_sync *syncp)
> {
> #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
>         int ret = atomic_inc_return((atomic_t *)syncp);
>         BUG_ON(!(ret & 1));
> #endif
> #if 0
> #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
>         write_seqcount_end(&syncp->seq);
> #endif
> #endif
> }
> 
> and this makes us choke pretty early in IRQ accounting, did I get your
> suggestion right?

Well if we return 1 from atomic_inc_return() and the previous value was
zero, of course we are going to be bugging here. The idea behind the
patch I suppose is to make sure that we always get an odd number upon
u64_stats_update_begin()/entry, and an even number upon
u64_stats_update_end()/exit, right?
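
A small userspace sketch of that parity reading, assuming the intent is
"odd after begin, even after end". The names struct seq_sketch,
inc_return(), sketch_update_begin() and sketch_update_end() are
hypothetical and only mimic the shape of the kernel helpers;
atomic_inc_return() is modelled as returning the value after the
increment. Under this reading the first begin yields 1, so a
BUG_ON(ret & 1) placed in begin fires immediately.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct u64_stats_sync. */
struct seq_sketch {
	atomic_uint seq;
};

static void seq_sketch_init(struct seq_sketch *s)
{
	atomic_init(&s->seq, 0);
}

/* Mimics atomic_inc_return(): returns the value after the increment. */
static unsigned int inc_return(struct seq_sketch *s)
{
	assert(s != NULL);
	return atomic_fetch_add(&s->seq, 1) + 1;
}

/* Begin of a write section: the new value is expected to be odd. */
static bool sketch_update_begin(struct seq_sketch *s)
{
	return (inc_return(s) & 1) == 1;
}

/* End of a write section: the new value is expected to be even. */
static bool sketch_update_end(struct seq_sketch *s)
{
	return (inc_return(s) & 1) == 0;
}
```

Each helper returns false exactly where the corresponding debug check
would trip: two writers entering begin without an end in between leave
the counter even after the second begin.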

> 
> [    0.015149] ------------[ cut here ]------------
> [    0.020051] kernel BUG at ./include/linux/u64_stats_sync.h:82!
> [    0.026221] Internal error: Oops - BUG: 0 [#1] SMP ARM
> [    0.031661] Modules linked in:
> [    0.034970] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
> 4.13.0-rc5-01297-g7d3f0cd43fee-dirty #33
> [    0.043990] Hardware name: Broadcom STB (Flattened Device Tree)
> [    0.050237] task: c180a500 task.stack: c1800000
> [    0.055065] PC is at irqtime_account_delta+0xa4/0xa8
> [    0.060322] LR is at 0x1
> [    0.063057] pc : [<c0250504>]    lr : [<00000001>]    psr: 000001d3
> [    0.069652] sp : c1801eec  ip : ee78b458  fp : c0e5ea48
> [    0.075212] r10: c18b4b40  r9 : f0803000  r8 : ee00a800
> [    0.080781] r7 : 00000001  r6 : c180a500  r5 : c1800000  r4 : 00000000
> [    0.087680] r3 : 00000000  r2 : 0000ec8c  r1 : ee78b3c0  r0 : ee78b440
> [    0.094546] Flags: nzcv  IRQs off  FIQs off  Mode SVC_32  ISA ARM
> Segment user
> [    0.102314] Control: 30c5387d  Table: 00003000  DAC: fffffffd
> [    0.108414] Process swapper/0 (pid: 0, stack limit = 0xc1800210)
> [    0.114791] Stack: (0xc1801eec to 0xc1802000)
> [    0.119431] 1ee0:                            ee78b440 c1800000
> c180a500 00000001 c02505c8
> [    0.128079] 1f00: 00000004 ee00a800 ffffe000 00000000 00000000
> c0227890 c17e6f20 c0278910
> [    0.136665] 1f20: c185724c c18079a0 f080200c c1801f58 f0802000
> c0201494 c0e00c18 20000053
> [    0.145303] 1f40: ffffffff c1801f8c ffffffff c1800000 c18b4b40
> c020d238 00000000 0000001f
> [    0.153915] 1f60: 00040d00 00000000 efffc940 00000000 c18b4b40
> c1807440 ffffffff 00000000
> [    0.162571] 1f80: c18b4b40 c0e5ea48 00000004 c1801fa8 c0322fb0
> c0e00c18 20000053 ffffffff
> [    0.171226] 1fa0: c18b4b40 00000000 ffffffff ffffffff 00000000
> c0e006c0 ffffffff 00000000
> [    0.179890] 1fc0: 00000000 c1807448 c0e5ea48 00000000 00000000
> c18b4dd4 c180745c c0e5ea44
> [    0.188546] 1fe0: c180c0d0 00007000 420f00f3 00000000 00000000
> 00008090 00000000 00000000
> [    0.197165] [<c0250504>] (irqtime_account_delta) from [<c02505c8>]
> (irqtime_account_irq+0xc0/0xc4)
> [    0.206664] [<c02505c8>] (irqtime_account_irq) from [<c0227890>]
> (irq_exit+0x28/0x154)
> [    0.215012] [<c0227890>] (irq_exit) from [<c0278910>]
> (__handle_domain_irq+0x60/0xb4)
> [    0.223245] [<c0278910>] (__handle_domain_irq) from [<c0201494>]
> (gic_handle_irq+0x48/0x8c)
> [    0.232035] [<c0201494>] (gic_handle_irq) from [<c020d238>]
> (__irq_svc+0x58/0x74)
> [    0.239941] Exception stack(0xc1801f58 to 0xc1801fa0)
> [    0.245327] 1f40:
>   00000000 0000001f
> [    0.253948] 1f60: 00040d00 00000000 efffc940 00000000 c18b4b40
> c1807440 ffffffff 00000000
> [    0.262534] 1f80: c18b4b40 c0e5ea48 00000004 c1801fa8 c0322fb0
> c0e00c18 20000053 ffffffff
> [    0.271144] [<c020d238>] (__irq_svc) from [<c0e00c18>]
> (start_kernel+0x300/0x410)
> [    0.279028] [<c0e00c18>] (start_kernel) from [<00008090>] (0x8090)
> [    0.285547] Code: f57ff05b e3130001 18bd80f0 e7f001f2 (e7f001f2)
> [    0.291978] ---[ end trace f68728a0d3053b52 ]---
> [    0.296871] Kernel panic - not syncing: Fatal exception in interrupt
> [    0.303622] ---[ end Kernel panic - not syncing: Fatal exception in
> interrupt
> 


-- 
Florian

^ permalink raw reply

* Re: [PATCH v6 iproute2 0/8] RDMAtool
From: Stephen Hemminger @ 2017-08-22  0:11 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, linux-rdma, Leon Romanovsky, Dennis Dalessandro,
	Jason Gunthorpe, Jiri Pirko, Ariel Almog, David Laight,
	Linux Netdev
In-Reply-To: <20170820095828.13812-1-leon@kernel.org>

On Sun, 20 Aug 2017 12:58:20 +0300
Leon Romanovsky <leon@kernel.org> wrote:

> From: Leon Romanovsky <leonro@mellanox.com>
> 
> This is the fifth revision of the series implementing the RDMAtool - the tool
> to configure RDMA devices.
> 
> It looks like everyone who was interested to read cover letter already did it,
> so I'll start from the changelog:
> 
> Changelog:
> v5->v6:
>  * Removed double includes
>  * Copied rdma_netlink.h from the kernel to include/rdma folder, so the
>    tool can be built as a standalone.
> v4->v5:
>  * Rebased to latest net-next branch
>  * Moved BIT() macro from devlink to general utils.h file - Patch #1.
>  * Changed the order of patches - moved man pages to be last patch.
>  * Rewrote all switch->case->return_string constructions to be static
>    tables with help of David's macro magic. Thanks a lot.
>  * Dropped dependency on exported device and port properties. Now tool depends
>    on RDMA netlink only and all needed code is already in Doug's for-next.
>  * Added two OPA specific physical link states, because their names are
>    too broad - TEST and OFFLINE; I named them OPA_TEST and OPA_OFFLINE.
> v3->v4:
>  * Rebased to latest net-next branch
>  * Added JSON output -j (json) and -p (pretty output)
>  * Exported and reused kernel UAPIs and defines instead of hard coded
>    version.
> v2->v3:
>  * Removed MAX()
>  * Reduced scope of rd_argv_match
>  * Removed return from rdma_free_devmap
>  * Added extra break at rdma_send_msg
> v1->v2:
>  * Squashed multiple (and similar) patches to be one patch for dev object
>    and one patch for link object.
>  * Removed port_map struct
>  * Removed global netlink dump during initialization, it removed the need to store
>    the intermediate variables and reuse ability of netlink to signal if variable
>    exists or doesn't.
>  * Added "-d" --details option and put all CAPs under it.
> 
> v0->v1:
>  * Moved hunk with changes in man/Makefile from first patch to the last patch
>  * Removed the "unknown command" from the examples in commit messages
>  * Removed special "caps" parsing command and put it to be part of general "show" command
>  * Changed parsed capability format to be similar to iproute2 suite
>  * Added FW version as an output of show command.
>  * Added forgotten CAP_FLAGS to the nla_policy list
> RFC->v0:
>  * Removed everything that is not implemented yet.
>  * Abandoned sysfs interfaces in favor of netlink.
> 
> -----
> The initial proposal was sent as RFC [1] and was based on sysfs entries as POC.
> 
> The current series was rewritten completely to work with RDMA netlinks as
> a source of user<->kernel communications. In order to achieve that, the
> RDMA netlinks were extensively refactored and modernized [2, 3, 4 and 5].
> 
> The Doug's for-next tag includes most of the needed patches for this tool.
> 
> The following is an example of various runs on my machine with 5 devices
> (4 in IB mode and one in Ethernet mode).
> 
> ### Without parameters
> $ rdma
> Usage: rdma [ OPTIONS ] OBJECT { COMMAND | help }
> where  OBJECT := { dev | link | help }
>        OPTIONS := { -V[ersion] | -d[etails] | -j[son] | -p[retty]}
> 
> ### With unspecified device name
> $ rdma dev
> 1: mlx5_0: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:3457 sys_image_guid 5254:00c0:fe12:3457
> 2: mlx5_1: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:3458 sys_image_guid 5254:00c0:fe12:3458
> 3: mlx5_2: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:3459 sys_image_guid 5254:00c0:fe12:3459
> 4: mlx5_3: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:345a sys_image_guid 5254:00c0:fe12:345a
> 5: mlx5_4: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:345b sys_image_guid 5254:00c0:fe12:345b
> 
> ### Detailed mode
> $ rdma -d dev
> 1: mlx5_0: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:3457 sys_image_guid 5254:00c0:fe12:3457
>     caps: <BAD_PKEY_CNTR, BAD_QKEY_CNTR, CHANGE_PHY_POR, PORT_ACTIVE_EVENT, SYS_IMAGE_GUID, RC_RNR_NAK_GEN, MEM_WINDOW, UD_IP_CSUM, UD_TSO, XRC, MEM_MGT_EXTENSIONS, BLOCK_MULTICAST_LOOPBACK, MEM_WINDOW_TYPE_2B, RAW_IP_CSUM, MANAGED_FLOW_STEERING, RESIZE_MAX_WR>
> 2: mlx5_1: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:3458 sys_image_guid 5254:00c0:fe12:3458
>     caps: <BAD_PKEY_CNTR, BAD_QKEY_CNTR, CHANGE_PHY_POR, PORT_ACTIVE_EVENT, SYS_IMAGE_GUID, RC_RNR_NAK_GEN, MEM_WINDOW, UD_IP_CSUM, UD_TSO, XRC, MEM_MGT_EXTENSIONS, BLOCK_MULTICAST_LOOPBACK, MEM_WINDOW_TYPE_2B, RAW_IP_CSUM, MANAGED_FLOW_STEERING, RESIZE_MAX_WR>
> 3: mlx5_2: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:3459 sys_image_guid 5254:00c0:fe12:3459
>     caps: <BAD_PKEY_CNTR, BAD_QKEY_CNTR, CHANGE_PHY_POR, PORT_ACTIVE_EVENT, SYS_IMAGE_GUID, RC_RNR_NAK_GEN, MEM_WINDOW, UD_IP_CSUM, UD_TSO, XRC, MEM_MGT_EXTENSIONS, BLOCK_MULTICAST_LOOPBACK, MEM_WINDOW_TYPE_2B, RAW_IP_CSUM, MANAGED_FLOW_STEERING, RESIZE_MAX_WR>
> 4: mlx5_3: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:345a sys_image_guid 5254:00c0:fe12:345a
>     caps: <BAD_PKEY_CNTR, BAD_QKEY_CNTR, CHANGE_PHY_POR, PORT_ACTIVE_EVENT, SYS_IMAGE_GUID, RC_RNR_NAK_GEN, MEM_WINDOW, UD_IP_CSUM, UD_TSO, XRC, MEM_MGT_EXTENSIONS, BLOCK_MULTICAST_LOOPBACK, MEM_WINDOW_TYPE_2B, RAW_IP_CSUM, MANAGED_FLOW_STEERING, RESIZE_MAX_WR>
> 5: mlx5_4: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:345b sys_image_guid 5254:00c0:fe12:345b
>     caps: <BAD_PKEY_CNTR, BAD_QKEY_CNTR, CHANGE_PHY_POR, PORT_ACTIVE_EVENT, SYS_IMAGE_GUID, RC_RNR_NAK_GEN, MEM_WINDOW, UD_IP_CSUM, UD_TSO, XRC, MEM_MGT_EXTENSIONS, BLOCK_MULTICAST_LOOPBACK, MEM_WINDOW_TYPE_2B, RAW_IP_CSUM, MANAGED_FLOW_STEERING, RESIZE_MAX_WR>
> 
> ### Specific device
> $ rdma dev show mlx5_4
> 5: mlx5_4: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:345b sys_image_guid 5254:00c0:fe12:345b
> 
> ### Specific device in detailed mode
> $ rdma dev show mlx5_4 -d
> 5: mlx5_4: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:345b sys_image_guid 5254:00c0:fe12:345b
>     caps: <BAD_PKEY_CNTR, BAD_QKEY_CNTR, CHANGE_PHY_POR, PORT_ACTIVE_EVENT, SYS_IMAGE_GUID, RC_RNR_NAK_GEN, MEM_WINDOW, UD_IP_CSUM, UD_TSO, XRC, MEM_MGT_EXTENSIONS, BLOCK_MULTICAST_LOOPBACK, MEM_WINDOW_TYPE_2B, RAW_IP_CSUM, MANAGED_FLOW_STEERING, RESIZE_MAX_WR>
> 
> ### Unknown command (caps)
> $ rdma dev show mlx5_4 caps
> Unknown parameter 'caps'.
> 
> ### Link properties without device name
> $ rdma link
> 1/1: mlx5_0/1: subnet_prefix fe80:0000:0000:0000 lid 13399 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
> 2/1: mlx5_1/1: subnet_prefix fe80:0000:0000:0000 lid 13400 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
> 3/1: mlx5_2/1: subnet_prefix fe80:0000:0000:0000 lid 13401 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
> 4/1: mlx5_3/1: state DOWN physical_state DISABLED
> 5/1: mlx5_4/1: subnet_prefix fe80:0000:0000:0000 lid 13403 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
> 
> ### Link properties in detailed mode
> $ rdma link -d
> 1/1: mlx5_0/1: subnet_prefix fe80:0000:0000:0000 lid 13399 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
>     caps: <AUTO_MIGR>
> 2/1: mlx5_1/1: subnet_prefix fe80:0000:0000:0000 lid 13400 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
>     caps: <AUTO_MIGR>
> 3/1: mlx5_2/1: subnet_prefix fe80:0000:0000:0000 lid 13401 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
>     caps: <AUTO_MIGR>
> 4/1: mlx5_3/1: state DOWN physical_state DISABLED
>     caps: <CM, IP_BASED_GIDS>
> 5/1: mlx5_4/1: subnet_prefix fe80:0000:0000:0000 lid 13403 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
>     caps: <AUTO_MIGR>
> 
> ### All links for specific device
> $ rdma link show mlx5_3
> 1/1: mlx5_0/1: subnet_prefix fe80:0000:0000:0000 lid 13399 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
> 
> ### Detailed link properties for specific device
> $ rdma link -d show mlx5_3
> 1/1: mlx5_0/1: subnet_prefix fe80:0000:0000:0000 lid 13399 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
>     caps: <AUTO_MIGR>
> 
> ### Specific port for specific device
> $ rdma link show mlx5_4/1
> 1/1: mlx5_0/1: subnet_prefix fe80:0000:0000:0000 lid 13399 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
> 
> ### Unknown parameter
> $ rdma link show mlx5_4/1 caps
> Unknown parameter 'caps'.
> 
> Available in the "topic/rdmatool-netlink-v6" topic branch of this git repo:
> git://git.kernel.org/pub/scm/linux/kernel/git/leon/iproute2.git
> 
> Or for browsing:
> https://git.kernel.org/cgit/linux/kernel/git/leon/iproute2.git/log/?h=topic/rdmatool-netlink-v6
> 
> Thanks
> 
> [1] https://www.spinics.net/lists/linux-rdma/msg49575.html
> [2] https://patchwork.kernel.org/patch/9752865/
> [3] https://www.spinics.net/lists/linux-rdma/msg50827.html
> [4] https://www.spinics.net/lists/linux-rdma/msg51210.html
> [5] https://patchwork.kernel.org/patch/9811729/ and https://patchwork.kernel.org/patch/9811731/
> 
> Cc: Doug Ledford <dledford@redhat.com>
> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
> Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
> Cc: Jiri Pirko <jiri@mellanox.com>
> Cc: Ariel Almog <ariela@mellanox.com>
> Cc: David Laight <David.Laight@ACULAB.COM>
> Cc: Linux Netdev <netdev@vger.kernel.org>
> 
> Leon Romanovsky (8):
>   utils: Move BIT macro to common header
>   rdma: Add basic infrastructure for RDMA tool
>   rdma: Add dev object
>   rdma: Add link object
>   rdma: Add json and pretty outputs
>   rdma: Implement json output for dev object
>   rdma: Add json output to link object
>   rdma: Add initial manual for the tool
> 
>  Makefile                    |   2 +-
>  devlink/devlink.c           |   2 +-
>  include/rdma/rdma_netlink.h | 307 +++++++++++++++++++++++++++++++++++++++
>  include/utils.h             |   2 +
>  man/man8/rdma-dev.8         |  55 +++++++
>  man/man8/rdma-link.8        |  55 +++++++
>  man/man8/rdma.8             | 102 +++++++++++++
>  rdma/.gitignore             |   1 +
>  rdma/Makefile               |  22 +++
>  rdma/dev.c                  | 284 ++++++++++++++++++++++++++++++++++++
>  rdma/link.c                 | 343 ++++++++++++++++++++++++++++++++++++++++++++
>  rdma/rdma.c                 | 143 ++++++++++++++++++
>  rdma/rdma.h                 |  91 ++++++++++++
>  rdma/utils.c                | 266 ++++++++++++++++++++++++++++++++++
>  14 files changed, 1673 insertions(+), 2 deletions(-)
>  create mode 100644 include/rdma/rdma_netlink.h
>  create mode 100644 man/man8/rdma-dev.8
>  create mode 100644 man/man8/rdma-link.8
>  create mode 100644 man/man8/rdma.8
>  create mode 100644 rdma/.gitignore
>  create mode 100644 rdma/Makefile
>  create mode 100644 rdma/dev.c
>  create mode 100644 rdma/link.c
>  create mode 100644 rdma/rdma.c
>  create mode 100644 rdma/rdma.h
>  create mode 100644 rdma/utils.c

Applied to master branch (for 4.13).
Thanks for your patience and persistence.

^ permalink raw reply

* Re: [PATCH net] udp: on peeking bad csum, drop packets even if not at head
From: Willem de Bruijn @ 2017-08-22  0:12 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Network Development, David Miller, Paolo Abeni, Willem de Bruijn
In-Reply-To: <1503355232.2499.15.camel@edumazet-glaptop3.roam.corp.google.com>

On Mon, Aug 21, 2017 at 6:40 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Mon, 2017-08-21 at 17:39 -0400, Willem de Bruijn wrote:
>> From: Willem de Bruijn <willemb@google.com>
>>
>> When peeking, if a bad csum is discovered, the skb is unlinked from
>> the queue with __sk_queue_drop_skb and the peek operation restarted.
>>
>> __sk_queue_drop_skb only drops packets that match the queue head. With
>> sk_peek_off, the skb need not be at head, causing the call to fail and
>> the same skb to be found again on restart.
>>
>> Walk the queue to find the correct skb. Limit the walk to sk_peek_off,
>> to bound cycle cost to at most twice the original skb_queue_walk in
>> __skb_try_recv_from_queue.
>>
>> The operation may race with updates to sk_peek_off. As the operation
>> is retried, it will eventually succeed.
>>
>> Signed-off-by: Willem de Bruijn <willemb@google.com>
>
> You forgot the Fixes: tag, that such a bug fix deserves.

Indeed, sorry. I'm looking into that now. It should be the patch that
introduced peeking at offset, but I need to verify.

I should also add that this bug was discovered by syzkaller.

> I am not a big fan of your patch and would prefer a solution without the
> loop.

Agreed.

^ permalink raw reply

* Re: [iproute PATCH v2 0/7] Covscan: Dead code elimination
From: Stephen Hemminger @ 2017-08-22  0:13 UTC (permalink / raw)
  To: Phil Sutter; +Cc: netdev
In-Reply-To: <20170817170931.24089-1-phil@nwl.cc>

On Thu, 17 Aug 2017 19:09:24 +0200
Phil Sutter <phil@nwl.cc> wrote:

> This series collects patches from v1 which deal with dead code, either
> by removing it or changing context so it is accessed again if that makes
> sense.
> 
> No changes to the actual patches, just splitting into smaller series.
> 
> Phil Sutter (7):
>   devlink: No need for this self-assignment
>   ipntable: No need to check and assign to parms_rta
>   iproute: Fix for missing 'Oifs:' display
>   lib/rt_names: Drop dead code in rtnl_rttable_n2a()
>   ss: Skip useless check in parse_hostcond()
>   ss: Drop useless assignment
>   tc/m_gact: Drop dead code
> 
>  devlink/devlink.c |  2 +-
>  ip/ipntable.c     |  2 --
>  ip/iproute.c      |  8 +++++---
>  lib/rt_names.c    |  4 ----
>  misc/ss.c         |  3 +--
>  tc/m_gact.c       | 14 +++-----------
>  6 files changed, 10 insertions(+), 23 deletions(-)
> 

Sure, these look fine. Applied.

^ permalink raw reply

* Re: [iproute PATCH v3 0/6] Covscan: Don't access garbage
From: Stephen Hemminger @ 2017-08-22  0:18 UTC (permalink / raw)
  To: Phil Sutter; +Cc: netdev
In-Reply-To: <20170821092704.21614-1-phil@nwl.cc>

On Mon, 21 Aug 2017 11:26:58 +0200
Phil Sutter <phil@nwl.cc> wrote:

> This series collects patches from v1 which resolve situations where
> garbage might be read, either due to missing initialization of
> variables or accessing data which went out of scope.
> 
> Changes since v2:
> - Rebased onto current master branch.
> - Dropped first patch since it is not a real issue.
> 
> Phil Sutter (6):
>   ipaddress: Avoid accessing uninitialized variable lcl
>   iplink_can: Prevent overstepping array bounds
>   ipmaddr: Avoid accessing uninitialized data
>   ss: Use C99 initializer in netlink_show_one()
>   netem/maketable: Check return value of fstat()
>   tc/q_multiq: Don't pass garbage in TCA_OPTIONS
> 
>  ip/ipaddress.c    |  2 +-
>  ip/iplink_can.c   |  4 ++--
>  ip/ipmaddr.c      |  2 +-
>  misc/ss.c         | 13 +++++++------
>  netem/maketable.c |  4 ++--
>  tc/q_multiq.c     |  2 +-
>  6 files changed, 14 insertions(+), 13 deletions(-)
> 

These look fine. Applied.

^ permalink raw reply

* Re: [iproute PATCH v3 2/5] nstat: Fix for potential NULL pointer dereference
From: Stephen Hemminger @ 2017-08-22  0:19 UTC (permalink / raw)
  To: Phil Sutter; +Cc: netdev
In-Reply-To: <20170821100308.24854-3-phil@nwl.cc>

On Mon, 21 Aug 2017 12:03:05 +0200
Phil Sutter <phil@nwl.cc> wrote:

> If the string at 'p' contains neither space nor newline, 'p' will become
> NULL. Make sure this isn't the case before dereferencing it.
> 
> Signed-off-by: Phil Sutter <phil@nwl.cc>
> ---
> Changes since v2:
> - Call abort() if 'p' becomes NULL.
> ---
>  misc/nstat.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/misc/nstat.c b/misc/nstat.c
> index a4dd405d43a93..56e9367e99736 100644
> --- a/misc/nstat.c
> +++ b/misc/nstat.c
> @@ -217,6 +217,8 @@ static void load_ugly_table(FILE *fp)
>  			n->next = db;
>  			db = n;
>  			p = next;
> +			if (!p)
> +				abort();
>  		}
>  		n = db;
>  		if (fgets(buf, sizeof(buf), fp) == NULL)

This doesn't do anything better than just dereferencing NULL.
In either case the program crashes with no useful information for the user.
Not applying this.
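
A minimal sketch of the alternative this objection points toward: report
which input was malformed and let the caller handle the failure, instead
of calling abort(). The helpers split_token() and count_tokens() are
hypothetical and not part of nstat; they only mimic the "space, else
newline" splitting that the quoted hunk relies on.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of nstat: terminate the token at 'p' and
 * return a pointer to the text that follows it. Returns NULL if the token
 * is followed by neither a space nor a newline, i.e. the line is malformed.
 */
static char *split_token(char *p)
{
	char *next = strchr(p, ' ');

	if (!next)
		next = strchr(p, '\n');
	if (!next)
		return NULL;
	*next++ = '\0';
	return next;
}

/*
 * Count the tokens in one line. On malformed input, print what was being
 * parsed and return -1 so the caller can decide how to proceed.
 */
static int count_tokens(char *line)
{
	int count = 0;
	char *p = line;

	while (*p) {
		char *next = split_token(p);

		if (!next) {
			fprintf(stderr, "malformed line near '%s'\n", p);
			return -1;
		}
		count++;
		p = next;
	}
	return count;
}
```

With this shape the user at least sees the offending text, and the caller
can skip the table or exit with a proper status.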

^ permalink raw reply

* Re: [iproute PATCH v2 1/7] nstat: Avoid passing negative fd to fdopen()
From: Stephen Hemminger @ 2017-08-22  0:23 UTC (permalink / raw)
  To: Phil Sutter; +Cc: netdev
In-Reply-To: <20170821170813.29697-2-phil@nwl.cc>

On Mon, 21 Aug 2017 19:08:07 +0200
Phil Sutter <phil@nwl.cc> wrote:

> Introduce a wrapper which does the sanity checking and returns NULL
> in case fd is invalid.
> 
> Signed-off-by: Phil Sutter <phil@nwl.cc>
> ---
>  misc/nstat.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/misc/nstat.c b/misc/nstat.c
> index 1212b1f2c8128..7cdde75a56e4e 100644
> --- a/misc/nstat.c
> +++ b/misc/nstat.c
> @@ -252,9 +252,16 @@ static void load_ugly_table(FILE *fp)
>  	}
>  }
>  
> +static FILE *fdopen_null(int fd, const char *mode)
> +{
> +	if (fd < 0)
> +		return NULL;
> +	return fdopen(fd, mode);
> +}
> +
>  static void load_sctp_snmp(void)
>  {
> -	FILE *fp = fdopen(net_sctp_snmp_open(), "r");
> +	FILE *fp = fdopen_null(net_sctp_snmp_open(), "r");
>  
>  	if (fp) {
>  		load_good_table(fp);
> @@ -264,7 +271,7 @@ static void load_sctp_snmp(void)
>  
>  static void load_snmp(void)
>  {
> -	FILE *fp = fdopen(net_snmp_open(), "r");
> +	FILE *fp = fdopen_null(net_snmp_open(), "r");
>  
>  	if (fp) {
>  		load_ugly_table(fp);
> @@ -274,7 +281,7 @@ static void load_snmp(void)
>  
>  static void load_snmp6(void)
>  {
> -	FILE *fp = fdopen(net_snmp6_open(), "r");
> +	FILE *fp = fdopen_null(net_snmp6_open(), "r");
>  
>  	if (fp) {
>  		load_good_table(fp);
> @@ -284,7 +291,7 @@ static void load_snmp6(void)
>  
>  static void load_netstat(void)
>  {
> -	FILE *fp = fdopen(net_netstat_open(), "r");
> +	FILE *fp = fdopen_null(net_netstat_open(), "r");
>  
>  	if (fp) {
>  		load_ugly_table(fp);

Why not just fix it at the source of the open?
I.e.
static FILE *generic_proc_open(const char *env, const char *name)
{
...
	return fopen(p, "r");
}
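
A possible shape for that suggestion, as a sketch only. The elided body
above is not shown in this thread, so the path lookup below is an
assumption: 'env' names an environment variable that may override the
full path, and otherwise the path is built from PROC_ROOT (default
"/proc") plus 'name'. The function name generic_proc_fopen() is
hypothetical, chosen to avoid implying this is the existing helper.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Sketch of an opener that returns a FILE pointer directly, so callers
 * never have to pass a possibly negative fd to fdopen(). The lookup
 * order (override variable first, then PROC_ROOT, then "/proc") is an
 * assumption made for this example. Returns NULL on failure, exactly
 * as fopen() does.
 */
static FILE *generic_proc_fopen(const char *env, const char *name)
{
	const char *p = getenv(env);
	char store[128];

	if (!p) {
		const char *root = getenv("PROC_ROOT");

		/* snprintf() always NUL-terminates within sizeof(store). */
		snprintf(store, sizeof(store), "%s/%s",
			 root ? root : "/proc", name);
		p = store;
	}
	return fopen(p, "r");
}
```

The existing "if (fp)" checks in the load_*() callers would then cover
the failure case without a separate wrapper.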

^ permalink raw reply

* Re: [iproute PATCH v3 2/7] xfrm_state: Make sure alg_name is NULL-terminated
From: Stephen Hemminger @ 2017-08-22  0:28 UTC (permalink / raw)
  To: Phil Sutter; +Cc: netdev
In-Reply-To: <20170821132341.23118-3-phil@nwl.cc>

On Mon, 21 Aug 2017 15:23:36 +0200
Phil Sutter <phil@nwl.cc> wrote:

> Signed-off-by: Phil Sutter <phil@nwl.cc>
> ---
>  ip/xfrm_state.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/ip/xfrm_state.c b/ip/xfrm_state.c
> index e11c93bf1c3b5..7c0389038986e 100644
> --- a/ip/xfrm_state.c
> +++ b/ip/xfrm_state.c
> @@ -125,7 +125,8 @@ static int xfrm_algo_parse(struct xfrm_algo *alg, enum xfrm_attr_type_t type,
>  	fprintf(stderr, "warning: ALGO-NAME/ALGO-KEYMAT values will be sent to the kernel promiscuously! (verifying them isn't implemented yet)\n");
>  #endif
>  
> -	strncpy(alg->alg_name, name, sizeof(alg->alg_name));
> +	strncpy(alg->alg_name, name, sizeof(alg->alg_name) - 1);
> +	alg->alg_name[sizeof(alg->alg_name) - 1] = '\0';
>  
>  	if (slen > 2 && strncmp(key, "0x", 2) == 0) {
>  		/* split two chars "0x" from the top */

You are fixing enough of these NUL-terminated string issues that maybe
introducing strlcpy() would make sense, either in utils or via -lbsd.
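
For illustration, a minimal sketch of a strlcpy()-style helper. The name
sketch_strlcpy() is hypothetical, picked so it cannot clash with a libc
or libbsd definition; it follows the usual strlcpy() contract of always
terminating the destination and returning the source length.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most size - 1 bytes of 'src' into 'dst' and always
 * NUL-terminate when size > 0. Returns strlen(src), so a return value
 * >= size tells the caller that the copy was truncated.
 */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t srclen = strlen(src);

	if (size > 0) {
		size_t n = srclen < size - 1 ? srclen : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return srclen;
}
```

With such a helper, the two-line strncpy() plus explicit terminator in
the hunk above would collapse into a single call along the lines of
sketch_strlcpy(alg->alg_name, name, sizeof(alg->alg_name)).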

^ permalink raw reply

* Re: [iproute PATCH v2 0/3] Covscan: Fix for missing error checking
From: Stephen Hemminger @ 2017-08-22  0:29 UTC (permalink / raw)
  To: Phil Sutter; +Cc: netdev
In-Reply-To: <20170821163652.23752-1-phil@nwl.cc>

On Mon, 21 Aug 2017 18:36:49 +0200
Phil Sutter <phil@nwl.cc> wrote:

> This series collects patches from v1 dealing with spots where error
> checking is necessary or recommended.
> 
> Minor changes to patches 1 and 2, patch 3 remains unchanged.
> 
> Phil Sutter (3):
>   iproute: Check mark value input
>   iplink_vrf: Complain if main table is not found
>   devlink: Check return code of strslashrsplit()
> 
>  devlink/devlink.c | 16 ++++++++++++----
>  ip/iplink_vrf.c   |  4 +++-
>  ip/iproute.c      |  6 ++++--
>  3 files changed, 19 insertions(+), 7 deletions(-)
> 

These 3 look fine. Applied.

^ permalink raw reply

