Netdev List


* Re: [PATCH net] net: sched: fix NULL pointer dereference when action calls some targets
From: Xin Long @ 2017-08-17  7:45 UTC (permalink / raw)
  To: Cong Wang; +Cc: network dev, David Miller, netfilter-devel, Jamal Hadi Salim
In-Reply-To: <CAM_iQpVDQ3iZ-bRPNNPKtXPTRQXdTJFtzPcMn2SupZWy6O_cqw@mail.gmail.com>

On Thu, Aug 17, 2017 at 5:57 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Wed, Aug 16, 2017 at 1:39 AM, Xin Long <lucien.xin@gmail.com> wrote:
>> On Wed, Aug 9, 2017 at 7:33 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>> On Mon, Aug 7, 2017 at 7:33 PM, Xin Long <lucien.xin@gmail.com> wrote:
>>>> On Tue, Aug 8, 2017 at 9:15 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>>> This looks like a completely API burden?
>>>> netfilter xt targets are not really compatible with netsched action.
>>>> I've got to say, the patch is just a way to make checkentry return
>>>> false and avoid panic. like [1] said
>>>
>>> I don't doubt you fix a crash, I am thinking if we can
>>> "fix" the API instead of fixing the caller.
>> Hi, Cong,
>>
>> For now, I don't think it's possible to change the APIs or some of their
>> targets to address the panic caused by the xt action calling them.
>>
>> The fix should be made on the net_sched side.
>>
>> Given that the issue is very easy to trigger,
>> let's wait a few more days for replies from the netfilter folks;
>> otherwise I will repost the fix. Agreed?
>
> Yeah, no objections from me.
>
> By the way, do you know how other callers of this API
> use 'entryinfo'? Do they pass the address of the struct
> on stack too?
AFAIK, there are two places:
1. translate_table -> find_check_entry -> check_target -> xt_check_target
Most iptables operations go through here, and .entryinfo is set in check_target
from struct ipt_entry *e, which is an iptables rule, so it can't be NULL
(the same applies to ip6tables in netfilter/ip6_tables.c).

2. nft_target_init -> xt_check_target, where nft_target_set_tgchk_param
does exactly this: it sets .entryinfo to a local variable, union nft_entry e:
union nft_entry {
        struct ipt_entry e4;
        struct ip6t_entry e6;
        struct ebt_entry ebt;
        struct arpt_entry arp;
};

Case 2 is actually how nft uses xt targets, so the net/sched action
should do the same.
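
A minimal userspace sketch of the pattern described in case 2, for illustration
only. All demo_* types and functions are hypothetical stand-ins, not the
kernel's definitions; the only point is that entryinfo points at a zeroed
local object rather than being left NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel's per-family rule entry types (hypothetical). */
struct demo_ipt_entry  { unsigned int nfcache; };
struct demo_ip6t_entry { unsigned int nfcache; };

/* One object large enough for any family, in the spirit of union nft_entry. */
union demo_entry {
	struct demo_ipt_entry  e4;
	struct demo_ip6t_entry e6;
};

/* Stand-in for the check parameters; only the relevant member is kept. */
struct demo_tgchk_param {
	const void *entryinfo;
};

/* A target's checkentry hook that dereferences entryinfo unconditionally. */
static bool demo_checkentry(const struct demo_tgchk_param *par)
{
	const struct demo_ipt_entry *e = par->entryinfo;

	return e->nfcache == 0;
}

/* Caller: point entryinfo at a zeroed local union instead of leaving it NULL. */
static bool demo_init_target(void)
{
	union demo_entry e = { .e4 = { 0 } };
	struct demo_tgchk_param par = { .entryinfo = &e };

	return demo_checkentry(&par);
}
```

With entryinfo left NULL, demo_checkentry() would dereference a NULL pointer,
which is the class of crash discussed in this thread.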

^ permalink raw reply

* [PATCH REPOST v5 iproute2 8/8] rdma: Add initial manual for the tool
From: Leon Romanovsky @ 2017-08-17  6:56 UTC (permalink / raw)
  To: Doug Ledford, Stephen Hemminger
  Cc: linux-rdma, Leon Romanovsky, Dennis Dalessandro, Jason Gunthorpe,
	Jiri Pirko, Ariel Almog, David Laight, Linux Netdev
In-Reply-To: <20170817065614.1393-1-leonro@mellanox.com>

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 man/man8/rdma-dev.8  |  55 +++++++++++++++++++++++++++
 man/man8/rdma-link.8 |  55 +++++++++++++++++++++++++++
 man/man8/rdma.8      | 102 +++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 212 insertions(+)
 create mode 100644 man/man8/rdma-dev.8
 create mode 100644 man/man8/rdma-link.8
 create mode 100644 man/man8/rdma.8

diff --git a/man/man8/rdma-dev.8 b/man/man8/rdma-dev.8
new file mode 100644
index 00000000..461681b6
--- /dev/null
+++ b/man/man8/rdma-dev.8
@@ -0,0 +1,55 @@
+.TH RDMA\-DEV 8 "06 Jul 2017" "iproute2" "Linux"
+.SH NAME
+rdma-dev \- RDMA device configuration
+.SH SYNOPSIS
+.sp
+.ad l
+.in +8
+.ti -8
+.B rdma
+.RI "[ " OPTIONS " ]"
+.B dev
+.RI  " { " COMMAND " | "
+.BR help " }"
+.sp
+
+.ti -8
+.IR OPTIONS " := { "
+\fB\-V\fR[\fIersion\fR] |
+\fB\-d\fR[\fIetails\fR] }
+
+.ti -8
+.B rdma dev show
+.RI "[ " DEV " ]"
+
+.ti -8
+.B rdma dev help
+
+.SH "DESCRIPTION"
+.SS rdma dev show - display rdma device attributes
+
+.PP
+.I "DEV"
+- specifies the RDMA device to show.
+If this argument is omitted all devices are listed.
+
+.SH "EXAMPLES"
+.PP
+rdma dev
+.RS 4
+Shows the state of all RDMA devices on the system.
+.RE
+.PP
+rdma dev show mlx5_3
+.RS 4
+Shows the state of the specified RDMA device.
+.RE
+.PP
+
+.SH SEE ALSO
+.BR rdma (8),
+.BR rdma-link (8),
+.br
+
+.SH AUTHOR
+Leon Romanovsky <leonro@mellanox.com>
diff --git a/man/man8/rdma-link.8 b/man/man8/rdma-link.8
new file mode 100644
index 00000000..8ed049ef
--- /dev/null
+++ b/man/man8/rdma-link.8
@@ -0,0 +1,55 @@
+.TH RDMA\-LINK 8 "06 Jul 2017" "iproute2" "Linux"
+.SH NAME
+rdma-link \- rdma link configuration
+.SH SYNOPSIS
+.sp
+.ad l
+.in +8
+.ti -8
+.B rdma
+.RI "[ " OPTIONS " ]"
+.B link
+.RI  " { " COMMAND " | "
+.BR help " }"
+.sp
+
+.ti -8
+.IR OPTIONS " := { "
+\fB\-V\fR[\fIersion\fR] |
+\fB\-d\fR[\fIetails\fR] }
+
+.ti -8
+.B rdma link show
+.RI "[ " DEV/PORT_INDEX " ]"
+
+.ti -8
+.B rdma link help
+
+.SH "DESCRIPTION"
+.SS rdma link show - display rdma link attributes
+
+.PP
+.I "DEV/PORT_INDEX"
+- specifies the RDMA link to show.
+If this argument is omitted all links are listed.
+
+.SH "EXAMPLES"
+.PP
+rdma link show
+.RS 4
+Shows the state of all rdma links on the system.
+.RE
+.PP
+rdma link show mlx5_2/1
+.RS 4
+Shows the state of the specified rdma link.
+.RE
+.PP
+
+.SH SEE ALSO
+.BR rdma (8),
+.BR rdma-dev (8),
+.br
+
+.SH AUTHOR
+Leon Romanovsky <leonro@mellanox.com>
diff --git a/man/man8/rdma.8 b/man/man8/rdma.8
new file mode 100644
index 00000000..798b33d3
--- /dev/null
+++ b/man/man8/rdma.8
@@ -0,0 +1,102 @@
+.TH RDMA 8 "28 Mar 2017" "iproute2" "Linux"
+.SH NAME
+rdma \- RDMA tool
+.SH SYNOPSIS
+.sp
+.ad l
+.in +8
+.ti -8
+.B rdma
+.RI "[ " OPTIONS " ] " OBJECT " { " COMMAND " | "
+.BR help " }"
+.sp
+
+.ti -8
+.IR OBJECT " := { "
+.BR dev " | " link " }"
+.sp
+
+.ti -8
+.IR OPTIONS " := { "
+\fB\-V\fR[\fIersion\fR] |
+\fB\-d\fR[\fIetails\fR] |
+\fB\-j\fR[\fIson\fR] |
+\fB\-p\fR[\fIretty\fR] }
+
+.SH OPTIONS
+
+.TP
+.BR "\-V" , " --version"
+Print the version of the
+.B rdma
+tool and exit.
+
+.TP
+.BR "\-d" , " --details"
+Output detailed information.
+
+.TP
+.BR "\-p" , " --pretty"
+When combined with -j generate a pretty JSON output.
+
+.TP
+.BR "\-j" , " --json"
+Generate JSON output.
+
+.SS
+.I OBJECT
+
+.TP
+.B dev
+- RDMA device.
+
+.TP
+.B link
+- RDMA port related.
+
+.PP
+The names of all objects may be written in full or
+abbreviated form, for example
+.B stats
+can be abbreviated as
+.B stat
+or just
+.B s.
+
+.SS
+.I COMMAND
+
+Specifies the action to perform on the object.
+The set of possible actions depends on the object type.
+As a rule, it is possible to
+.B show
+(or
+.B list
+) objects, but some objects do not allow all of these operations
+or have some additional commands. The
+.B help
+command is available for all objects. It prints
+out a list of available commands and argument syntax conventions.
+.sp
+If no command is given, some default command is assumed.
+Usually it is
+.B list
+or, if the objects of this class cannot be listed,
+.BR "help" .
+
+.SH EXIT STATUS
+Exit status is 0 if command was successful or a positive integer upon failure.
+
+.SH SEE ALSO
+.BR rdma-dev (8),
+.BR rdma-link (8),
+.br
+
+.SH REPORTING BUGS
+Report any bugs to the Linux RDMA mailing list
+.B <linux-rdma@vger.kernel.org>
+where the development and maintenance is primarily done.
+You do not have to be subscribed to the list to send a message there.
+
+.SH AUTHOR
+Leon Romanovsky <leonro@mellanox.com>

^ permalink raw reply related

* [PATCH REPOST v5 iproute2 7/8] rdma: Add json output to link object
From: Leon Romanovsky @ 2017-08-17  6:56 UTC (permalink / raw)
  To: Doug Ledford, Stephen Hemminger
  Cc: linux-rdma, Leon Romanovsky, Dennis Dalessandro, Jason Gunthorpe,
	Jiri Pirko, Ariel Almog, David Laight, Linux Netdev
In-Reply-To: <20170817065614.1393-1-leonro@mellanox.com>

An example of the JSON output on a system with two devices:

root@mtr-leonro:~# rdma link -d -p -j
[{
        "ifindex": 1,
        "port": 1,
        "ifname": "mlx5_0/1",
        "subnet_prefix": "fe80:0000:0000:0000",
        "lid": 13399,
        "sm_lid": 49151,
        "lmc": 0,
        "state": "ACTIVE",
        "physical_state": "LINK_UP",
        "caps": ["AUTO_MIG"
        ]
    },{
        "ifindex": 2,
        "port": 1,
        "ifname": "mlx5_1/1",
        "subnet_prefix": "fe80:0000:0000:0000",
        "lid": 13400,
        "sm_lid": 49151,
        "lmc": 0,
        "state": "ACTIVE",
        "physical_state": "LINK_UP",
        "caps": ["AUTO_MIG"
        ]
    }
]
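
The subnet_prefix strings above are a 64-bit value printed as four 16-bit hex
groups. The patch does this with memcpy() into a uint16_t array and prints the
elements in reverse, which relies on host byte order. The sketch below is an
alternative using shifts, not the patch's code; format_u64_groups is a
hypothetical helper name.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit value as four 16-bit hex groups, most significant first.
 * Shifts are used instead of memcpy() so the result does not depend on
 * host byte order. Returns the snprintf() result.
 */
static int format_u64_groups(char *buf, size_t len, uint64_t val)
{
	return snprintf(buf, len, "%04x:%04x:%04x:%04x",
			(unsigned int)((val >> 48) & 0xffff),
			(unsigned int)((val >> 32) & 0xffff),
			(unsigned int)((val >> 16) & 0xffff),
			(unsigned int)(val & 0xffff));
}
```

For example, passing 0xfe80000000000000 yields "fe80:0000:0000:0000", matching
the format of the subnet_prefix field shown above.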

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 rdma/link.c  | 144 +++++++++++++++++++++++++++++++++++++++++++----------------
 rdma/rdma.h  |   1 -
 rdma/utils.c |   8 ----
 3 files changed, 105 insertions(+), 48 deletions(-)

diff --git a/rdma/link.c b/rdma/link.c
index b0e5bee0..eae96cd8 100644
--- a/rdma/link.c
+++ b/rdma/link.c
@@ -56,7 +56,7 @@ static const char *caps_to_str(uint32_t idx)
 	return "UNKNOWN";
 }

-static void link_print_caps(struct nlattr **tb)
+static void link_print_caps(struct rd *rd, struct nlattr **tb)
 {
 	uint64_t caps;
 	uint32_t idx;
@@ -66,54 +66,89 @@ static void link_print_caps(struct nlattr **tb)

 	caps = mnl_attr_get_u64(tb[RDMA_NLDEV_ATTR_CAP_FLAGS]);

-	pr_out("\n    caps: <");
+	if (rd->json_output) {
+		jsonw_name(rd->jw, "caps");
+		jsonw_start_array(rd->jw);
+	} else {
+		pr_out("\n    caps: <");
+	}
 	for (idx = 0; caps; idx++) {
 		if (caps & 0x1) {
-			pr_out("%s", caps_to_str(idx));
-			if (caps >> 0x1)
-				pr_out(", ");
+			if (rd->json_output) {
+				jsonw_string(rd->jw, caps_to_str(idx));
+			} else {
+				pr_out("%s", caps_to_str(idx));
+				if (caps >> 0x1)
+					pr_out(", ");
+			}
 		}
 		caps >>= 0x1;
 	}

-	pr_out(">");
+	if (rd->json_output)
+		jsonw_end_array(rd->jw);
+	else
+		pr_out(">");
 }

-static void link_print_subnet_prefix(struct nlattr **tb)
+static void link_print_subnet_prefix(struct rd *rd, struct nlattr **tb)
 {
 	uint64_t subnet_prefix;
+	uint16_t vp[4];
+	char str[32];

 	if (!tb[RDMA_NLDEV_ATTR_SUBNET_PREFIX])
 		return;

 	subnet_prefix = mnl_attr_get_u64(tb[RDMA_NLDEV_ATTR_SUBNET_PREFIX]);
-	rd_print_u64("subnet_prefix", subnet_prefix);
+	memcpy(vp, &subnet_prefix, sizeof(uint64_t));
+	snprintf(str, 32, "%04x:%04x:%04x:%04x", vp[3], vp[2], vp[1], vp[0]);
+	if (rd->json_output)
+		jsonw_string_field(rd->jw, "subnet_prefix", str);
+	else
+		pr_out("subnet_prefix %s ", str);
 }

-static void link_print_lid(struct nlattr **tb)
+static void link_print_lid(struct rd *rd, struct nlattr **tb)
 {
+	uint32_t lid;
+
 	if (!tb[RDMA_NLDEV_ATTR_LID])
 		return;

-	pr_out("lid %u ",
-	       mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_LID]));
+	lid = mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_LID]);
+	if (rd->json_output)
+		jsonw_uint_field(rd->jw, "lid", lid);
+	else
+		pr_out("lid %u ", lid);
 }

-static void link_print_sm_lid(struct nlattr **tb)
+static void link_print_sm_lid(struct rd *rd, struct nlattr **tb)
 {
+	uint32_t sm_lid;
+
 	if (!tb[RDMA_NLDEV_ATTR_SM_LID])
 		return;

-	pr_out("sm_lid %u ",
-	       mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_SM_LID]));
+	sm_lid = mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_SM_LID]);
+	if (rd->json_output)
+		jsonw_uint_field(rd->jw, "sm_lid", sm_lid);
+	else
+		pr_out("sm_lid %u ", sm_lid);
 }

-static void link_print_lmc(struct nlattr **tb)
+static void link_print_lmc(struct rd *rd, struct nlattr **tb)
 {
+	uint8_t lmc;
+
 	if (!tb[RDMA_NLDEV_ATTR_LMC])
 		return;

-	pr_out("lmc %u ", mnl_attr_get_u8(tb[RDMA_NLDEV_ATTR_LMC]));
+	lmc = mnl_attr_get_u8(tb[RDMA_NLDEV_ATTR_LMC]);
+	if (rd->json_output)
+		jsonw_uint_field(rd->jw, "lmc", lmc);
+	else
+		pr_out("lmc %u ", lmc);
 }

 static const char *link_state_to_str(uint8_t link_state)
@@ -127,7 +162,7 @@ static const char *link_state_to_str(uint8_t link_state)
 	return "UNKNOWN";
 }

-static void link_print_state(struct nlattr **tb)
+static void link_print_state(struct rd *rd, struct nlattr **tb)
 {
 	uint8_t state;

@@ -135,7 +170,10 @@ static void link_print_state(struct nlattr **tb)
 		return;

 	state = mnl_attr_get_u8(tb[RDMA_NLDEV_ATTR_PORT_STATE]);
-	pr_out("state %s ", link_state_to_str(state));
+	if (rd->json_output)
+		jsonw_string_field(rd->jw, "state", link_state_to_str(state));
+	else
+		pr_out("state %s ", link_state_to_str(state));
 }

 static const char *phys_state_to_str(uint8_t phys_state)
@@ -152,7 +190,7 @@ static const char *phys_state_to_str(uint8_t phys_state)
 	return "UNKNOWN";
 };

-static void link_print_phys_state(struct nlattr **tb)
+static void link_print_phys_state(struct rd *rd, struct nlattr **tb)
 {
 	uint8_t phys_state;

@@ -160,13 +198,19 @@ static void link_print_phys_state(struct nlattr **tb)
 		return;

 	phys_state = mnl_attr_get_u8(tb[RDMA_NLDEV_ATTR_PORT_PHYS_STATE]);
-	pr_out("physical_state %s ", phys_state_to_str(phys_state));
+	if (rd->json_output)
+		jsonw_string_field(rd->jw, "physical_state",
+				   phys_state_to_str(phys_state));
+	else
+		pr_out("physical_state %s ", phys_state_to_str(phys_state));
 }

 static int link_parse_cb(const struct nlmsghdr *nlh, void *data)
 {
 	struct nlattr *tb[RDMA_NLDEV_ATTR_MAX] = {};
 	struct rd *rd = data;
+	uint32_t port, idx;
+	char name[32];

 	mnl_attr_parse(nlh, 0, rd_attr_cb, tb);
 	if (!tb[RDMA_NLDEV_ATTR_DEV_INDEX] || !tb[RDMA_NLDEV_ATTR_DEV_NAME])
@@ -177,21 +221,31 @@ static int link_parse_cb(const struct nlmsghdr *nlh, void *data)
 		return MNL_CB_ERROR;
 	}

-	pr_out("%u/%u: %s/%u: ",
-	       mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]),
-	       mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]),
-	       mnl_attr_get_str(tb[RDMA_NLDEV_ATTR_DEV_NAME]),
-	       mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]));
-	link_print_subnet_prefix(tb);
-	link_print_lid(tb);
-	link_print_sm_lid(tb);
-	link_print_lmc(tb);
-	link_print_state(tb);
-	link_print_phys_state(tb);
+	idx = mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
+	port = mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]);
+	snprintf(name, 32, "%s/%u",
+		 mnl_attr_get_str(tb[RDMA_NLDEV_ATTR_DEV_NAME]), port);
+
+	if (rd->json_output) {
+		jsonw_uint_field(rd->jw, "ifindex", idx);
+		jsonw_uint_field(rd->jw, "port", port);
+		jsonw_string_field(rd->jw, "ifname", name);
+
+	} else {
+		pr_out("%u/%u: %s: ", idx, port, name);
+	}
+
+	link_print_subnet_prefix(rd, tb);
+	link_print_lid(rd, tb);
+	link_print_sm_lid(rd, tb);
+	link_print_lmc(rd, tb);
+	link_print_state(rd, tb);
+	link_print_phys_state(rd, tb);
 	if (rd->show_details)
-		link_print_caps(tb);
+		link_print_caps(rd, tb);

-	pr_out("\n");
+	if (!rd->json_output)
+		pr_out("\n");
 	return MNL_CB_OK;
 }

@@ -208,7 +262,12 @@ static int link_no_args(struct rd *rd)
 	if (ret)
 		return ret;

-	return rd_recv_msg(rd, link_parse_cb, rd, seq);
+	if (rd->json_output)
+		jsonw_start_object(rd->jw);
+	ret = rd_recv_msg(rd, link_parse_cb, rd, seq);
+	if (rd->json_output)
+		jsonw_end_object(rd->jw);
+	return ret;
 }

 static int link_one_show(struct rd *rd)
@@ -225,8 +284,10 @@ static int link_show(struct rd *rd)
 {
 	struct dev_map *dev_map;
 	uint32_t port;
-	int ret;
+	int ret = 0;

+	if (rd->json_output)
+		jsonw_start_array(rd->jw);
 	if (rd_no_arg(rd)) {
 		list_for_each_entry(dev_map, &rd->dev_map_list, list) {
 			rd->dev_idx = dev_map->idx;
@@ -234,7 +295,7 @@ static int link_show(struct rd *rd)
 				rd->port_idx = port;
 				ret = link_one_show(rd);
 				if (ret)
-					return ret;
+					goto out;
 			}
 		}

@@ -243,7 +304,8 @@ static int link_show(struct rd *rd)
 		port = get_port_from_argv(rd);
 		if (!dev_map || port > dev_map->num_ports) {
 			pr_err("Wrong device name\n");
-			return -ENOENT;
+			ret = -ENOENT;
+			goto out;
 		}
 		rd_arg_inc(rd);
 		rd->dev_idx = dev_map->idx;
@@ -251,7 +313,7 @@ static int link_show(struct rd *rd)
 		for (; rd->port_idx < dev_map->num_ports + 1; rd->port_idx++) {
 			ret = link_one_show(rd);
 			if (ret)
-				return ret;
+				goto out;
 			if (port)
 				/*
 				 * We got request to show link for devname
@@ -260,7 +322,11 @@ static int link_show(struct rd *rd)
 				break;
 		}
 	}
-	return 0;
+
+out:
+	if (rd->json_output)
+		jsonw_end_array(rd->jw);
+	return ret;
 }

 int cmd_link(struct rd *rd)
diff --git a/rdma/rdma.h b/rdma/rdma.h
index 5904f177..d62a0ac8 100644
--- a/rdma/rdma.h
+++ b/rdma/rdma.h
@@ -68,7 +68,6 @@ void rd_arg_inc(struct rd *rd);
 char *rd_argv(struct rd *rd);
 uint32_t get_port_from_argv(struct rd *rd);

-void rd_print_u64(char *name, uint64_t val);
 /*
  * Commands interface
  */
diff --git a/rdma/utils.c b/rdma/utils.c
index 91d05271..eb4377cf 100644
--- a/rdma/utils.c
+++ b/rdma/utils.c
@@ -59,14 +59,6 @@ uint32_t get_port_from_argv(struct rd *rd)
 	return slash ? atoi(slash + 1) : 0;
 }

-void rd_print_u64(char *name, uint64_t val)
-{
-	uint16_t vp[4];
-
-	memcpy(vp, &val, sizeof(uint64_t));
-	pr_out("%s %04x:%04x:%04x:%04x ", name, vp[3], vp[2], vp[1], vp[0]);
-}
-
 static struct dev_map *dev_map_alloc(const char *dev_name)
 {
 	struct dev_map *dev_map;

^ permalink raw reply related

* [PATCH REPOST v5 iproute2 6/8] rdma: Implement json output for dev object
From: Leon Romanovsky @ 2017-08-17  6:56 UTC (permalink / raw)
  To: Doug Ledford, Stephen Hemminger
  Cc: linux-rdma, Leon Romanovsky, Dennis Dalessandro, Jason Gunthorpe,
	Jiri Pirko, Ariel Almog, David Laight, Linux Netdev
In-Reply-To: <20170817065614.1393-1-leonro@mellanox.com>

Example output for a machine with two devices:

root@mtr-leonro:~# rdma dev -j -p
[{
	"ifindex": 1,
	"ifname": "mlx5_0",
	"node_type": "ca",
	"fw": "2.8.9999",
	"node_guid": "5254:00c0:fe12:3457",
	"sys_image_guid": "5254:00c0:fe12:3457",
	"caps": [ "BAD_PKEY_CNTR", "BAD_QKEY_CNTR", "CHANGE_PHY_POR",
		  "PORT_ACTIVE_EVENT", "SYS_IMAGE_GUID", "RC_RNR_NAK_GEN",
		  "MEM_WINDOW", "UD_IP_CSUM", "UD_TSO", "XRC",
		  "MEM_MGT_EXTENSIONS", "BLOCK_MULTICAST_LOOPBACK",
		  "MEM_WINDOW_TYPE_2B", "RAW_IP_CSUM",
		  "MANAGED_FLOW_STEERING", "RESIZE_MAX_WR" ]
	},{
	"ifindex": 2,
	"ifname": "mlx5_1",
	"node_type": "ca",
	"fw": "2.8.9999",
	"node_guid": "5254:00c0:fe12:3458",
	"sys_image_guid": "5254:00c0:fe12:3458",
	"caps": [ "BAD_PKEY_CNTR", "BAD_QKEY_CNTR", "CHANGE_PHY_POR",
		  "PORT_ACTIVE_EVENT", "SYS_IMAGE_GUID", "RC_RNR_NAK_GEN",
		  "MEM_WINDOW", "UD_IP_CSUM", "UD_TSO", "XRC",
		  "MEM_MGT_EXTENSIONS", "BLOCK_MULTICAST_LOOPBACK",
		  "MEM_WINDOW_TYPE_2B", "RAW_IP_CSUM",
		  "MANAGED_FLOW_STEERING", "RESIZE_MAX_WR" ]
	}
]
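
The caps arrays above are produced by walking a capability bitmask one bit at a
time and translating each set bit to a name. A minimal sketch of that walk
follows, writing a comma-separated list into a buffer rather than using the
JSON writer. The name table is shortened and hypothetical, and the demo_*
names are not part of the tool.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical name table; index = bit position. NULL marks an unnamed bit. */
static const char *const demo_cap_names[] = {
	NULL, "SM", "NOTICE", "TRAP", "OPT_IPD", "AUTO_MIGR",
};

static const char *demo_cap_to_str(uint32_t idx)
{
	if (idx < sizeof(demo_cap_names) / sizeof(demo_cap_names[0]) &&
	    demo_cap_names[idx])
		return demo_cap_names[idx];
	return "UNKNOWN";
}

/* Write the names of all set bits into buf, separated by ", ". */
static void demo_caps_to_list(char *buf, size_t len, uint64_t caps)
{
	uint32_t idx;
	size_t used = 0;

	buf[0] = '\0';
	/* Stop as soon as no set bits remain or the buffer is full. */
	for (idx = 0; caps && used < len; idx++, caps >>= 1) {
		int n;

		if (!(caps & 0x1))
			continue;
		/* Add a separator only if a higher bit is still set. */
		n = snprintf(buf + used, len - used, "%s%s",
			     demo_cap_to_str(idx), (caps >> 1) ? ", " : "");
		if (n < 0)
			break;
		used += (size_t)n;
	}
}
```

In JSON mode the separator logic is unnecessary, since the writer emits the
commas between array elements itself; that is why the patch only keeps the
", " handling on the plain-text path.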

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 rdma/dev.c | 110 +++++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 82 insertions(+), 28 deletions(-)

diff --git a/rdma/dev.c b/rdma/dev.c
index f6b55bae..9fadf3ac 100644
--- a/rdma/dev.c
+++ b/rdma/dev.c
@@ -66,7 +66,7 @@ static const char *dev_caps_to_str(uint32_t idx)
 	return "UNKNOWN";
 }

-static void dev_print_caps(struct nlattr **tb)
+static void dev_print_caps(struct rd *rd, struct nlattr **tb)
 {
 	uint64_t caps;
 	uint32_t idx;
@@ -76,48 +76,78 @@ static void dev_print_caps(struct nlattr **tb)

 	caps = mnl_attr_get_u64(tb[RDMA_NLDEV_ATTR_CAP_FLAGS]);

-	pr_out("\n    caps: <");
+	if (rd->json_output) {
+		jsonw_name(rd->jw, "caps");
+		jsonw_start_array(rd->jw);
+	} else {
+		pr_out("\n    caps: <");
+	}
 	for (idx = 0; caps; idx++) {
 		if (caps & 0x1) {
-			pr_out("%s", dev_caps_to_str(idx));
-			if (caps >> 0x1)
-				pr_out(", ");
+			if (rd->json_output) {
+				jsonw_string(rd->jw, dev_caps_to_str(idx));
+			} else {
+				pr_out("%s", dev_caps_to_str(idx));
+				if (caps >> 0x1)
+					pr_out(", ");
+			}
 		}
 		caps >>= 0x1;
 	}

-	pr_out(">");
+	if (rd->json_output)
+		jsonw_end_array(rd->jw);
+	else
+		pr_out(">");
 }

-static void dev_print_fw(struct nlattr **tb)
+static void dev_print_fw(struct rd *rd, struct nlattr **tb)
 {
+	const char *str;
 	if (!tb[RDMA_NLDEV_ATTR_FW_VERSION])
 		return;

-	pr_out("fw %s ",
-	       mnl_attr_get_str(tb[RDMA_NLDEV_ATTR_FW_VERSION]));
+	str = mnl_attr_get_str(tb[RDMA_NLDEV_ATTR_FW_VERSION]);
+	if (rd->json_output)
+		jsonw_string_field(rd->jw, "fw", str);
+	else
+		pr_out("fw %s ", str);
 }

-static void dev_print_node_guid(struct nlattr **tb)
+static void dev_print_node_guid(struct rd *rd, struct nlattr **tb)
 {
 	uint64_t node_guid;
+	uint16_t vp[4];
+	char str[32];

 	if (!tb[RDMA_NLDEV_ATTR_NODE_GUID])
 		return;

 	node_guid = mnl_attr_get_u64(tb[RDMA_NLDEV_ATTR_NODE_GUID]);
-	rd_print_u64("node_guid", node_guid);
+	memcpy(vp, &node_guid, sizeof(uint64_t));
+	snprintf(str, 32, "%04x:%04x:%04x:%04x", vp[3], vp[2], vp[1], vp[0]);
+	if (rd->json_output)
+		jsonw_string_field(rd->jw, "node_guid", str);
+	else
+		pr_out("node_guid %s ", str);
 }

-static void dev_print_sys_image_guid(struct nlattr **tb)
+static void dev_print_sys_image_guid(struct rd *rd, struct nlattr **tb)
 {
 	uint64_t sys_image_guid;
+	uint16_t vp[4];
+	char str[32];

 	if (!tb[RDMA_NLDEV_ATTR_SYS_IMAGE_GUID])
 		return;

 	sys_image_guid = mnl_attr_get_u64(tb[RDMA_NLDEV_ATTR_SYS_IMAGE_GUID]);
-	rd_print_u64("sys_image_guid", sys_image_guid);
+	memcpy(vp, &sys_image_guid, sizeof(uint64_t));
+	snprintf(str, 32, "%04x:%04x:%04x:%04x", vp[3], vp[2], vp[1], vp[0]);
+	if (rd->json_output)
+		jsonw_string_field(rd->jw, "sys_image_guid", str);
+	else
+		pr_out("sys_image_guid %s ", str);
 }

 static const char *node_type_to_str(uint8_t node_type)
@@ -131,37 +161,51 @@ static const char *node_type_to_str(uint8_t node_type)
 	return "unknown";
 }

-static void dev_print_node_type(struct nlattr **tb)
+static void dev_print_node_type(struct rd *rd, struct nlattr **tb)
 {
+	const char *node_str;
 	uint8_t node_type;

 	if (!tb[RDMA_NLDEV_ATTR_DEV_NODE_TYPE])
 		return;

 	node_type = mnl_attr_get_u8(tb[RDMA_NLDEV_ATTR_DEV_NODE_TYPE]);
-	pr_out("node_type %s ", node_type_to_str(node_type));
+	node_str = node_type_to_str(node_type);
+	if (rd->json_output)
+		jsonw_string_field(rd->jw, "node_type", node_str);
+	else
+		pr_out("node_type %s ", node_str);
 }

 static int dev_parse_cb(const struct nlmsghdr *nlh, void *data)
 {
 	struct nlattr *tb[RDMA_NLDEV_ATTR_MAX] = {};
 	struct rd *rd = data;
+	const char *name;
+	uint32_t idx;

 	mnl_attr_parse(nlh, 0, rd_attr_cb, tb);
 	if (!tb[RDMA_NLDEV_ATTR_DEV_INDEX] || !tb[RDMA_NLDEV_ATTR_DEV_NAME])
 		return MNL_CB_ERROR;

-	pr_out("%u: %s: ",
-	       mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]),
-	       mnl_attr_get_str(tb[RDMA_NLDEV_ATTR_DEV_NAME]));
-	dev_print_node_type(tb);
-	dev_print_fw(tb);
-	dev_print_node_guid(tb);
-	dev_print_sys_image_guid(tb);
+	idx =  mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
+	name = mnl_attr_get_str(tb[RDMA_NLDEV_ATTR_DEV_NAME]);
+	if (rd->json_output) {
+		jsonw_uint_field(rd->jw, "ifindex", idx);
+		jsonw_string_field(rd->jw, "ifname", name);
+	} else {
+		pr_out("%u: %s: ", idx, name);
+	}
+
+	dev_print_node_type(rd, tb);
+	dev_print_fw(rd, tb);
+	dev_print_node_guid(rd, tb);
+	dev_print_sys_image_guid(rd, tb);
 	if (rd->show_details)
-		dev_print_caps(tb);
+		dev_print_caps(rd, tb);

-	pr_out("\n");
+	if (!rd->json_output)
+		pr_out("\n");
 	return MNL_CB_OK;
 }

@@ -177,7 +221,12 @@ static int dev_no_args(struct rd *rd)
 	if (ret)
 		return ret;

-	return rd_recv_msg(rd, dev_parse_cb, rd, seq);
+	if (rd->json_output)
+		jsonw_start_object(rd->jw);
+	ret = rd_recv_msg(rd, dev_parse_cb, rd, seq);
+	if (rd->json_output)
+		jsonw_end_object(rd->jw);
+	return ret;
 }

 static int dev_one_show(struct rd *rd)
@@ -195,24 +244,29 @@ static int dev_show(struct rd *rd)
 	struct dev_map *dev_map;
 	int ret = 0;

+	if (rd->json_output)
+		jsonw_start_array(rd->jw);
 	if (rd_no_arg(rd)) {
 		list_for_each_entry(dev_map, &rd->dev_map_list, list) {
 			rd->dev_idx = dev_map->idx;
 			ret = dev_one_show(rd);
 			if (ret)
-				return ret;
+				goto out;
 		}
-
 	} else {
 		dev_map = dev_map_lookup(rd, false);
 		if (!dev_map) {
 			pr_err("Wrong device name\n");
-			return -ENOENT;
+			ret = -ENOENT;
+			goto out;
 		}
 		rd_arg_inc(rd);
 		rd->dev_idx = dev_map->idx;
 		ret = dev_one_show(rd);
 	}
+out:
+	if (rd->json_output)
+		jsonw_end_array(rd->jw);
 	return ret;
 }

^ permalink raw reply related

* [PATCH REPOST v5 iproute2 5/8] rdma: Add json and pretty outputs
From: Leon Romanovsky @ 2017-08-17  6:56 UTC (permalink / raw)
  To: Doug Ledford, Stephen Hemminger
  Cc: linux-rdma, Leon Romanovsky, Dennis Dalessandro, Jason Gunthorpe,
	Jiri Pirko, Ariel Almog, David Laight, Linux Netdev
In-Reply-To: <20170817065614.1393-1-leonro@mellanox.com>

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 rdma/rdma.c | 31 ++++++++++++++++++++++++++++---
 rdma/rdma.h |  4 ++++
 2 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/rdma/rdma.c b/rdma/rdma.c
index 74c09e8b..f9f4f2a2 100644
--- a/rdma/rdma.c
+++ b/rdma/rdma.c
@@ -16,7 +16,7 @@ static void help(char *name)
 {
 	pr_out("Usage: %s [ OPTIONS ] OBJECT { COMMAND | help }\n"
 	       "where  OBJECT := { dev | link | help }\n"
-	       "       OPTIONS := { -V[ersion] | -d[etails]}\n", name);
+	       "       OPTIONS := { -V[ersion] | -d[etails] | -j[son] | -p[retty]}\n", name);
 }

 static int cmd_help(struct rd *rd)
@@ -47,6 +47,16 @@ static int rd_init(struct rd *rd, int argc, char **argv, char *filename)
 	rd->argc = argc;
 	rd->argv = argv;
 	INIT_LIST_HEAD(&rd->dev_map_list);
+
+	if (rd->json_output) {
+		rd->jw = jsonw_new(stdout);
+		if (!rd->jw) {
+			pr_err("Failed to create JSON writer\n");
+			return -ENOMEM;
+		}
+		jsonw_pretty(rd->jw, rd->pretty_output);
+	}
+
 	rd->buff = malloc(MNL_SOCKET_BUFFER_SIZE);
 	if (!rd->buff)
 		return -ENOMEM;
@@ -62,6 +72,8 @@ static int rd_init(struct rd *rd, int argc, char **argv, char *filename)

 static void rd_free(struct rd *rd)
 {
+	if (rd->json_output)
+		jsonw_destroy(&rd->jw);
 	free(rd->buff);
 	rd_free_devmap(rd);
 }
@@ -71,10 +83,14 @@ int main(int argc, char **argv)
 	static const struct option long_options[] = {
 		{ "version",		no_argument,		NULL, 'V' },
 		{ "help",		no_argument,		NULL, 'h' },
+		{ "json",		no_argument,		NULL, 'j' },
+		{ "pretty",		no_argument,		NULL, 'p' },
 		{ "details",		no_argument,		NULL, 'd' },
 		{ NULL, 0, NULL, 0 }
 	};
+	bool pretty_output = false;
 	bool show_details = false;
+	bool json_output = false;
 	char *filename;
 	struct rd rd;
 	int opt;
@@ -82,16 +98,22 @@ int main(int argc, char **argv)

 	filename = basename(argv[0]);

-	while ((opt = getopt_long(argc, argv, "Vhd",
+	while ((opt = getopt_long(argc, argv, "Vhdpj",
 				  long_options, NULL)) >= 0) {
 		switch (opt) {
 		case 'V':
 			printf("%s utility, iproute2-ss%s\n",
 			       filename, SNAPSHOT);
 			return EXIT_SUCCESS;
+		case 'p':
+			pretty_output = true;
+			break;
 		case 'd':
 			show_details = true;
 			break;
+		case 'j':
+			json_output = true;
+			break;
 		case 'h':
 			help(filename);
 			return EXIT_SUCCESS;
@@ -105,11 +127,14 @@ int main(int argc, char **argv)
 	argc -= optind;
 	argv += optind;

+	rd.show_details = show_details;
+	rd.json_output = json_output;
+	rd.pretty_output = pretty_output;
+
 	err = rd_init(&rd, argc, argv, filename);
 	if (err)
 		goto out;

-	rd.show_details = show_details;
 	err = rd_cmd(&rd);
 out:
 	/* Always cleanup */
diff --git a/rdma/rdma.h b/rdma/rdma.h
index 8037e2e6..5904f177 100644
--- a/rdma/rdma.h
+++ b/rdma/rdma.h
@@ -23,6 +23,7 @@

 #include "list.h"
 #include "utils.h"
+#include "json_writer.h"

 #define pr_err(args...) fprintf(stderr, ##args)
 #define pr_out(args...) fprintf(stdout, ##args)
@@ -48,6 +49,9 @@ struct rd {
 	struct mnl_socket *nl;
 	struct nlmsghdr *nlh;
 	char *buff;
+	json_writer_t *jw;
+	bool json_output;
+	bool pretty_output;
 };

 struct rd_cmd {

^ permalink raw reply related

* [PATCH REPOST v5 iproute2 4/8] rdma: Add link object
From: Leon Romanovsky @ 2017-08-17  6:56 UTC (permalink / raw)
  To: Doug Ledford, Stephen Hemminger
  Cc: linux-rdma, Leon Romanovsky, Dennis Dalessandro, Jason Gunthorpe,
	Jiri Pirko, Ariel Almog, David Laight, Linux Netdev
In-Reply-To: <20170817065614.1393-1-leonro@mellanox.com>

The link (port) object represents struct ib_port to user space.

Link properties:
 * Port capabilities
 * IB subnet prefix
 * LID, SM_LID and LMC
 * Port state
 * Physical state
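
Port capabilities are translated to names through an X-macro list, so the enum
and the name table are generated from one definition. The RDMA_BITMAP_ENUM and
RDMA_BITMAP_NAMES helpers are not part of this patch, so the sketch below
assumes one plausible shape for them; every DEMO_* and demo_* name is
hypothetical.

```c
#include <stddef.h>

/* Hypothetical flag list: x(NAME, bit position). Bit 2 is left unnamed. */
#define DEMO_FLAGS(x) \
	x(SM, 1) \
	x(TRAP, 3) \
	x(CM, 4)

/* Assumed expansions; the real RDMA_BITMAP_* macros are not shown here. */
#define DEMO_BITMAP_ENUM(name, bit)  DEMO_BIT_##name = (1 << (bit)),
#define DEMO_BITMAP_NAMES(name, bit) [bit] = #name,

/* One enum constant per flag, valued as a bitmask. */
enum { DEMO_FLAGS(DEMO_BITMAP_ENUM) };

/* Name table indexed by bit position; gaps are left as NULL. */
static const char *const demo_names[] = { DEMO_FLAGS(DEMO_BITMAP_NAMES) };

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const char *demo_flag_to_str(unsigned int idx)
{
	if (idx < DEMO_ARRAY_SIZE(demo_names) && demo_names[idx])
		return demo_names[idx];
	return "UNKNOWN";
}
```

Because the table uses designated initializers, two entries with the same bit
position silently overwrite each other, so each flag in the list needs a
unique bit number.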

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 rdma/Makefile |   2 +-
 rdma/link.c   | 277 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 rdma/rdma.c   |   3 +-
 rdma/utils.c  |   5 ++
 4 files changed, 285 insertions(+), 2 deletions(-)
 create mode 100644 rdma/link.c

diff --git a/rdma/Makefile b/rdma/Makefile
index 123d7ac5..1a9e4b1a 100644
--- a/rdma/Makefile
+++ b/rdma/Makefile
@@ -2,7 +2,7 @@ include ../Config

 ifeq ($(HAVE_MNL),y)

-RDMA_OBJ = rdma.o utils.o dev.o
+RDMA_OBJ = rdma.o utils.o dev.o link.o

 TARGETS=rdma
 CFLAGS += $(shell $(PKG_CONFIG) libmnl --cflags)
diff --git a/rdma/link.c b/rdma/link.c
new file mode 100644
index 00000000..b0e5bee0
--- /dev/null
+++ b/rdma/link.c
@@ -0,0 +1,277 @@
+/*
+ * link.c	RDMA tool
+ *
+ *              This program is free software; you can redistribute it and/or
+ *              modify it under the terms of the GNU General Public License
+ *              as published by the Free Software Foundation; either version
+ *              2 of the License, or (at your option) any later version.
+ *
+ * Authors:     Leon Romanovsky <leonro@mellanox.com>
+ */
+
+#include "rdma.h"
+
+static int link_help(struct rd *rd)
+{
+	pr_out("Usage: %s link show [DEV/PORT_INDEX]\n", rd->filename);
+	return 0;
+}
+
+static const char *caps_to_str(uint32_t idx)
+{
+#define RDMA_PORT_FLAGS(x) \
+	x(SM, 1) \
+	x(NOTICE, 2) \
+	x(TRAP, 3) \
+	x(OPT_IPD, 4) \
+	x(AUTO_MIGR, 5) \
+	x(SL_MAP, 6) \
+	x(MKEY_NVRAM, 7) \
+	x(PKEY_NVRAM, 8) \
+	x(LED_INFO, 9) \
+	x(SM_DISABLED, 10) \
+	x(SYS_IMAGE_GUID, 11) \
+	x(PKEY_SW_EXT_PORT_TRAP, 12) \
+	x(EXTENDED_SPEEDS, 14) \
+	x(CM, 16) \
+	x(SNMP_TUNNEL, 17) \
+	x(REINIT, 18) \
+	x(DEVICE_MGMT, 19) \
+	x(VENDOR_CLASS, 20) \
+	x(DR_NOTICE, 21) \
+	x(CAP_MASK_NOTICE, 22) \
+	x(BOOT_MGMT, 23) \
+	x(LINK_LATENCY, 24) \
+	x(CLIENT_REG, 25) \
+	x(IP_BASED_GIDS, 26)
+
+	enum { RDMA_PORT_FLAGS(RDMA_BITMAP_ENUM) };
+
+	static const char * const
+		rdma_port_names[] = { RDMA_PORT_FLAGS(RDMA_BITMAP_NAMES) };
+	#undef RDMA_PORT_FLAGS
+
+	if (idx < ARRAY_SIZE(rdma_port_names) && rdma_port_names[idx])
+		return rdma_port_names[idx];
+	return "UNKNOWN";
+}
+
+static void link_print_caps(struct nlattr **tb)
+{
+	uint64_t caps;
+	uint32_t idx;
+
+	if (!tb[RDMA_NLDEV_ATTR_CAP_FLAGS])
+		return;
+
+	caps = mnl_attr_get_u64(tb[RDMA_NLDEV_ATTR_CAP_FLAGS]);
+
+	pr_out("\n    caps: <");
+	for (idx = 0; caps; idx++) {
+		if (caps & 0x1) {
+			pr_out("%s", caps_to_str(idx));
+			if (caps >> 0x1)
+				pr_out(", ");
+		}
+		caps >>= 0x1;
+	}
+
+	pr_out(">");
+}
+
+static void link_print_subnet_prefix(struct nlattr **tb)
+{
+	uint64_t subnet_prefix;
+
+	if (!tb[RDMA_NLDEV_ATTR_SUBNET_PREFIX])
+		return;
+
+	subnet_prefix = mnl_attr_get_u64(tb[RDMA_NLDEV_ATTR_SUBNET_PREFIX]);
+	rd_print_u64("subnet_prefix", subnet_prefix);
+}
+
+static void link_print_lid(struct nlattr **tb)
+{
+	if (!tb[RDMA_NLDEV_ATTR_LID])
+		return;
+
+	pr_out("lid %u ",
+	       mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_LID]));
+}
+
+static void link_print_sm_lid(struct nlattr **tb)
+{
+	if (!tb[RDMA_NLDEV_ATTR_SM_LID])
+		return;
+
+	pr_out("sm_lid %u ",
+	       mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_SM_LID]));
+}
+
+static void link_print_lmc(struct nlattr **tb)
+{
+	if (!tb[RDMA_NLDEV_ATTR_LMC])
+		return;
+
+	pr_out("lmc %u ", mnl_attr_get_u8(tb[RDMA_NLDEV_ATTR_LMC]));
+}
+
+static const char *link_state_to_str(uint8_t link_state)
+{
+	static const char * const link_state_str[] = { "NOP", "DOWN",
+						       "INIT", "ARMED",
+						       "ACTIVE",
+						       "ACTIVE_DEFER" };
+	if (link_state < ARRAY_SIZE(link_state_str))
+		return link_state_str[link_state];
+	return "UNKNOWN";
+}
+
+static void link_print_state(struct nlattr **tb)
+{
+	uint8_t state;
+
+	if (!tb[RDMA_NLDEV_ATTR_PORT_STATE])
+		return;
+
+	state = mnl_attr_get_u8(tb[RDMA_NLDEV_ATTR_PORT_STATE]);
+	pr_out("state %s ", link_state_to_str(state));
+}
+
+static const char *phys_state_to_str(uint8_t phys_state)
+{
+	static const char * const phys_state_str[] = { "NOP", "SLEEP",
+						       "POLLING", "DISABLED",
+						       "ARMED", "LINK_UP",
+						       "LINK_ERROR_RECOVER",
+						       "PHY_TEST", "UNKNOWN",
+						       "OPA_OFFLINE",
+						       "UNKNOWN", "OPA_TEST" };
+	if (phys_state < ARRAY_SIZE(phys_state_str))
+		return phys_state_str[phys_state];
+	return "UNKNOWN";
+}
+
+static void link_print_phys_state(struct nlattr **tb)
+{
+	uint8_t phys_state;
+
+	if (!tb[RDMA_NLDEV_ATTR_PORT_PHYS_STATE])
+		return;
+
+	phys_state = mnl_attr_get_u8(tb[RDMA_NLDEV_ATTR_PORT_PHYS_STATE]);
+	pr_out("physical_state %s ", phys_state_to_str(phys_state));
+}
+
+static int link_parse_cb(const struct nlmsghdr *nlh, void *data)
+{
+	struct nlattr *tb[RDMA_NLDEV_ATTR_MAX] = {};
+	struct rd *rd = data;
+
+	mnl_attr_parse(nlh, 0, rd_attr_cb, tb);
+	if (!tb[RDMA_NLDEV_ATTR_DEV_INDEX] || !tb[RDMA_NLDEV_ATTR_DEV_NAME])
+		return MNL_CB_ERROR;
+
+	if (!tb[RDMA_NLDEV_ATTR_PORT_INDEX]) {
+		pr_err("This tool doesn't support switches yet\n");
+		return MNL_CB_ERROR;
+	}
+
+	pr_out("%u/%u: %s/%u: ",
+	       mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]),
+	       mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]),
+	       mnl_attr_get_str(tb[RDMA_NLDEV_ATTR_DEV_NAME]),
+	       mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]));
+	link_print_subnet_prefix(tb);
+	link_print_lid(tb);
+	link_print_sm_lid(tb);
+	link_print_lmc(tb);
+	link_print_state(tb);
+	link_print_phys_state(tb);
+	if (rd->show_details)
+		link_print_caps(tb);
+
+	pr_out("\n");
+	return MNL_CB_OK;
+}
+
+static int link_no_args(struct rd *rd)
+{
+	uint32_t seq;
+	int ret;
+
+	rd_prepare_msg(rd, RDMA_NLDEV_CMD_PORT_GET, &seq,
+		       (NLM_F_REQUEST | NLM_F_ACK));
+	mnl_attr_put_u32(rd->nlh, RDMA_NLDEV_ATTR_DEV_INDEX, rd->dev_idx);
+	mnl_attr_put_u32(rd->nlh, RDMA_NLDEV_ATTR_PORT_INDEX, rd->port_idx);
+	ret = rd_send_msg(rd);
+	if (ret)
+		return ret;
+
+	return rd_recv_msg(rd, link_parse_cb, rd, seq);
+}
+
+static int link_one_show(struct rd *rd)
+{
+	const struct rd_cmd cmds[] = {
+		{ NULL,		link_no_args},
+		{ 0 }
+	};
+
+	return rd_exec_cmd(rd, cmds, "parameter");
+}
+
+static int link_show(struct rd *rd)
+{
+	struct dev_map *dev_map;
+	uint32_t port;
+	int ret;
+
+	if (rd_no_arg(rd)) {
+		list_for_each_entry(dev_map, &rd->dev_map_list, list) {
+			rd->dev_idx = dev_map->idx;
+			for (port = 1; port < dev_map->num_ports + 1; port++) {
+				rd->port_idx = port;
+				ret = link_one_show(rd);
+				if (ret)
+					return ret;
+			}
+		}
+
+	} else {
+		dev_map = dev_map_lookup(rd, true);
+		port = get_port_from_argv(rd);
+		if (!dev_map || port > dev_map->num_ports) {
+			pr_err("Wrong device name\n");
+			return -ENOENT;
+		}
+		rd_arg_inc(rd);
+		rd->dev_idx = dev_map->idx;
+		rd->port_idx = port ? : 1;
+		for (; rd->port_idx < dev_map->num_ports + 1; rd->port_idx++) {
+			ret = link_one_show(rd);
+			if (ret)
+				return ret;
+			if (port)
+				/*
+				 * We got request to show link for devname
+				 * with port index.
+				 */
+				break;
+		}
+	}
+	return 0;
+}
+
+int cmd_link(struct rd *rd)
+{
+	const struct rd_cmd cmds[] = {
+		{ NULL,		link_show },
+		{ "show",	link_show },
+		{ "list",	link_show },
+		{ "help",	link_help },
+		{ 0 }
+	};
+
+	return rd_exec_cmd(rd, cmds, "link command");
+}
diff --git a/rdma/rdma.c b/rdma/rdma.c
index 9c2bdc8f..74c09e8b 100644
--- a/rdma/rdma.c
+++ b/rdma/rdma.c
@@ -15,7 +15,7 @@
 static void help(char *name)
 {
 	pr_out("Usage: %s [ OPTIONS ] OBJECT { COMMAND | help }\n"
-	       "where  OBJECT := { dev | help }\n"
+	       "where  OBJECT := { dev | link | help }\n"
 	       "       OPTIONS := { -V[ersion] | -d[etails]}\n", name);
 }

@@ -31,6 +31,7 @@ static int rd_cmd(struct rd *rd)
 		{ NULL,		cmd_help },
 		{ "help",	cmd_help },
 		{ "dev",	cmd_dev },
+		{ "link",	cmd_link },
 		{ 0 }
 	};

diff --git a/rdma/utils.c b/rdma/utils.c
index 0e32eefe..91d05271 100644
--- a/rdma/utils.c
+++ b/rdma/utils.c
@@ -107,6 +107,11 @@ static const enum mnl_attr_data_type nldev_policy[RDMA_NLDEV_ATTR_MAX] = {
 	[RDMA_NLDEV_ATTR_FW_VERSION] = MNL_TYPE_NUL_STRING,
 	[RDMA_NLDEV_ATTR_NODE_GUID] = MNL_TYPE_U64,
 	[RDMA_NLDEV_ATTR_SYS_IMAGE_GUID] = MNL_TYPE_U64,
+	[RDMA_NLDEV_ATTR_LID] = MNL_TYPE_U32,
+	[RDMA_NLDEV_ATTR_SM_LID] = MNL_TYPE_U32,
+	[RDMA_NLDEV_ATTR_LMC] = MNL_TYPE_U8,
+	[RDMA_NLDEV_ATTR_PORT_STATE] = MNL_TYPE_U8,
+	[RDMA_NLDEV_ATTR_PORT_PHYS_STATE] = MNL_TYPE_U8,
 	[RDMA_NLDEV_ATTR_DEV_NODE_TYPE] = MNL_TYPE_U8,
 };

^ permalink raw reply related

* [PATCH REPOST v5 iproute2 3/8] rdma: Add dev object
From: Leon Romanovsky @ 2017-08-17  6:56 UTC (permalink / raw)
  To: Doug Ledford, Stephen Hemminger
  Cc: linux-rdma, Leon Romanovsky, Dennis Dalessandro, Jason Gunthorpe,
	Jiri Pirko, Ariel Almog, David Laight, Linux Netdev
In-Reply-To: <20170817065614.1393-1-leonro@mellanox.com>

The device (dev) object represents struct ib_device to user space.

Device properties:
 * Device capabilities
 * FW version
 * node_guid and sys_image_guid
 * node_type
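
As an illustration of how the capability bitmask is turned into the
"caps: <...>" list, here is a minimal sketch that formats into a buffer
instead of printing. The names example_cap_names, example_cap_to_str,
example_append and example_format_caps are hypothetical and only a small
subset of the capability table is included; the bit-walking loop mirrors
the one in dev_print_caps().

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define EXAMPLE_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical subset of capability names, indexed by bit number. */
static const char * const example_cap_names[] = {
	[0] = "RESIZE_MAX_WR",
	[1] = "BAD_PKEY_CNTR",
	[2] = "BAD_QKEY_CNTR",
	[4] = "AUTO_PATH_MIG",
};

/* Gaps in the table (here bit 3) and out-of-range bits map to "UNKNOWN". */
static const char *example_cap_to_str(uint32_t idx)
{
	if (idx < EXAMPLE_ARRAY_SIZE(example_cap_names) &&
	    example_cap_names[idx])
		return example_cap_names[idx];
	return "UNKNOWN";
}

/* Append s to the NUL-terminated string in buf, truncating at len. */
static void example_append(char *buf, size_t len, const char *s)
{
	size_t used = strlen(buf);

	if (used < len)
		snprintf(buf + used, len - used, "%s", s);
}

/* Format every set bit of caps as "<A, B, ...>" into buf and return buf. */
static char *example_format_caps(uint64_t caps, char *buf, size_t len)
{
	uint32_t idx;

	if (!len)
		return buf;
	buf[0] = '\0';
	example_append(buf, len, "<");
	for (idx = 0; caps; idx++) {
		if (caps & 0x1) {
			example_append(buf, len, example_cap_to_str(idx));
			/* Add a separator only if higher bits remain. */
			if (caps >> 0x1)
				example_append(buf, len, ", ");
		}
		caps >>= 0x1;
	}
	example_append(buf, len, ">");
	return buf;
}
```

With this sketch, a mask of 0x7 formats as the first three names in bit
order, and a bit without a table entry shows up as UNKNOWN.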

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 rdma/Makefile |   2 +-
 rdma/dev.c    | 230 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 rdma/rdma.c   |   3 +-
 rdma/rdma.h   |  17 +++++
 rdma/utils.c  |  54 +++++++++++++-
 5 files changed, 303 insertions(+), 3 deletions(-)
 create mode 100644 rdma/dev.c

diff --git a/rdma/Makefile b/rdma/Makefile
index 64da2142..123d7ac5 100644
--- a/rdma/Makefile
+++ b/rdma/Makefile
@@ -2,7 +2,7 @@ include ../Config

 ifeq ($(HAVE_MNL),y)

-RDMA_OBJ = rdma.o utils.o
+RDMA_OBJ = rdma.o utils.o dev.o

 TARGETS=rdma
 CFLAGS += $(shell $(PKG_CONFIG) libmnl --cflags)
diff --git a/rdma/dev.c b/rdma/dev.c
new file mode 100644
index 00000000..f6b55bae
--- /dev/null
+++ b/rdma/dev.c
@@ -0,0 +1,230 @@
+/*
+ * dev.c	RDMA tool
+ *
+ *              This program is free software; you can redistribute it and/or
+ *              modify it under the terms of the GNU General Public License
+ *              as published by the Free Software Foundation; either version
+ *              2 of the License, or (at your option) any later version.
+ *
+ * Authors:     Leon Romanovsky <leonro@mellanox.com>
+ */
+
+#include "rdma.h"
+
+static int dev_help(struct rd *rd)
+{
+	pr_out("Usage: %s dev show [DEV]\n", rd->filename);
+	return 0;
+}
+
+static const char *dev_caps_to_str(uint32_t idx)
+{
+#define RDMA_DEV_FLAGS(x) \
+	x(RESIZE_MAX_WR, 0) \
+	x(BAD_PKEY_CNTR, 1) \
+	x(BAD_QKEY_CNTR, 2) \
+	x(RAW_MULTI, 3) \
+	x(AUTO_PATH_MIG, 4) \
+	x(CHANGE_PHY_PORT, 5) \
+	x(UD_AV_PORT_ENFORCE, 6) \
+	x(CURR_QP_STATE_MOD, 7) \
+	x(SHUTDOWN_PORT, 8) \
+	x(INIT_TYPE, 9) \
+	x(PORT_ACTIVE_EVENT, 10) \
+	x(SYS_IMAGE_GUID, 11) \
+	x(RC_RNR_NAK_GEN, 12) \
+	x(SRQ_RESIZE, 13) \
+	x(N_NOTIFY_CQ, 14) \
+	x(LOCAL_DMA_LKEY, 15) \
+	x(MEM_WINDOW, 17) \
+	x(UD_IP_CSUM, 18) \
+	x(UD_TSO, 19) \
+	x(XRC, 20) \
+	x(MEM_MGT_EXTENSIONS, 21) \
+	x(BLOCK_MULTICAST_LOOPBACK, 22) \
+	x(MEM_WINDOW_TYPE_2A, 23) \
+	x(MEM_WINDOW_TYPE_2B, 24) \
+	x(RC_IP_CSUM, 25) \
+	x(RAW_IP_CSUM, 26) \
+	x(CROSS_CHANNEL, 27) \
+	x(MANAGED_FLOW_STEERING, 29) \
+	x(SIGNATURE_HANDOVER, 30) \
+	x(ON_DEMAND_PAGING, 31) \
+	x(SG_GAPS_REG, 32) \
+	x(VIRTUAL_FUNCTION, 33) \
+	x(RAW_SCATTER_FCS, 34) \
+	x(RDMA_NETDEV_OPA_VNIC, 35)
+
+	enum { RDMA_DEV_FLAGS(RDMA_BITMAP_ENUM) };
+
+	static const char * const
+		rdma_dev_names[] = { RDMA_DEV_FLAGS(RDMA_BITMAP_NAMES) };
+	#undef RDMA_DEV_FLAGS
+
+	if (idx < ARRAY_SIZE(rdma_dev_names) && rdma_dev_names[idx])
+		return rdma_dev_names[idx];
+	return "UNKNOWN";
+}
+
+static void dev_print_caps(struct nlattr **tb)
+{
+	uint64_t caps;
+	uint32_t idx;
+
+	if (!tb[RDMA_NLDEV_ATTR_CAP_FLAGS])
+		return;
+
+	caps = mnl_attr_get_u64(tb[RDMA_NLDEV_ATTR_CAP_FLAGS]);
+
+	pr_out("\n    caps: <");
+	for (idx = 0; caps; idx++) {
+		if (caps & 0x1) {
+			pr_out("%s", dev_caps_to_str(idx));
+			if (caps >> 0x1)
+				pr_out(", ");
+		}
+		caps >>= 0x1;
+	}
+
+	pr_out(">");
+}
+
+static void dev_print_fw(struct nlattr **tb)
+{
+	if (!tb[RDMA_NLDEV_ATTR_FW_VERSION])
+		return;
+
+	pr_out("fw %s ",
+	       mnl_attr_get_str(tb[RDMA_NLDEV_ATTR_FW_VERSION]));
+}
+
+static void dev_print_node_guid(struct nlattr **tb)
+{
+	uint64_t node_guid;
+
+	if (!tb[RDMA_NLDEV_ATTR_NODE_GUID])
+		return;
+
+	node_guid = mnl_attr_get_u64(tb[RDMA_NLDEV_ATTR_NODE_GUID]);
+	rd_print_u64("node_guid", node_guid);
+}
+
+static void dev_print_sys_image_guid(struct nlattr **tb)
+{
+	uint64_t sys_image_guid;
+
+	if (!tb[RDMA_NLDEV_ATTR_SYS_IMAGE_GUID])
+		return;
+
+	sys_image_guid = mnl_attr_get_u64(tb[RDMA_NLDEV_ATTR_SYS_IMAGE_GUID]);
+	rd_print_u64("sys_image_guid", sys_image_guid);
+}
+
+static const char *node_type_to_str(uint8_t node_type)
+{
+	static const char * const node_type_str[] = { "unknown", "ca",
+						      "switch", "router",
+						      "rnic", "usnic",
+						      "usnic_dp" };
+	if (node_type < ARRAY_SIZE(node_type_str))
+		return node_type_str[node_type];
+	return "unknown";
+}
+
+static void dev_print_node_type(struct nlattr **tb)
+{
+	uint8_t node_type;
+
+	if (!tb[RDMA_NLDEV_ATTR_DEV_NODE_TYPE])
+		return;
+
+	node_type = mnl_attr_get_u8(tb[RDMA_NLDEV_ATTR_DEV_NODE_TYPE]);
+	pr_out("node_type %s ", node_type_to_str(node_type));
+}
+
+static int dev_parse_cb(const struct nlmsghdr *nlh, void *data)
+{
+	struct nlattr *tb[RDMA_NLDEV_ATTR_MAX] = {};
+	struct rd *rd = data;
+
+	mnl_attr_parse(nlh, 0, rd_attr_cb, tb);
+	if (!tb[RDMA_NLDEV_ATTR_DEV_INDEX] || !tb[RDMA_NLDEV_ATTR_DEV_NAME])
+		return MNL_CB_ERROR;
+
+	pr_out("%u: %s: ",
+	       mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]),
+	       mnl_attr_get_str(tb[RDMA_NLDEV_ATTR_DEV_NAME]));
+	dev_print_node_type(tb);
+	dev_print_fw(tb);
+	dev_print_node_guid(tb);
+	dev_print_sys_image_guid(tb);
+	if (rd->show_details)
+		dev_print_caps(tb);
+
+	pr_out("\n");
+	return MNL_CB_OK;
+}
+
+static int dev_no_args(struct rd *rd)
+{
+	uint32_t seq;
+	int ret;
+
+	rd_prepare_msg(rd, RDMA_NLDEV_CMD_GET,
+		       &seq, (NLM_F_REQUEST | NLM_F_ACK));
+	mnl_attr_put_u32(rd->nlh, RDMA_NLDEV_ATTR_DEV_INDEX, rd->dev_idx);
+	ret = rd_send_msg(rd);
+	if (ret)
+		return ret;
+
+	return rd_recv_msg(rd, dev_parse_cb, rd, seq);
+}
+
+static int dev_one_show(struct rd *rd)
+{
+	const struct rd_cmd cmds[] = {
+		{ NULL,		dev_no_args},
+		{ 0 }
+	};
+
+	return rd_exec_cmd(rd, cmds, "parameter");
+}
+
+static int dev_show(struct rd *rd)
+{
+	struct dev_map *dev_map;
+	int ret = 0;
+
+	if (rd_no_arg(rd)) {
+		list_for_each_entry(dev_map, &rd->dev_map_list, list) {
+			rd->dev_idx = dev_map->idx;
+			ret = dev_one_show(rd);
+			if (ret)
+				return ret;
+		}
+
+	} else {
+		dev_map = dev_map_lookup(rd, false);
+		if (!dev_map) {
+			pr_err("Wrong device name\n");
+			return -ENOENT;
+		}
+		rd_arg_inc(rd);
+		rd->dev_idx = dev_map->idx;
+		ret = dev_one_show(rd);
+	}
+	return ret;
+}
+
+int cmd_dev(struct rd *rd)
+{
+	const struct rd_cmd cmds[] = {
+		{ NULL,		dev_show },
+		{ "show",	dev_show },
+		{ "list",	dev_show },
+		{ "help",	dev_help },
+		{ 0 }
+	};
+
+	return rd_exec_cmd(rd, cmds, "dev command");
+}
diff --git a/rdma/rdma.c b/rdma/rdma.c
index d850e396..9c2bdc8f 100644
--- a/rdma/rdma.c
+++ b/rdma/rdma.c
@@ -15,7 +15,7 @@
 static void help(char *name)
 {
 	pr_out("Usage: %s [ OPTIONS ] OBJECT { COMMAND | help }\n"
-	       "where  OBJECT := { help }\n"
+	       "where  OBJECT := { dev | help }\n"
 	       "       OPTIONS := { -V[ersion] | -d[etails]}\n", name);
 }

@@ -30,6 +30,7 @@ static int rd_cmd(struct rd *rd)
 	const struct rd_cmd cmds[] = {
 		{ NULL,		cmd_help },
 		{ "help",	cmd_help },
+		{ "dev",	cmd_dev },
 		{ 0 }
 	};

diff --git a/rdma/rdma.h b/rdma/rdma.h
index c1ef1059..8037e2e6 100644
--- a/rdma/rdma.h
+++ b/rdma/rdma.h
@@ -22,10 +22,14 @@
 #include <rdma/rdma_netlink.h>

 #include "list.h"
+#include "utils.h"

 #define pr_err(args...) fprintf(stderr, ##args)
 #define pr_out(args...) fprintf(stdout, ##args)

+#define RDMA_BITMAP_ENUM(name, bit_no) RDMA_BITMAP_##name = BIT(bit_no),
+#define RDMA_BITMAP_NAMES(name, bit_no) [bit_no] = #name,
+
 struct dev_map {
 	struct list_head list;
 	char *dev_name;
@@ -39,6 +43,8 @@ struct rd {
 	char *filename;
 	bool show_details;
 	struct list_head dev_map_list;
+	uint32_t dev_idx;
+	uint32_t port_idx;
 	struct mnl_socket *nl;
 	struct nlmsghdr *nlh;
 	char *buff;
@@ -55,12 +61,23 @@ struct rd_cmd {
 bool rd_no_arg(struct rd *rd);
 void rd_arg_inc(struct rd *rd);

+char *rd_argv(struct rd *rd);
+uint32_t get_port_from_argv(struct rd *rd);
+
+void rd_print_u64(char *name, uint64_t val);
+/*
+ * Commands interface
+ */
+int cmd_dev(struct rd *rd);
+int cmd_link(struct rd *rd);
 int rd_exec_cmd(struct rd *rd, const struct rd_cmd *c, const char *str);

 /*
  * Device manipulation
  */
 void rd_free_devmap(struct rd *rd);
+struct dev_map *dev_map_lookup(struct rd *rd, bool allow_port_index);
+struct dev_map *_dev_map_lookup(struct rd *rd, const char *dev_name);

 /*
  * Netlink
diff --git a/rdma/utils.c b/rdma/utils.c
index 9bd7418f..0e32eefe 100644
--- a/rdma/utils.c
+++ b/rdma/utils.c
@@ -16,7 +16,7 @@ static int rd_argc(struct rd *rd)
 	return rd->argc;
 }

-static char *rd_argv(struct rd *rd)
+char *rd_argv(struct rd *rd)
 {
 	if (!rd_argc(rd))
 		return NULL;
@@ -50,6 +50,23 @@ bool rd_no_arg(struct rd *rd)
 	return rd_argc(rd) == 0;
 }

+uint32_t get_port_from_argv(struct rd *rd)
+{
+	char *slash;
+
+	slash = strchr(rd_argv(rd), '/');
+	/* if no port found, return 0 */
+	return slash ? atoi(slash + 1) : 0;
+}
+
+void rd_print_u64(char *name, uint64_t val)
+{
+	uint16_t vp[4];
+
+	memcpy(vp, &val, sizeof(uint64_t));
+	pr_out("%s %04x:%04x:%04x:%04x ", name, vp[3], vp[2], vp[1], vp[0]);
+}
+
 static struct dev_map *dev_map_alloc(const char *dev_name)
 {
 	struct dev_map *dev_map;
@@ -83,8 +100,14 @@ static void dev_map_cleanup(struct rd *rd)
 }

 static const enum mnl_attr_data_type nldev_policy[RDMA_NLDEV_ATTR_MAX] = {
+	[RDMA_NLDEV_ATTR_DEV_INDEX] = MNL_TYPE_U32,
 	[RDMA_NLDEV_ATTR_DEV_NAME] = MNL_TYPE_NUL_STRING,
 	[RDMA_NLDEV_ATTR_PORT_INDEX] = MNL_TYPE_U32,
+	[RDMA_NLDEV_ATTR_CAP_FLAGS] = MNL_TYPE_U64,
+	[RDMA_NLDEV_ATTR_FW_VERSION] = MNL_TYPE_NUL_STRING,
+	[RDMA_NLDEV_ATTR_NODE_GUID] = MNL_TYPE_U64,
+	[RDMA_NLDEV_ATTR_SYS_IMAGE_GUID] = MNL_TYPE_U64,
+	[RDMA_NLDEV_ATTR_DEV_NODE_TYPE] = MNL_TYPE_U8,
 };

 int rd_attr_cb(const struct nlattr *attr, void *data)
@@ -215,3 +238,32 @@ int rd_recv_msg(struct rd *rd, mnl_cb_t callback, void *data, unsigned int seq)
 	mnl_socket_close(rd->nl);
 	return ret;
 }
+
+struct dev_map *_dev_map_lookup(struct rd *rd, const char *dev_name)
+{
+	struct dev_map *dev_map;
+
+	list_for_each_entry(dev_map, &rd->dev_map_list, list)
+		if (strcmp(dev_name, dev_map->dev_name) == 0)
+			return dev_map;
+
+	return NULL;
+}
+
+struct dev_map *dev_map_lookup(struct rd *rd, bool allow_port_index)
+{
+	struct dev_map *dev_map;
+	char *dev_name;
+	char *slash;
+
+	dev_name = strdup(rd_argv(rd));
+	if (allow_port_index) {
+		slash = strrchr(dev_name, '/');
+		if (slash)
+			*slash = '\0';
+	}
+
+	dev_map = _dev_map_lookup(rd, dev_name);
+	free(dev_name);
+	return dev_map;
+}

^ permalink raw reply related

* [PATCH REPOST v5 iproute2 2/8] rdma: Add basic infrastructure for RDMA tool
From: Leon Romanovsky @ 2017-08-17  6:56 UTC (permalink / raw)
  To: Doug Ledford, Stephen Hemminger
  Cc: linux-rdma, Leon Romanovsky, Dennis Dalessandro, Jason Gunthorpe,
	Jiri Pirko, Ariel Almog, David Laight, Linux Netdev
In-Reply-To: <20170817065614.1393-1-leonro@mellanox.com>

RDMA devices are cross-functional on the one hand, but highly tailored
to specific markets on the other.

This diversity has spread RDMA-related configuration across various
tools, e.g. devlink, ip, ethtool, IB-specific and vendor-specific
solutions.

This patch adds the ability to fill in device and port information
by reading RDMA netlink.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 Makefile        |   2 +-
 rdma/.gitignore |   1 +
 rdma/Makefile   |  22 ++++++
 rdma/rdma.c     | 116 ++++++++++++++++++++++++++++++
 rdma/rdma.h     |  73 +++++++++++++++++++
 rdma/utils.c    | 217 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 430 insertions(+), 1 deletion(-)
 create mode 100644 rdma/.gitignore
 create mode 100644 rdma/Makefile
 create mode 100644 rdma/rdma.c
 create mode 100644 rdma/rdma.h
 create mode 100644 rdma/utils.c

diff --git a/Makefile b/Makefile
index 1f88f7f5..dbb4a4af 100644
--- a/Makefile
+++ b/Makefile
@@ -49,7 +49,7 @@ WFLAGS += -Wmissing-declarations -Wold-style-definition -Wformat=2
 CFLAGS := $(WFLAGS) $(CCOPTS) -I../include $(DEFINES) $(CFLAGS)
 YACCFLAGS = -d -t -v

-SUBDIRS=lib ip tc bridge misc netem genl tipc devlink man
+SUBDIRS=lib ip tc bridge misc netem genl tipc devlink rdma man

 LIBNETLINK=../lib/libnetlink.a ../lib/libutil.a
 LDLIBS += $(LIBNETLINK)
diff --git a/rdma/.gitignore b/rdma/.gitignore
new file mode 100644
index 00000000..51fb172b
--- /dev/null
+++ b/rdma/.gitignore
@@ -0,0 +1 @@
+rdma
diff --git a/rdma/Makefile b/rdma/Makefile
new file mode 100644
index 00000000..64da2142
--- /dev/null
+++ b/rdma/Makefile
@@ -0,0 +1,22 @@
+include ../Config
+
+ifeq ($(HAVE_MNL),y)
+
+RDMA_OBJ = rdma.o utils.o
+
+TARGETS=rdma
+CFLAGS += $(shell $(PKG_CONFIG) libmnl --cflags)
+LDLIBS += $(shell $(PKG_CONFIG) libmnl --libs)
+
+endif
+
+all:	$(TARGETS) $(LIBS)
+
+rdma:	$(RDMA_OBJ) $(LIBS)
+	$(QUIET_LINK)$(CC) $^ $(LDFLAGS) $(LDLIBS) -o $@
+
+install: all
+	install -m 0755 $(TARGETS) $(DESTDIR)$(SBINDIR)
+
+clean:
+	rm -f $(RDMA_OBJ) $(TARGETS)
diff --git a/rdma/rdma.c b/rdma/rdma.c
new file mode 100644
index 00000000..d850e396
--- /dev/null
+++ b/rdma/rdma.c
@@ -0,0 +1,116 @@
+/*
+ * rdma.c	RDMA tool
+ *
+ *              This program is free software; you can redistribute it and/or
+ *              modify it under the terms of the GNU General Public License
+ *              as published by the Free Software Foundation; either version
+ *              2 of the License, or (at your option) any later version.
+ *
+ * Authors:     Leon Romanovsky <leonro@mellanox.com>
+ */
+
+#include "rdma.h"
+#include "SNAPSHOT.h"
+
+static void help(char *name)
+{
+	pr_out("Usage: %s [ OPTIONS ] OBJECT { COMMAND | help }\n"
+	       "where  OBJECT := { help }\n"
+	       "       OPTIONS := { -V[ersion] | -d[etails]}\n", name);
+}
+
+static int cmd_help(struct rd *rd)
+{
+	help(rd->filename);
+	return 0;
+}
+
+static int rd_cmd(struct rd *rd)
+{
+	const struct rd_cmd cmds[] = {
+		{ NULL,		cmd_help },
+		{ "help",	cmd_help },
+		{ 0 }
+	};
+
+	return rd_exec_cmd(rd, cmds, "object");
+}
+
+static int rd_init(struct rd *rd, int argc, char **argv, char *filename)
+{
+	uint32_t seq;
+	int ret;
+
+	rd->filename = filename;
+	rd->argc = argc;
+	rd->argv = argv;
+	INIT_LIST_HEAD(&rd->dev_map_list);
+	rd->buff = malloc(MNL_SOCKET_BUFFER_SIZE);
+	if (!rd->buff)
+		return -ENOMEM;
+
+	rd_prepare_msg(rd, RDMA_NLDEV_CMD_GET,
+		       &seq, (NLM_F_REQUEST | NLM_F_ACK | NLM_F_DUMP));
+	ret = rd_send_msg(rd);
+	if (ret)
+		return ret;
+
+	return rd_recv_msg(rd, rd_dev_init_cb, rd, seq);
+}
+
+static void rd_free(struct rd *rd)
+{
+	free(rd->buff);
+	rd_free_devmap(rd);
+}
+
+int main(int argc, char **argv)
+{
+	static const struct option long_options[] = {
+		{ "version",		no_argument,		NULL, 'V' },
+		{ "help",		no_argument,		NULL, 'h' },
+		{ "details",		no_argument,		NULL, 'd' },
+		{ NULL, 0, NULL, 0 }
+	};
+	bool show_details = false;
+	char *filename;
+	struct rd rd;
+	int opt;
+	int err;
+
+	filename = basename(argv[0]);
+
+	while ((opt = getopt_long(argc, argv, "Vhd",
+				  long_options, NULL)) >= 0) {
+		switch (opt) {
+		case 'V':
+			printf("%s utility, iproute2-ss%s\n",
+			       filename, SNAPSHOT);
+			return EXIT_SUCCESS;
+		case 'd':
+			show_details = true;
+			break;
+		case 'h':
+			help(filename);
+			return EXIT_SUCCESS;
+		default:
+			pr_err("Unknown option.\n");
+			help(filename);
+			return EXIT_FAILURE;
+		}
+	}
+
+	argc -= optind;
+	argv += optind;
+
+	err = rd_init(&rd, argc, argv, filename);
+	if (err)
+		goto out;
+
+	rd.show_details = show_details;
+	err = rd_cmd(&rd);
+out:
+	/* Always cleanup */
+	rd_free(&rd);
+	return err ? EXIT_FAILURE : EXIT_SUCCESS;
+}
diff --git a/rdma/rdma.h b/rdma/rdma.h
new file mode 100644
index 00000000..c1ef1059
--- /dev/null
+++ b/rdma/rdma.h
@@ -0,0 +1,73 @@
+/*
+ * rdma.h	RDMA tool
+ *
+ *              This program is free software; you can redistribute it and/or
+ *              modify it under the terms of the GNU General Public License
+ *              as published by the Free Software Foundation; either version
+ *              2 of the License, or (at your option) any later version.
+ *
+ * Authors:     Leon Romanovsky <leonro@mellanox.com>
+ */
+#ifndef _RDMA_TOOL_H_
+#define _RDMA_TOOL_H_
+
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <getopt.h>
+#include <libmnl/libmnl.h>
+#include <rdma/rdma_netlink.h>
+#include <time.h>
+#include <rdma/ib_user_verbs.h>
+#include <rdma/rdma_netlink.h>
+
+#include "list.h"
+
+#define pr_err(args...) fprintf(stderr, ##args)
+#define pr_out(args...) fprintf(stdout, ##args)
+
+struct dev_map {
+	struct list_head list;
+	char *dev_name;
+	uint32_t num_ports;
+	uint32_t idx;
+};
+
+struct rd {
+	int argc;
+	char **argv;
+	char *filename;
+	bool show_details;
+	struct list_head dev_map_list;
+	struct mnl_socket *nl;
+	struct nlmsghdr *nlh;
+	char *buff;
+};
+
+struct rd_cmd {
+	const char *cmd;
+	int (*func)(struct rd *rd);
+};
+
+/*
+ * Parser interface
+ */
+bool rd_no_arg(struct rd *rd);
+void rd_arg_inc(struct rd *rd);
+
+int rd_exec_cmd(struct rd *rd, const struct rd_cmd *c, const char *str);
+
+/*
+ * Device manipulation
+ */
+void rd_free_devmap(struct rd *rd);
+
+/*
+ * Netlink
+ */
+int rd_send_msg(struct rd *rd);
+int rd_recv_msg(struct rd *rd, mnl_cb_t callback, void *data, uint32_t seq);
+void rd_prepare_msg(struct rd *rd, uint32_t cmd, uint32_t *seq, uint16_t flags);
+int rd_dev_init_cb(const struct nlmsghdr *nlh, void *data);
+int rd_attr_cb(const struct nlattr *attr, void *data);
+#endif /* _RDMA_TOOL_H_ */
diff --git a/rdma/utils.c b/rdma/utils.c
new file mode 100644
index 00000000..9bd7418f
--- /dev/null
+++ b/rdma/utils.c
@@ -0,0 +1,217 @@
+/*
+ * utils.c	RDMA tool
+ *
+ *              This program is free software; you can redistribute it and/or
+ *              modify it under the terms of the GNU General Public License
+ *              as published by the Free Software Foundation; either version
+ *              2 of the License, or (at your option) any later version.
+ *
+ * Authors:     Leon Romanovsky <leonro@mellanox.com>
+ */
+
+#include "rdma.h"
+
+static int rd_argc(struct rd *rd)
+{
+	return rd->argc;
+}
+
+static char *rd_argv(struct rd *rd)
+{
+	if (!rd_argc(rd))
+		return NULL;
+	return *rd->argv;
+}
+
+static int strcmpx(const char *str1, const char *str2)
+{
+	if (strlen(str1) > strlen(str2))
+		return -1;
+	return strncmp(str1, str2, strlen(str1));
+}
+
+static bool rd_argv_match(struct rd *rd, const char *pattern)
+{
+	if (!rd_argc(rd))
+		return false;
+	return strcmpx(rd_argv(rd), pattern) == 0;
+}
+
+void rd_arg_inc(struct rd *rd)
+{
+	if (!rd_argc(rd))
+		return;
+	rd->argc--;
+	rd->argv++;
+}
+
+bool rd_no_arg(struct rd *rd)
+{
+	return rd_argc(rd) == 0;
+}
+
+static struct dev_map *dev_map_alloc(const char *dev_name)
+{
+	struct dev_map *dev_map;
+
+	dev_map = calloc(1, sizeof(*dev_map));
+	if (!dev_map)
+		return NULL;
+	dev_map->dev_name = strdup(dev_name);
+
+	return dev_map;
+}
+
+static void dev_map_free(struct dev_map *dev_map)
+{
+	if (!dev_map)
+		return;
+
+	free(dev_map->dev_name);
+	free(dev_map);
+}
+
+static void dev_map_cleanup(struct rd *rd)
+{
+	struct dev_map *dev_map, *tmp;
+
+	list_for_each_entry_safe(dev_map, tmp,
+				 &rd->dev_map_list, list) {
+		list_del(&dev_map->list);
+		dev_map_free(dev_map);
+	}
+}
+
+static const enum mnl_attr_data_type nldev_policy[RDMA_NLDEV_ATTR_MAX] = {
+	[RDMA_NLDEV_ATTR_DEV_NAME] = MNL_TYPE_NUL_STRING,
+	[RDMA_NLDEV_ATTR_PORT_INDEX] = MNL_TYPE_U32,
+};
+
+int rd_attr_cb(const struct nlattr *attr, void *data)
+{
+	const struct nlattr **tb = data;
+	int type;
+
+	if (mnl_attr_type_valid(attr, RDMA_NLDEV_ATTR_MAX) < 0)
+		return MNL_CB_ERROR;
+
+	type = mnl_attr_get_type(attr);
+
+	if (mnl_attr_validate(attr, nldev_policy[type]) < 0)
+		return MNL_CB_ERROR;
+
+	tb[type] = attr;
+	return MNL_CB_OK;
+}
+
+int rd_dev_init_cb(const struct nlmsghdr *nlh, void *data)
+{
+	struct nlattr *tb[RDMA_NLDEV_ATTR_MAX] = {};
+	struct dev_map *dev_map;
+	struct rd *rd = data;
+	const char *dev_name;
+
+	mnl_attr_parse(nlh, 0, rd_attr_cb, tb);
+	if (!tb[RDMA_NLDEV_ATTR_DEV_NAME] || !tb[RDMA_NLDEV_ATTR_DEV_INDEX])
+		return MNL_CB_ERROR;
+	if (!tb[RDMA_NLDEV_ATTR_PORT_INDEX]) {
+		pr_err("This tool doesn't support switches yet\n");
+		return MNL_CB_ERROR;
+	}
+
+	dev_name = mnl_attr_get_str(tb[RDMA_NLDEV_ATTR_DEV_NAME]);
+
+	dev_map = dev_map_alloc(dev_name);
+	if (!dev_map)
+		/* The main function will cleanup the allocations */
+		return MNL_CB_ERROR;
+	list_add_tail(&dev_map->list, &rd->dev_map_list);
+
+	dev_map->num_ports = mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]);
+	dev_map->idx = mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
+	return MNL_CB_OK;
+}
+
+void rd_free_devmap(struct rd *rd)
+{
+	if (!rd)
+		return;
+	dev_map_cleanup(rd);
+}
+
+int rd_exec_cmd(struct rd *rd, const struct rd_cmd *cmds, const char *str)
+{
+	const struct rd_cmd *c;
+
+	/* First argument in objs table is default variant */
+	if (rd_no_arg(rd))
+		return cmds->func(rd);
+
+	for (c = cmds + 1; c->cmd; ++c) {
+		if (rd_argv_match(rd, c->cmd)) {
+			/* Move to next argument */
+			rd_arg_inc(rd);
+			return c->func(rd);
+		}
+	}
+
+	pr_err("Unknown %s '%s'.\n", str, rd_argv(rd));
+	return 0;
+}
+
+void rd_prepare_msg(struct rd *rd, uint32_t cmd, uint32_t *seq, uint16_t flags)
+{
+	*seq = time(NULL);
+
+	rd->nlh = mnl_nlmsg_put_header(rd->buff);
+	rd->nlh->nlmsg_type = RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, cmd);
+	rd->nlh->nlmsg_seq = *seq;
+	rd->nlh->nlmsg_flags = flags;
+}
+
+int rd_send_msg(struct rd *rd)
+{
+	int ret;
+
+	rd->nl = mnl_socket_open(NETLINK_RDMA);
+	if (!rd->nl) {
+		pr_err("Failed to open NETLINK_RDMA socket\n");
+		return -ENODEV;
+	}
+
+	ret = mnl_socket_bind(rd->nl, 0, MNL_SOCKET_AUTOPID);
+	if (ret < 0) {
+		pr_err("Failed to bind socket with err %d\n", ret);
+		goto err;
+	}
+
+	ret = mnl_socket_sendto(rd->nl, rd->nlh, rd->nlh->nlmsg_len);
+	if (ret < 0) {
+		pr_err("Failed to send to socket with err %d\n", ret);
+		goto err;
+	}
+	return 0;
+
+err:
+	mnl_socket_close(rd->nl);
+	return ret;
+}
+
+int rd_recv_msg(struct rd *rd, mnl_cb_t callback, void *data, unsigned int seq)
+{
+	int ret;
+	unsigned int portid;
+	char buf[MNL_SOCKET_BUFFER_SIZE];
+
+	portid = mnl_socket_get_portid(rd->nl);
+	do {
+		ret = mnl_socket_recvfrom(rd->nl, buf, sizeof(buf));
+		if (ret <= 0)
+			break;
+
+		ret = mnl_cb_run(buf, ret, seq, portid, callback, data);
+	} while (ret > 0);
+
+	mnl_socket_close(rd->nl);
+	return ret;
+}

^ permalink raw reply related

* [PATCH REPOST v5 iproute2 1/8] utils: Move BIT macro to common header
From: Leon Romanovsky @ 2017-08-17  6:56 UTC (permalink / raw)
  To: Doug Ledford, Stephen Hemminger
  Cc: linux-rdma, Leon Romanovsky, Dennis Dalessandro, Jason Gunthorpe,
	Jiri Pirko, Ariel Almog, David Laight, Linux Netdev
In-Reply-To: <20170817065614.1393-1-leonro@mellanox.com>

The BIT() macro has so far been implemented and used only by devlink,
but the following rdmatool patches will reuse it, so move it to a common
header file.
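
For readers unfamiliar with the macro, a minimal sketch of how BIT() is
typically used to define and test option flags. The EXAMPLE_OPT_* names
and the two helper functions are hypothetical; only the BIT() definition
itself is taken from the patch.

```c
/* Same definition as the macro moved by this patch. */
#define BIT(nr)                 (1UL << (nr))

/* Hypothetical option flags, in the style of devlink's DL_OPT_* defines. */
#define EXAMPLE_OPT_HANDLE	BIT(0)
#define EXAMPLE_OPT_PORT	BIT(1)
#define EXAMPLE_OPT_VERBOSE	BIT(2)

/* Return opts with flag set. */
static unsigned long example_opt_set(unsigned long opts, unsigned long flag)
{
	return opts | flag;
}

/* Return non-zero if flag is set in opts. */
static int example_opt_test(unsigned long opts, unsigned long flag)
{
	return (opts & flag) != 0;
}
```

Because the macro shifts 1UL, it is only safe for bit numbers below the
width of unsigned long, which is 32 bits on some platforms.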

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 devlink/devlink.c | 2 +-
 include/utils.h   | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/devlink/devlink.c b/devlink/devlink.c
index f9bc16c3..7602970b 100644
--- a/devlink/devlink.c
+++ b/devlink/devlink.c
@@ -25,6 +25,7 @@
 #include "list.h"
 #include "mnlg.h"
 #include "json_writer.h"
+#include "utils.h"

 #define ESWITCH_MODE_LEGACY "legacy"
 #define ESWITCH_MODE_SWITCHDEV "switchdev"
@@ -160,7 +161,6 @@ static void ifname_map_free(struct ifname_map *ifname_map)
 	free(ifname_map);
 }

-#define BIT(nr)                 (1UL << (nr))
 #define DL_OPT_HANDLE		BIT(0)
 #define DL_OPT_HANDLEP		BIT(1)
 #define DL_OPT_PORT_TYPE	BIT(2)
diff --git a/include/utils.h b/include/utils.h
index 6080b962..7a3b3fd2 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -195,6 +195,8 @@ static inline void __jiffies_to_tv(struct timeval *tv, unsigned long jiffies)
 int print_timestamp(FILE *fp);
 void print_nlmsg_timestamp(FILE *fp, const struct nlmsghdr *n);

+#define BIT(nr)                 (1UL << (nr))
+
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

 #define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

^ permalink raw reply related

* [PATCH REPOST v5 iproute2 0/8] RDMAtool
From: Leon Romanovsky @ 2017-08-17  6:56 UTC (permalink / raw)
  To: Doug Ledford, Stephen Hemminger
  Cc: linux-rdma, Leon Romanovsky,
	Dennis Dalessandro, Jason Gunthorpe, Jiri Pirko, Ariel Almog,
	David Laight, Linux Netdev

This is the fifth revision of the series implementing RDMAtool, a tool
to configure RDMA devices.

It looks like everyone who was interested in reading the cover letter has
already done so, so I'll start with the changelog:

Changelog:
v4->v5:
 * Rebased to latest net-next branch
 * Moved BIT() macro from devlink to general utils.h file - Patch #1.
 * Changed the order of patches - moved man pages to be last patch.
 * Rewrote all switch->case->return_string constructions to be static
   tables with help of David's macro magic. Thanks a lot.
 * Dropped dependency on exported device and port properties. Now the tool
   depends on RDMA netlink only and all needed code is already in Doug's
   for-next.
 * Added two OPA-specific physical link states. Because their names, TEST
   and OFFLINE, are too broad, I named them OPA_TEST and OPA_OFFLINE.
v3->v4:
 * Rebased to latest net-next branch
 * Added JSON output -j (json) and -p (pretty output)
 * Exported and reused kernel UAPIs and defines instead of hard coded
   version.
v2->v3:
 * Removed MAX()
 * Reduced scope of rd_argv_match
 * Removed return from rdma_free_devmap
 * Added extra break at rdma_send_msg
v1->v2:
 * Squashed multiple (and similar) patches to be one patch for dev object
   and one patch for link object.
 * Removed port_map struct
 * Removed the global netlink dump during initialization. This removes the
   need to store intermediate variables and reuses netlink's ability to
   signal whether a variable exists.
 * Added "-d" --details option and put all CAPs under it.

v0->v1:
 * Moved hunk with changes in man/Makefile from first patch to the last patch
 * Removed the "unknown command" from the examples in commit messages
 * Removed special "caps" parsing command and put it to be part of general "show" command
 * Changed parsed capability format to be similar to iproute2 suite
 * Added FW version as an output of show command.
 * Added forgotten CAP_FLAGS to the nla_policy list
RFC->v0:
 * Removed everything that is not implemented yet.
 * Abandoned sysfs interfaces in favor of netlink.
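
The static-table approach mentioned in the v4->v5 notes can be sketched
as follows. This is a minimal illustration, not the code from the series:
the EXAMPLE_* macros, the enum and example_link_state_to_str() are
hypothetical names, and only a few states are listed. A single list of
(name, value) pairs is expanded twice, once into an enum and once into a
table of strings indexed by value.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* One list of (name, value) pairs drives both the enum and the table. */
#define EXAMPLE_LINK_STATES(x) \
	x(NOP, 0) \
	x(DOWN, 1) \
	x(INIT, 2) \
	x(ARMED, 3) \
	x(ACTIVE, 4)

/* Expands each pair into an enumerator. */
#define EXAMPLE_ENUM(name, val)	EXAMPLE_STATE_##name = (val),
/* Expands each pair into a designated initializer holding the name. */
#define EXAMPLE_NAME(name, val)	[val] = #name,

enum example_link_state { EXAMPLE_LINK_STATES(EXAMPLE_ENUM) };

/* Look up the name of a state; unknown values map to "UNKNOWN". */
static const char *example_link_state_to_str(uint8_t state)
{
	static const char * const names[] = {
		EXAMPLE_LINK_STATES(EXAMPLE_NAME)
	};

	if (state < EXAMPLE_ARRAY_SIZE(names) && names[state])
		return names[state];
	return "UNKNOWN";
}
```

Compared with a switch statement, adding a state means adding one line
to the list, and the bounds check covers values the table does not know.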

-----
The initial proposal was sent as RFC [1] and was based on sysfs entries as POC.

The current series was rewritten completely to use RDMA netlink for
user<->kernel communication. To achieve that, RDMA netlink was
extensively refactored and modernized [2, 3, 4 and 5].

Doug's for-next tag includes most of the patches needed for this tool.

The following is an example of various runs on my machine with 5 devices
(4 in IB mode and one in Ethernet mode).

### Without parameters
$ rdma
Usage: rdma [ OPTIONS ] OBJECT { COMMAND | help }
where  OBJECT := { dev | link | help }
       OPTIONS := { -V[ersion] | -d[etails] | -j[son] | -p[retty]}

### With unspecified device name
$ rdma dev
1: mlx5_0: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:3457 sys_image_guid 5254:00c0:fe12:3457
2: mlx5_1: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:3458 sys_image_guid 5254:00c0:fe12:3458
3: mlx5_2: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:3459 sys_image_guid 5254:00c0:fe12:3459
4: mlx5_3: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:345a sys_image_guid 5254:00c0:fe12:345a
5: mlx5_4: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:345b sys_image_guid 5254:00c0:fe12:345b

### Detailed mode
$ rdma -d dev
1: mlx5_0: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:3457 sys_image_guid 5254:00c0:fe12:3457
    caps: <BAD_PKEY_CNTR, BAD_QKEY_CNTR, CHANGE_PHY_POR, PORT_ACTIVE_EVENT, SYS_IMAGE_GUID, RC_RNR_NAK_GEN, MEM_WINDOW, UD_IP_CSUM, UD_TSO, XRC, MEM_MGT_EXTENSIONS, BLOCK_MULTICAST_LOOPBACK, MEM_WINDOW_TYPE_2B, RAW_IP_CSUM, MANAGED_FLOW_STEERING, RESIZE_MAX_WR>
2: mlx5_1: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:3458 sys_image_guid 5254:00c0:fe12:3458
    caps: <BAD_PKEY_CNTR, BAD_QKEY_CNTR, CHANGE_PHY_POR, PORT_ACTIVE_EVENT, SYS_IMAGE_GUID, RC_RNR_NAK_GEN, MEM_WINDOW, UD_IP_CSUM, UD_TSO, XRC, MEM_MGT_EXTENSIONS, BLOCK_MULTICAST_LOOPBACK, MEM_WINDOW_TYPE_2B, RAW_IP_CSUM, MANAGED_FLOW_STEERING, RESIZE_MAX_WR>
3: mlx5_2: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:3459 sys_image_guid 5254:00c0:fe12:3459
    caps: <BAD_PKEY_CNTR, BAD_QKEY_CNTR, CHANGE_PHY_POR, PORT_ACTIVE_EVENT, SYS_IMAGE_GUID, RC_RNR_NAK_GEN, MEM_WINDOW, UD_IP_CSUM, UD_TSO, XRC, MEM_MGT_EXTENSIONS, BLOCK_MULTICAST_LOOPBACK, MEM_WINDOW_TYPE_2B, RAW_IP_CSUM, MANAGED_FLOW_STEERING, RESIZE_MAX_WR>
4: mlx5_3: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:345a sys_image_guid 5254:00c0:fe12:345a
    caps: <BAD_PKEY_CNTR, BAD_QKEY_CNTR, CHANGE_PHY_POR, PORT_ACTIVE_EVENT, SYS_IMAGE_GUID, RC_RNR_NAK_GEN, MEM_WINDOW, UD_IP_CSUM, UD_TSO, XRC, MEM_MGT_EXTENSIONS, BLOCK_MULTICAST_LOOPBACK, MEM_WINDOW_TYPE_2B, RAW_IP_CSUM, MANAGED_FLOW_STEERING, RESIZE_MAX_WR>
5: mlx5_4: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:345b sys_image_guid 5254:00c0:fe12:345b
    caps: <BAD_PKEY_CNTR, BAD_QKEY_CNTR, CHANGE_PHY_POR, PORT_ACTIVE_EVENT, SYS_IMAGE_GUID, RC_RNR_NAK_GEN, MEM_WINDOW, UD_IP_CSUM, UD_TSO, XRC, MEM_MGT_EXTENSIONS, BLOCK_MULTICAST_LOOPBACK, MEM_WINDOW_TYPE_2B, RAW_IP_CSUM, MANAGED_FLOW_STEERING, RESIZE_MAX_WR>

### Specific device
$ rdma dev show mlx5_4
5: mlx5_4: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:345b sys_image_guid 5254:00c0:fe12:345b

### Specific device in detailed mode
$ rdma dev show mlx5_4 -d
5: mlx5_4: node_type ca fw 2.8.9999 node_guid 5254:00c0:fe12:345b sys_image_guid 5254:00c0:fe12:345b
    caps: <BAD_PKEY_CNTR, BAD_QKEY_CNTR, CHANGE_PHY_POR, PORT_ACTIVE_EVENT, SYS_IMAGE_GUID, RC_RNR_NAK_GEN, MEM_WINDOW, UD_IP_CSUM, UD_TSO, XRC, MEM_MGT_EXTENSIONS, BLOCK_MULTICAST_LOOPBACK, MEM_WINDOW_TYPE_2B, RAW_IP_CSUM, MANAGED_FLOW_STEERING, RESIZE_MAX_WR>

### Unknown command (caps)
$ rdma dev show mlx5_4 caps
Unknown parameter 'caps'.

### Link properties without device name
$ rdma link
1/1: mlx5_0/1: subnet_prefix fe80:0000:0000:0000 lid 13399 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
2/1: mlx5_1/1: subnet_prefix fe80:0000:0000:0000 lid 13400 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
3/1: mlx5_2/1: subnet_prefix fe80:0000:0000:0000 lid 13401 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
4/1: mlx5_3/1: state DOWN physical_state DISABLED
5/1: mlx5_4/1: subnet_prefix fe80:0000:0000:0000 lid 13403 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP

### Link properties in detailed mode
$ rdma link -d
1/1: mlx5_0/1: subnet_prefix fe80:0000:0000:0000 lid 13399 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
    caps: <AUTO_MIGR>
2/1: mlx5_1/1: subnet_prefix fe80:0000:0000:0000 lid 13400 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
    caps: <AUTO_MIGR>
3/1: mlx5_2/1: subnet_prefix fe80:0000:0000:0000 lid 13401 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
    caps: <AUTO_MIGR>
4/1: mlx5_3/1: state DOWN physical_state DISABLED
    caps: <CM, IP_BASED_GIDS>
5/1: mlx5_4/1: subnet_prefix fe80:0000:0000:0000 lid 13403 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
    caps: <AUTO_MIGR>

### All links for specific device
$ rdma link show mlx5_3
1/1: mlx5_0/1: subnet_prefix fe80:0000:0000:0000 lid 13399 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP

### Detailed link properties for specific device
$ rdma link -d show mlx5_3
1/1: mlx5_0/1: subnet_prefix fe80:0000:0000:0000 lid 13399 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
    caps: <AUTO_MIGR>

### Specific port for specific device
$ rdma link show mlx5_4/1
1/1: mlx5_0/1: subnet_prefix fe80:0000:0000:0000 lid 13399 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP

### Unknown parameter
$ rdma link show mlx5_4/1 caps
Unknown parameter 'caps'.

Available in the "topic/rdmatool-netlink-v5" topic branch of this git repo:
git://git.kernel.org/pub/scm/linux/kernel/git/leon/iproute2.git

Or for browsing:
https://git.kernel.org/cgit/linux/kernel/git/leon/iproute2.git/log/?h=topic/rdmatool-netlink-v5

Thanks

[1] https://www.spinics.net/lists/linux-rdma/msg49575.html
[2] https://patchwork.kernel.org/patch/9752865/
[3] https://www.spinics.net/lists/linux-rdma/msg50827.html
[4] https://www.spinics.net/lists/linux-rdma/msg51210.html
[5] https://patchwork.kernel.org/patch/9811729/ and https://patchwork.kernel.org/patch/9811731/

Cc: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
Cc: Jiri Pirko <jiri-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Ariel Almog <ariela-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: David Laight <David.Laight-ZS65k/vG3HxXrIkS9f7CXA@public.gmane.org>
Cc: Linux Netdev <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>

Leon Romanovsky (8):
  utils: Move BIT macro to common header
  rdma: Add basic infrastructure for RDMA tool
  rdma: Add dev object
  rdma: Add link object
  rdma: Add json and pretty outputs
  rdma: Implement json output for dev object
  rdma: Add json output to link object
  rdma: Add initial manual for the tool

 Makefile             |   2 +-
 devlink/devlink.c    |   2 +-
 include/utils.h      |   2 +
 man/man8/rdma-dev.8  |  55 +++++++++
 man/man8/rdma-link.8 |  55 +++++++++
 man/man8/rdma.8      | 102 +++++++++++++++
 rdma/.gitignore      |   1 +
 rdma/Makefile        |  22 ++++
 rdma/dev.c           | 284 ++++++++++++++++++++++++++++++++++++++++++
 rdma/link.c          | 343 +++++++++++++++++++++++++++++++++++++++++++++++++++
 rdma/rdma.c          | 143 +++++++++++++++++++++
 rdma/rdma.h          |  93 ++++++++++++++
 rdma/utils.c         | 266 +++++++++++++++++++++++++++++++++++++++
 13 files changed, 1368 insertions(+), 2 deletions(-)
 create mode 100644 man/man8/rdma-dev.8
 create mode 100644 man/man8/rdma-link.8
 create mode 100644 man/man8/rdma.8
 create mode 100644 rdma/.gitignore
 create mode 100644 rdma/Makefile
 create mode 100644 rdma/dev.c
 create mode 100644 rdma/link.c
 create mode 100644 rdma/rdma.c
 create mode 100644 rdma/rdma.h
 create mode 100644 rdma/utils.c

--
2.14.1

^ permalink raw reply

* Re: [PATCH net-next 3/3 v5] drivers: net: ethernet: qualcomm: rmnet: Initial implementation
From: Subash Abhinov Kasiviswanathan @ 2017-08-17  6:38 UTC (permalink / raw)
  To: Dan Williams
  Cc: netdev, davem, fengguang.wu, jiri, stephen, David.Laight, marcel
In-Reply-To: <1502939764.30484.7.camel@redhat.com>

> I'm probably forgetting a bit of the original context.  Say you have
> one of these "network devices in IP mode".  What happens with the
> device *before* you attach an rmnet child?  Can it pass traffic before
> that point at all, or does the modem expect MAP already?

Hi Dan

All traffic needs to be in MAP format only.

>> +	dev_hold(real_dev);
> 
> I could be entirely wrong, but isn't this mostly handled for you if you
> use netdev_upper_dev_link() like ipvlan and macvlan do?  That looks
> like it provides a couple things: (a) handles the dev_hold() for you
> and (b) provides mechanisms to get the upper device if you need it.
> I'm not actually familiar with the "adjacent device" functionality, but
> it looked similar in purpose.

Does this API modify the data path as well, or is it only for establishing
a hierarchy between nodes (which I assume should make cleanup easier)?
Currently, I register with the real_dev and use the rx_handler to intercept
the incoming MAP packets. If netdev_upper_dev_link only modifies the device
refcounting, I should be able to switch to it easily.

> 
>> +	return 0;
>> +}
>> +
>> +static int __rmnet_set_endpoint_config(struct net_device *dev, int
>> config_id,
>> +				       struct rmnet_endpoint *ep)
>> +{
>> +	struct rmnet_endpoint *dev_ep;
>> +
>> +	ASSERT_RTNL();
>> +
>> +	dev_ep = rmnet_get_endpoint(dev, config_id);
>> +
>> +	if (!dev_ep || dev_ep->refcount)
>> +		return -EINVAL;
>> +
>> +	memcpy(dev_ep, ep, sizeof(struct rmnet_endpoint));
>> +	if (config_id == RMNET_LOCAL_LOGICAL_ENDPOINT)
>> +		dev_ep->mux_id = 0;
>> +	else
>> +		dev_ep->mux_id = config_id;
> 
> So... if config_id is LOGICAL_ENDPOINT (-1) this sets mux_id to 0.
> 
> But if config_id is 0, it *also* sets mux_id to 0?
> 
> Can you clarify what mux_id = 0 actually means here?  Can you also talk
> a bit about the difference between local_ep and muxed_ep?
> 

mux_id 0 is for pass-through mode (the testing scenario).
The arriving MAP packets may be shipped as-is to a different net device
(say, one exposed by USB).

While transmitting packets from rmnet dev to real_dev, the local_ep
has the information about the rmnet dev. Based on that, the MAP
header is stamped and the packet is transmitted.

muxed_ep is for receiving packets, where the MAP header is
stripped off and the packet is passed to the appropriate rmnet dev.
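
As a rough illustration of the mux_id question, the quoted if/else can be
modelled in plain userspace C. This is only a sketch: LOCAL_LOGICAL_ENDPOINT
takes the -1 value mentioned in the review, and endpoint_mux_id() is a
hypothetical helper, not a function from the patch.

```c
#include <assert.h>

/* Assumed value: the review refers to the logical endpoint as -1. */
#define LOCAL_LOGICAL_ENDPOINT (-1)

/* Hypothetical helper mirroring the quoted if/else in
 * __rmnet_set_endpoint_config(): the local endpoint is stored with
 * mux_id 0, any other config_id is used directly as the mux_id.
 */
static int endpoint_mux_id(int config_id)
{
	/* Assumed precondition for this sketch: nothing below -1 is valid. */
	assert(config_id >= LOCAL_LOGICAL_ENDPOINT);

	if (config_id == LOCAL_LOGICAL_ENDPOINT)
		return 0;
	return config_id;
}
```

With this mapping, config_id -1 and config_id 0 both end up with mux_id 0,
which is the overlap the review comment points at.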

>> +	case NETDEV_UNREGISTER_FINAL:
> I don't see anywhere that RMNET_INGRESS_FIX_ETHERNET can get set?
> Also, can't that be autodetected?
> 
> 
> Just use ETH_HLEN instead of RMNET_ETHERNET_HEADER_LENGTH.
> 
> But really, I can't see where FIX_ETHERNET ever gets set.  And as
> above, can't this be automatically detected?  Can you describe what the
> issue is here in more detail?
> 
> I know for qmi_wwan.c we had to fix up a number of firmware bugs, but
> all that is done automatically.
> 
The ethernet mode was used earlier for some experimental configurations.

>> +	int egress_format = RMNET_EGRESS_FORMAT_MUXING |
>> +			    RMNET_EGRESS_FORMAT_MAP;
>> +	struct net_device *real_dev;
>> +	int mode = RMNET_EPMODE_VND;
>> +	u16 mux_id;
>> +
>> +	real_dev = __dev_get_by_index(src_net,
>> nla_get_u32(tb[IFLA_LINK]));
>> +	if (!real_dev || !dev)
>> +		return -ENODEV;
>> +
>> +	if (!data[IFLA_VLAN_ID])
> 
> Also, I wasn't thinking to actually *use* IFLA_VLAN_ID, but I'll let
> others weigh in.  It does fit in this case, I think, but maybe creating
> an RMNET-specific attribute would be better?
> 

I have implemented a single message for setting up the device based on mux
in this patchset, so this suffices for me :). Stephen had suggested reusing
existing structs as much as possible.

>> +struct rmnet_map_control_command {
>> +	u8  command_name;
>> +	u8  cmd_type:2;
>> +	u8  reserved:6;
>> +	u16 reserved2;
>> +	u32 transaction_id;
>> +	union {
>> +		u8  data[65528];
> 
> Um....  that seems really, really odd.  Typically this would go below
> the flow_control struct, and actually be:
> 
> u8 data[0];
> 
> To indicate that the struct member should exist and that you can use
> it, but that it has no specific size (since the size will be determined
> by the skb size or by a protocol field instead).
> 
> Thats all for now...
> 
> Dan
> 

I will change this to u8 data[0];
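
For reference, a minimal userspace sketch of the trailing-array idea follows.
The struct and field names are hypothetical and simplified (this is not the
rmnet layout), and it uses the C99 data[] spelling where the review suggests
the older data[0] form.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative layout only; names and fields are hypothetical. */
struct ctrl_cmd {
	uint8_t  command_name;
	uint8_t  reserved[3];
	uint32_t transaction_id;
	uint8_t  data[];	/* flexible array member: adds no size to the struct */
};

/* Allocate a command with room for payload_len trailing bytes. */
static struct ctrl_cmd *ctrl_cmd_alloc(const void *payload, size_t payload_len)
{
	struct ctrl_cmd *cmd;

	assert(payload || payload_len == 0);

	cmd = calloc(1, sizeof(*cmd) + payload_len);
	if (!cmd)
		return NULL;
	if (payload_len)
		memcpy(cmd->data, payload, payload_len);
	return cmd;
}
```

The payload length comes from the caller (in the driver it would come from
the skb or a protocol field), so the struct itself no longer reserves 65528
bytes.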

^ permalink raw reply

* Re: [PATCH net] net: sched: fix NULL pointer dereference when action calls some targets
From: Cong Wang @ 2017-08-17  5:57 UTC (permalink / raw)
  To: Xin Long; +Cc: network dev, David Miller, netfilter-devel, Jamal Hadi Salim
In-Reply-To: <CADvbK_cCQY0McHiZFKSTjdGdAjhB6RBej5n=SH2=hsLYC=Xa7w@mail.gmail.com>

On Wed, Aug 16, 2017 at 1:39 AM, Xin Long <lucien.xin@gmail.com> wrote:
> On Wed, Aug 9, 2017 at 7:33 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> On Mon, Aug 7, 2017 at 7:33 PM, Xin Long <lucien.xin@gmail.com> wrote:
>>> On Tue, Aug 8, 2017 at 9:15 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>> This looks like a completely API burden?
>>> netfilter xt targets are not really compatible with netsched action.
>>> I've got to say, the patch is just a way to make checkentry return
>>> false and avoid panic. like [1] said
>>
>> I don't doubt you fix a crash, I am thinking if we can
>> "fix" the API instead of fixing the caller.
> Hi, Cong,
>
> For now, I don't think it's possible to change APIs or  some of their targets
> for the panic caused by action xt calling.
>
> The common way should be fixed in net_sched side.
>
> Given that the issue is very easy to triggered,
> let's wait for netfilter's replies for another few days,
> otherwise I will repost the fix, agree ?

Yeah, no objections from me.

By the way, do you know how other callers of this API
use 'entryinfo'? Do they pass the address of the struct
on stack too?

^ permalink raw reply

* Re: [net-next PATCH 07/10] bpf: add access to sock fields and pkt data from sk_skb programs
From: Alexei Starovoitov @ 2017-08-17  5:42 UTC (permalink / raw)
  To: John Fastabend, davem, daniel; +Cc: tgraf, netdev, tom
In-Reply-To: <20170816053309.15445.97681.stgit@john-Precision-Tower-5810>

On 8/15/17 10:33 PM, John Fastabend wrote:
> +static int sk_skb_prologue(struct bpf_insn *insn_buf, bool direct_write,
> +			   const struct bpf_prog *prog)
> +{
> +	struct bpf_insn *insn = insn_buf;
> +
> +	if (!direct_write)
> +		return 0;
> +
> +	/* if (!skb->cloned)
> +	 *       goto start;
> +	 *
> +	 * (Fast-path, otherwise approximation that we might be
> +	 *  a clone, do the rest in helper.)
> +	 */

iirc we're doing something similar in the other prologue generator?
can they be consolidated?

^ permalink raw reply

* Re: [net-next PATCH 06/10] bpf: sockmap with sk redirect support
From: Alexei Starovoitov @ 2017-08-17  5:40 UTC (permalink / raw)
  To: John Fastabend, davem, daniel; +Cc: tgraf, netdev, tom
In-Reply-To: <20170816053247.15445.69312.stgit@john-Precision-Tower-5810>

On 8/15/17 10:32 PM, John Fastabend wrote:
> +
> +static void smap_do_verdict(struct smap_psock *psock, struct sk_buff *skb)
> +{
> +	struct sock *sock;
> +	int rc;
> +
> +	/* Because we use per cpu values to feed input from sock redirect
> +	 * in BPF program to do_sk_redirect_map() call we need to ensure we
> +	 * are not preempted. RCU read lock is not sufficient in this case
> +	 * with CONFIG_PREEMPT_RCU enabled so we must be explicit here.
> +	 */
> +	preempt_disable();
> +	rc = smap_verdict_func(psock, skb);
> +	switch (rc) {
> +	case SK_REDIRECT:
> +		sock = do_sk_redirect_map();
> +		preempt_enable();
> +		if (likely(sock)) {
> +			struct smap_psock *peer = smap_psock_sk(sock);
> +
> +			if (likely(peer &&
> +				   test_bit(SMAP_TX_RUNNING, &peer->state) &&
> +				   sk_stream_memory_free(peer->sock))) {
> +				peer->sock->sk_wmem_queued += skb->truesize;
> +				sk_mem_charge(peer->sock, skb->truesize);
> +				skb_queue_tail(&peer->rxqueue, skb);
> +				schedule_work(&peer->tx_work);
> +				break;
> +			}
> +		}
> +	/* Fall through and free skb otherwise */
> +	case SK_DROP:
> +	default:
> +		preempt_enable();
> +		kfree_skb(skb);

two preempt_enable() after single preempt_disable()?
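
To make the imbalance concrete, here is a small userspace model of the quoted
control flow. It is a sketch under assumptions: preempt_depth is a stand-in
counter for the kernel's preempt count, peer_ready collapses the likely()
checks into one flag, and verdict_balanced() is just one possible
restructuring, not necessarily the fix that should be used.

```c
#include <assert.h>
#include <stdbool.h>

enum verdict { SK_DROP, SK_REDIRECT };

static int preempt_depth;	/* stand-in for the kernel's preempt count */

static void preempt_disable(void) { preempt_depth++; }
static void preempt_enable(void)  { preempt_depth--; }

/* Same shape as the quoted smap_do_verdict(); returns the final depth. */
static int verdict_as_posted(enum verdict rc, bool peer_ready)
{
	preempt_depth = 0;
	preempt_disable();
	switch (rc) {
	case SK_REDIRECT:
		preempt_enable();
		if (peer_ready)
			break;		/* skb queued to the peer */
		/* fall through */
	case SK_DROP:
	default:
		preempt_enable();	/* second enable on the fall-through path */
		break;
	}
	return preempt_depth;
}

/* One possible rebalancing: enable exactly once, then act on the outcome. */
static int verdict_balanced(enum verdict rc, bool peer_ready)
{
	bool queued;

	preempt_depth = 0;
	preempt_disable();
	queued = (rc == SK_REDIRECT) && peer_ready;
	preempt_enable();
	(void)queued;	/* real code would queue or free the skb here */
	assert(preempt_depth == 0);
	return preempt_depth;
}
```

In this model the SK_REDIRECT case with an unusable peer leaves the counter
at -1, while every path through the balanced variant ends at 0.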

> +
> +static void smap_tx_work(struct work_struct *w)
> +{
> +	struct smap_psock *psock;
> +	struct sk_buff *skb;
> +	int rem, off, n;
> +
> +	psock = container_of(w, struct smap_psock, tx_work);
> +
> +	/* lock sock to avoid losing sk_socket at some point during loop */
> +	lock_sock(psock->sock);
> +	if (psock->save_skb) {
> +		skb = psock->save_skb;
> +		rem = psock->save_rem;
> +		off = psock->save_off;
> +		psock->save_skb = NULL;
> +		goto start;
> +	}
> +
> +	while ((skb = skb_dequeue(&psock->rxqueue))) {
> +		rem = skb->len;
> +		off = 0;
> +start:
> +		do {
> +			if (likely(psock->sock->sk_socket))
> +				n = skb_send_sock_locked(psock->sock,
> +							 skb, off, rem);

so this will be a hot loop?
Do you have perf report by any chance? Curious how it looks.

> +	/* reserve BPF programs early so can abort easily on failures */
> +	if (map_flags & BPF_SOCKMAP_STRPARSER) {

why have two 'flags' arguments and a new helper just for this?
can the normal update() be used, with extra flag bits there?

> -#define BPF_PROG_ATTACH_LAST_FIELD attach_flags
> +#define BPF_PROG_ATTACH_LAST_FIELD attach_bpf_fd2

> +	prog1 = bpf_prog_get_type(attr->attach_bpf_fd, ptype);
> +	if (IS_ERR(prog1)) {
> +		fdput(f);
> +		return PTR_ERR(prog1);
> +	}
> +
> +	prog2 = bpf_prog_get_type(attr->attach_bpf_fd2, ptype);

could you add a comment to the uapi on possible uses of this field?
otherwise the name is not readable.

^ permalink raw reply

* Re: Regression: Bug 196547 - Since 4.12 - bonding module not working with wireless drivers
From: Jay Vosburgh @ 2017-08-17  5:33 UTC (permalink / raw)
  To: Dan Williams
  Cc: David Miller, james-fvV4AYHggTTR7s880joybQ,
	futur.andy-gM/Ye1E23mwN+BqQ9rBEUg, kvalo-sgV2jX0FEOL9JmXXK+q4OQ,
	arend.vanspriel-dY08KVG/lbpWk0Htik3J/w,
	maheshb-hpIqsD4AKlfQT0dZR+AlfA, andy-QlMahl40kYEqcZcGjlUOXw,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	greearb-my8/4N5VtI7c+919tysfdA
In-Reply-To: <1502935907.30484.4.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Dan Williams <dcbw-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
[...]
>You'll probably say "aim for the 75% case" or something like that,
>which is fine, but then you're depending on your 75% case to be (a)
>single AP, (b) never move (eg, only bond wifi + ethernet), (c) little
>radio interference.  I'm not sure I'd buy that.  If I've put words in
>your mouth, forgive me.

	The primary use case that I'm aware of for bonding with wireless
devices is in active-backup mode, paired with an ethernet adapter set as
the bonding "primary" device.  I think this is the case that absolutely
should just work.  I don't think bonding cares (or should care) about
(a) - (c) for this use.

	Your point (b) suggests that there are use cases other than the
above; I'm unfamiliar with any use other than wifi + ethernet, can you
elaborate?

	-J

---
	-Jay Vosburgh, jay.vosburgh-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org

^ permalink raw reply

* [PATCH] netfilter: ipset: ipset list may return wrong member count for set with timeout
From: Vishwanath Pai @ 2017-08-17  5:23 UTC (permalink / raw)
  To: pablo, kadlec
  Cc: johunt, pai.vishwain, vpai, netfilter-devel, coreteam, netdev

Simple testcase:

$ ipset create test hash:ip timeout 5
$ ipset add test 1.2.3.4
$ ipset add test 1.2.2.2
$ sleep 5

$ ipset l
Name: test
Type: hash:ip
Revision: 5
Header: family inet hashsize 1024 maxelem 65536 timeout 5
Size in memory: 296
References: 0
Number of entries: 2
Members:

We return "Number of entries: 2" but no members are listed. That is
because mtype_list runs "ip_set_timeout_expired" and does not list the
expired entries, but set->elements is never updated (until mtype_gc
cleans it up later).

Reviewed-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
---
 net/netfilter/ipset/ip_set_hash_gen.h | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index f236c0b..ff3d31c 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -1041,12 +1041,22 @@ struct htype {
 static int
 mtype_head(struct ip_set *set, struct sk_buff *skb)
 {
-	const struct htype *h = set->data;
+	struct htype *h = set->data;
 	const struct htable *t;
 	struct nlattr *nested;
 	size_t memsize;
 	u8 htable_bits;
 
+	/* If any members have expired, set->elements will be wrong;
+	 * the mtype_expire function will update it with the right count.
+	 * We do not hold set->lock here, so grab it first.
+	 */
+	if (SET_WITH_TIMEOUT(set)) {
+		spin_lock_bh(&set->lock);
+		mtype_expire(set, h);
+		spin_unlock_bh(&set->lock);
+	}
+
 	rcu_read_lock_bh();
 	t = rcu_dereference_bh_nfnl(h->table);
 	memsize = mtype_ahash_memsize(h, t) + set->ext_size;
-- 
1.9.1

^ permalink raw reply related

* Re: [PATCH net RESEND] PCI: fix oops when try to find Root Port for a PCI device
From: Michael Ellerman @ 2017-08-17  5:12 UTC (permalink / raw)
  To: Thierry Reding, Bjorn Helgaas
  Cc: Ding Tianhong, mark.rutland, gabriele.paoloni, asit.k.mallick,
	catalin.marinas, will.deacon, linuxarm, alexander.duyck,
	ashok.raj, eric.dumazet, jeffrey.t.kirsher, linux-pci, ganeshgr,
	Bob.Shaw, leedom, patrick.j.cramer, bhelgaas, werner,
	linux-arm-kernel, amira, netdev, linux-kernel, David.Laight,
	Suravee.Suthikulpanit, robin.murphy, davem, l.stach
In-Reply-To: <20170816193303.GA14147@ulmo>

Thierry Reding <thierry.reding@gmail.com> writes:
...
>
> In case of Tegra, dev actually points to the root port. Now if I read
> the above code correctly, highest_pcie_bridge will still be NULL in that
> case, which in turn will return NULL from pci_find_pcie_root_port(). But
> shouldn't it really return dev?
>
> The patch that I used to fix the issue is this:
>
> --->8---
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 2c712dcfd37d..dd56c1c05614 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -514,7 +514,7 @@ EXPORT_SYMBOL(pci_find_resource);
>   */
>  struct pci_dev *pci_find_pcie_root_port(struct pci_dev *dev)
>  {
> -       struct pci_dev *bridge, *highest_pcie_bridge = NULL;
> +       struct pci_dev *bridge, *highest_pcie_bridge = dev;
>  
>         bridge = pci_upstream_bridge(dev);
>         while (bridge && pci_is_pcie(bridge)) {
> --->8---
>
> That works correctly if this function ends up being called on the PCIe
> root port, though perhaps that's not what this function is supposed to
> do. It's somewhat unclear from the kerneldoc what the function should
> be doing when called on a root port device itself.

That also works for me on powerpc (oops reported up thread).
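
For anyone comparing the two fixes, here is a userspace sketch of the walk.
struct fake_dev and its fields are hypothetical stand-ins for struct pci_dev,
and the start argument models the initial value of highest_pcie_bridge: NULL
(as with the NULL check in Ding's patch) or dev (as in Thierry's variant).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for struct pci_dev; all fields are hypothetical. */
struct fake_dev {
	struct fake_dev *upstream;	/* NULL when sitting on a root bus */
	bool is_pcie;
	bool is_root_port;
};

/* Walk upstream while the bridge is PCIe, remembering the highest one. */
static struct fake_dev *find_root_port(struct fake_dev *dev,
				       struct fake_dev *start)
{
	struct fake_dev *bridge, *highest = start;

	assert(dev != NULL);

	bridge = dev->upstream;
	while (bridge && bridge->is_pcie) {
		highest = bridge;
		bridge = bridge->upstream;
	}

	/* NULL check first, so a device with no PCIe bridge cannot oops. */
	if (!highest || !highest->is_root_port)
		return NULL;
	return highest;
}
```

With start == NULL, calling this on a root port returns NULL and the caller
has to cope with that; with start == dev, the root port itself is returned.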

cheers

^ permalink raw reply

* Re: [PATCH net RESEND] PCI: fix oops when try to find Root Port for a PCI device
From: Michael Ellerman @ 2017-08-17  4:59 UTC (permalink / raw)
  To: Ding Tianhong, leedom, ashok.raj, bhelgaas, helgaas, werner,
	ganeshgr, asit.k.mallick, patrick.j.cramer, Suravee.Suthikulpanit,
	Bob.Shaw, l.stach, amira, gabriele.paoloni, David.Laight,
	jeffrey.t.kirsher, catalin.marinas, will.deacon, mark.rutland,
	robin.murphy, davem, alexander.duyck, eric.dumazet,
	linux-arm-kernel, netdev, linux-pci, linux-kernel, linuxarm,
	linuxppc-dev
  Cc: Ding Tianhong
In-Reply-To: <1502810688-12420-1-git-send-email-dingtianhong@huawei.com>

Ding Tianhong <dingtianhong@huawei.com> writes:

> Eric report a oops when booting the system after applying
> the commit a99b646afa8a ("PCI: Disable PCIe Relaxed..."):

I'm seeing a similar oops on powerpc:

[    0.177242] pci_bus 0015:70: root bus resource [bus 70-ff]
[    0.178012] Unable to handle kernel paging request for data at address 0x00000050
[    0.178017] Faulting instruction address: 0xc0000000005f84b4
[    0.178022] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.178024] SMP NR_CPUS=2048 
[    0.178025] NUMA 
[    0.178028] pSeries
[    0.178031] Modules linked in:
[    0.178036] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.13.0-rc4-gcc-6.3.1-00167-ga99b646afa8a #407
[    0.178040] task: c0000003f7400000 task.stack: c0000003f7480000
[    0.178043] NIP: c0000000005f84b4 LR: c0000000005f5ccc CTR: 0000000000000000
[    0.178046] REGS: c0000003f74836d0 TRAP: 0380   Tainted: G        W        (4.13.0-rc4-gcc-6.3.1-00167-ga99b646afa8a)
[    0.178050] MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>
[    0.178057]   CR: 48000842  XER: 2000000f
[    0.178061] CFAR: c0000000005f840c SOFTE: 1 
[    0.178061] GPR00: c0000000005f5cb4 c0000003f7483950 c000000000fa0000 0000000000000000 
[    0.178061] GPR04: 0000000000000001 0000000000000028 c0000003f7483820 f000000000ff6360 
[    0.178061] GPR08: 00000003fe2f0000 0000000000000000 c0000003f5759000 0000000002001001 
[    0.178061] GPR12: 0000000000000010 c00000000fd80000 c00000000000db08 0000000000000000 
[    0.178061] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[    0.178061] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[    0.178061] GPR24: 0000000000000000 c000000000c5f680 c0000003f756b678 c0000003f5759000 
[    0.178061] GPR28: 0000000000000030 c0000003f756b098 c0000003f5759000 c0000003f756b000 
[    0.178110] NIP [c0000000005f84b4] pci_find_pcie_root_port+0xb4/0xd0
[    0.178114] LR [c0000000005f5ccc] pci_device_add+0x32c/0x470
[    0.178117] Call Trace:
[    0.178120] [c0000003f7483950] [c0000000005f5cb4] pci_device_add+0x314/0x470 (unreliable)
[    0.178126] [c0000003f74839f0] [c00000000005b85c] of_create_pci_dev+0x35c/0x400
[    0.178130] [c0000003f7483ab0] [c00000000005ba14] __of_scan_bus+0x114/0x1e0
[    0.178135] [c0000003f7483b20] [c000000000059a9c] pcibios_scan_phb+0x23c/0x270
[    0.178140] [c0000003f7483bc0] [c000000000d8057c] pcibios_init+0x84/0xdc
[    0.178144] [c0000003f7483c40] [c00000000000d680] do_one_initcall+0x60/0x1c0
[    0.178149] [c0000003f7483d00] [c000000000d74454] kernel_init_freeable+0x2c4/0x3a0
[    0.178153] [c0000003f7483dc0] [c00000000000db24] kernel_init+0x24/0x150
[    0.178158] [c0000003f7483e30] [c00000000000bc28] ret_from_kernel_thread+0x5c/0xb4

...


And the patch below fixes it. Thanks.

cheers

> ====================== cut here =============================
>
> It looks like the pci_find_pcie_root_port() was trying to
> find the Root Port for the PCI device which is the Root
> Port already, it will return NULL and trigger the problem,
> so check the highest_pcie_bridge to fix thie problem.
>
> Fixes: a99b646afa8a ("PCI: Disable PCIe Relaxed Ordering if unsupported")
> Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
> ---
>  drivers/pci/pci.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index af0cc34..7e2022f 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -522,7 +522,8 @@ struct pci_dev *pci_find_pcie_root_port(struct pci_dev *dev)
>  		bridge = pci_upstream_bridge(bridge);
>  	}
>  
> -	if (pci_pcie_type(highest_pcie_bridge) != PCI_EXP_TYPE_ROOT_PORT)
> +	if (highest_pcie_bridge &&
> +	    pci_pcie_type(highest_pcie_bridge) != PCI_EXP_TYPE_ROOT_PORT)
>  		return NULL;
>  
>  	return highest_pcie_bridge;
> -- 
> 1.8.3.1

^ permalink raw reply

* Re: [PATCH] tun: make tun_build_skb() thread safe
From: Jason Wang @ 2017-08-17  3:37 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: davem, netdev, linux-kernel, Eric Dumazet
In-Reply-To: <20170816195342-mutt-send-email-mst@kernel.org>



On 2017年08月17日 00:55, Michael S. Tsirkin wrote:
> On Wed, Aug 16, 2017 at 10:14:33PM +0800, Jason Wang wrote:
>> From: Eric Dumazet<eric.dumazet@gmail.com>
>>
>> tun_build_skb() is not thread safe since it uses per queue page frag,
>> this will break things when multiple threads are sending through same
>> queue. Switch to use per-thread generator (no lock involved).
>>
>> Fixes: 66ccbc9c87c2 ("tap: use build_skb() for small packet")
>> Tested-by: Jason Wang<jasowang@redhat.com>
>> Signed-off-by: Eric Dumazet<eric.dumazet@gmail.com>
>> Signed-off-by: Jason Wang<jasowang@redhat.com>
> Acked-by: Michael S. Tsirkin<mst@redhat.com>
>
> Jason, given the switch to task_frag, would it be worth it to look at
> using higher order allocs along the lines of
> 5640f7685831e088fe6c2e1f863a6805962f8e81 as well?
>

I think we already use high-order allocations, don't we?

Thanks

^ permalink raw reply

* Re: Regression: Bug 196547 - Since 4.12 - bonding module not working with wireless drivers
From: Ben Greear @ 2017-08-17  3:32 UTC (permalink / raw)
  To: Dan Williams, David Miller
  Cc: james-fvV4AYHggTTR7s880joybQ, futur.andy-gM/Ye1E23mwN+BqQ9rBEUg,
	kvalo-sgV2jX0FEOL9JmXXK+q4OQ,
	arend.vanspriel-dY08KVG/lbpWk0Htik3J/w,
	maheshb-hpIqsD4AKlfQT0dZR+AlfA, andy-QlMahl40kYEqcZcGjlUOXw,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1502939894.30484.9.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On 08/16/2017 08:18 PM, Dan Williams wrote:
> On Wed, 2017-08-16 at 19:36 -0700, Ben Greear wrote:
>> On 08/16/2017 07:11 PM, Dan Williams wrote:
>>> On Wed, 2017-08-16 at 14:31 -0700, David Miller wrote:
>>>> From: Dan Williams <dcbw-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>>>> Date: Wed, 16 Aug 2017 16:22:41 -0500
>>>>
>>>>> My biggest suggestion is that perhaps bonding should grow
>>>>
>>>> hysteresis
>>>>> for link speeds. Since WiFi can change speed every packet, you
>>>>
>>>> probably
>>>>> don't want the bond characteristics changing every couple
>>>>> seconds
>>>>
>>>> just
>>>>> in case your WiFi link is jumping around.  Ethernet won't
>>>>> bounce
>>>>
>>>> around
>>>>> that much, so the hysteresis would have no effect there.  Or,
>>>>> if
>>>>
>>>> people
>>>>> are concerned about response time to speed changes on ethernet
>>>>
>>>> (where
>>>>> you probably do want an instant switch-over) some new flag to
>>>>
>>>> indicate
>>>>> that certain devices don't have stable speeds over time.
>>>>
>>>> Or just report the average of the range the wireless link can
>>>> hit,
>>>> and
>>>> be done with it.
>>>>
>>>> I think you guys are overcomplicating things.
>>>
>>> That range can be from 1 to > 800Mb/s.  No, it won't usually be all
>>> over that range, but it won't be uncommon to fluctuate by hundreds
>>> of
>>> Mb/s.  I'm not sure a simple average is really the answer
>>> here.  Even
>>> doing that would require new knobs to ethtool, since the rate
>>> depends
>>> heavily on card capabilities and also what AP you're connected to
>>> *at
>>> that moment*.  If you roam to another AP, then the max speed can
>>> certainly change.
>>>
>>> You'll probably say "aim for the 75% case" or something like that,
>>> which is fine, but then you're depending on your 75% case to be (a)
>>> single AP, (b) never move (eg, only bond wifi + ethernet), (c)
>>> little
>>> radio interference.  I'm not sure I'd buy that.  If I've put words
>>> in
>>> your mouth, forgive me.
>>
>> If you keep ethtool API simple and just return the last (rx-rate +
>> tx-rate) / 2, or the rate averaged
>> over the last 100 frames or 10 seconds, then the caller can do longer
>> term averaging
>> as it sees fit.  Probably no need for lots of averaging complexity in
>> the kernel.
>
> Yeah, that works too, but I was thinking it was better to present the
> actual data through ethtool so that things other than bonding could use
> it, and since bonding is the thing that actually cares about the
> fluctuation, make it do the more extensive processing.

What do you mean by 'actual data'?  If you want to know the most accurate
transmit/rx rate info, then you need to pay attention to each and every frame's tx/rx rate, as
well as its ampdu/amsdu, retries, etc.  It is virtually impossible.

So, you will have to settle for something less...  I suggest something simple
to calculate, similar to existing values that are available via debugfs and/or
'iw dev foo station dump', etc.  Let higher layers manipulate the raw data
as they see fit (they can query ethtool as often as they like).
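
As a rough sketch of what "simple to calculate" could look like, here is a
userspace ring-buffer average over the last 100 samples plus the rx/tx
midpoint. All names are hypothetical and nothing here is an existing ethtool
or mac80211 interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RATE_WINDOW 100		/* "last 100 frames" from the suggestion */

struct rate_avg {
	uint32_t samples[RATE_WINDOW];	/* per-frame rates, e.g. in Mb/s */
	size_t   next;			/* slot the next sample overwrites */
	size_t   count;			/* valid samples, at most RATE_WINDOW */
	uint64_t sum;			/* running sum of the valid samples */
};

/* Record one sample, evicting the oldest once the window is full. */
static void rate_avg_add(struct rate_avg *ra, uint32_t rate)
{
	assert(ra != NULL);

	if (ra->count == RATE_WINDOW)
		ra->sum -= ra->samples[ra->next];
	else
		ra->count++;
	ra->samples[ra->next] = rate;
	ra->sum += rate;
	ra->next = (ra->next + 1) % RATE_WINDOW;
}

/* Average over the current window; 0 if nothing has been recorded. */
static uint32_t rate_avg_get(const struct rate_avg *ra)
{
	return ra->count ? (uint32_t)(ra->sum / ra->count) : 0;
}

/* The simpler alternative: midpoint of the last rx and tx rates. */
static uint32_t rate_midpoint(uint32_t rx, uint32_t tx)
{
	return (uint32_t)(((uint64_t)rx + tx) / 2);
}
```

Either value is cheap to maintain per frame, and a caller such as bonding
could apply its own longer-term smoothing on top.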

Thanks,
Ben


-- 
Ben Greear <greearb-my8/4N5VtI7c+919tysfdA@public.gmane.org>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* [PATCH net v2 2/2] net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
From: Ding Tianhong @ 2017-08-17  3:25 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, keescook, linux-kernel, sparclinux,
	intel-wired-lan, alexander.duyck, netdev, linuxarm
  Cc: Ding Tianhong
In-Reply-To: <1502940316-13384-1-git-send-email-dingtianhong@huawei.com>

The ixgbe driver uses a compile-time check to determine whether it can
send TLPs to the Root Port with the Relaxed Ordering attribute set,
which is inconvenient. Now that the new flag
PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added to the kernel, we can
check bit 4 in the PCIe Device Control register to determine whether
the Relaxed Ordering attributes should be used, so use this new approach
in the ixgbe driver.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c  | 37 ++++++++++++-------------
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 32 +++++++++++----------
 2 files changed, 35 insertions(+), 34 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c
index 523f9d0..d1571e3 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c
@@ -175,31 +175,30 @@ static s32 ixgbe_init_phy_ops_82598(struct ixgbe_hw *hw)
  **/
 static s32 ixgbe_start_hw_82598(struct ixgbe_hw *hw)
 {
-#ifndef CONFIG_SPARC
-	u32 regval;
-	u32 i;
-#endif
+	u32 regval, i;
 	s32 ret_val;
+	struct ixgbe_adapter *adapter = hw->back;
 
 	ret_val = ixgbe_start_hw_generic(hw);
 
-#ifndef CONFIG_SPARC
-	/* Disable relaxed ordering */
-	for (i = 0; ((i < hw->mac.max_tx_queues) &&
-	     (i < IXGBE_DCA_MAX_QUEUES_82598)); i++) {
-		regval = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL(i));
-		regval &= ~IXGBE_DCA_TXCTRL_DESC_WRO_EN;
-		IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL(i), regval);
-	}
+	if (!pcie_relaxed_ordering_enabled(adapter->pdev)) {
+		/* Disable relaxed ordering */
+		for (i = 0; ((i < hw->mac.max_tx_queues) &&
+		     (i < IXGBE_DCA_MAX_QUEUES_82598)); i++) {
+			regval = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL(i));
+			regval &= ~IXGBE_DCA_TXCTRL_DESC_WRO_EN;
+			IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL(i), regval);
+		}
 
-	for (i = 0; ((i < hw->mac.max_rx_queues) &&
-	     (i < IXGBE_DCA_MAX_QUEUES_82598)); i++) {
-		regval = IXGBE_READ_REG(hw, IXGBE_DCA_RXCTRL(i));
-		regval &= ~(IXGBE_DCA_RXCTRL_DATA_WRO_EN |
-			    IXGBE_DCA_RXCTRL_HEAD_WRO_EN);
-		IXGBE_WRITE_REG(hw, IXGBE_DCA_RXCTRL(i), regval);
+		for (i = 0; ((i < hw->mac.max_rx_queues) &&
+		     (i < IXGBE_DCA_MAX_QUEUES_82598)); i++) {
+			regval = IXGBE_READ_REG(hw, IXGBE_DCA_RXCTRL(i));
+			regval &= ~(IXGBE_DCA_RXCTRL_DATA_WRO_EN |
+				    IXGBE_DCA_RXCTRL_HEAD_WRO_EN);
+			IXGBE_WRITE_REG(hw, IXGBE_DCA_RXCTRL(i), regval);
+		}
 	}
-#endif
+
 	if (ret_val)
 		return ret_val;
 
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
index d4933d2..d1052ee 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
@@ -342,6 +342,7 @@ s32 ixgbe_start_hw_generic(struct ixgbe_hw *hw)
 s32 ixgbe_start_hw_gen2(struct ixgbe_hw *hw)
 {
 	u32 i;
+	struct ixgbe_adapter *adapter = hw->back;
 
 	/* Clear the rate limiters */
 	for (i = 0; i < hw->mac.max_tx_queues; i++) {
@@ -350,25 +351,26 @@ s32 ixgbe_start_hw_gen2(struct ixgbe_hw *hw)
 	}
 	IXGBE_WRITE_FLUSH(hw);
 
-#ifndef CONFIG_SPARC
-	/* Disable relaxed ordering */
-	for (i = 0; i < hw->mac.max_tx_queues; i++) {
-		u32 regval;
+	if (!pcie_relaxed_ordering_enabled(adapter->pdev)) {
+		/* Disable relaxed ordering */
+		for (i = 0; i < hw->mac.max_tx_queues; i++) {
+			u32 regval;
 
-		regval = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL_82599(i));
-		regval &= ~IXGBE_DCA_TXCTRL_DESC_WRO_EN;
-		IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL_82599(i), regval);
-	}
+			regval = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL_82599(i));
+			regval &= ~IXGBE_DCA_TXCTRL_DESC_WRO_EN;
+			IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL_82599(i), regval);
+		}
 
-	for (i = 0; i < hw->mac.max_rx_queues; i++) {
-		u32 regval;
+		for (i = 0; i < hw->mac.max_rx_queues; i++) {
+			u32 regval;
 
-		regval = IXGBE_READ_REG(hw, IXGBE_DCA_RXCTRL(i));
-		regval &= ~(IXGBE_DCA_RXCTRL_DATA_WRO_EN |
-			    IXGBE_DCA_RXCTRL_HEAD_WRO_EN);
-		IXGBE_WRITE_REG(hw, IXGBE_DCA_RXCTRL(i), regval);
+			regval = IXGBE_READ_REG(hw, IXGBE_DCA_RXCTRL(i));
+			regval &= ~(IXGBE_DCA_RXCTRL_DATA_WRO_EN |
+				    IXGBE_DCA_RXCTRL_HEAD_WRO_EN);
+			IXGBE_WRITE_REG(hw, IXGBE_DCA_RXCTRL(i), regval);
+		}
 	}
-#endif
+
 	return 0;
 }
 
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net v2 1/2] Revert commit 1a8b6d76dc5b ("net:add one common config...")
From: Ding Tianhong @ 2017-08-17  3:25 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, keescook, linux-kernel, sparclinux,
	intel-wired-lan, alexander.duyck, netdev, linuxarm
  Cc: Ding Tianhong
In-Reply-To: <1502940316-13384-1-git-send-email-dingtianhong@huawei.com>

The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added
to indicate that Relaxed Ordering (RO) attributes should not
be used for Transaction Layer Packets (TLPs) targeted toward
the affected Root Ports. When the flag is set, bit 4 in the PCIe
Device Control register is cleared, so PCIe device drivers can
query PCIe configuration space to determine whether they can send
TLPs to the Root Port with the Relaxed Ordering attribute set.

With this new flag we no longer need the ARCH_WANT_RELAX_ORDER config
option to control the Relaxed Ordering attributes for the ixgbe driver,
as commit 1a8b6d76dc5b ("net:add one common config...") did,
so revert that commit.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 arch/Kconfig                                    | 3 ---
 arch/sparc/Kconfig                              | 1 -
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 2 +-
 3 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 21d0089..00cfc63 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -928,9 +928,6 @@ config STRICT_MODULE_RWX
 	  and non-text memory will be made non-executable. This provides
 	  protection against certain security exploits (e.g. writing to text)
 
-config ARCH_WANT_RELAX_ORDER
-	bool
-
 config REFCOUNT_FULL
 	bool "Perform full reference count validation at the expense of speed"
 	help
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index a4a6261..987a575 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -44,7 +44,6 @@ config SPARC
 	select ARCH_HAS_SG_CHAIN
 	select CPU_NO_EFFICIENT_FFS
 	select LOCKDEP_SMALL if LOCKDEP
-	select ARCH_WANT_RELAX_ORDER
 
 config SPARC32
 	def_bool !64BIT
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
index 4e35e70..d4933d2 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
@@ -350,7 +350,7 @@ s32 ixgbe_start_hw_gen2(struct ixgbe_hw *hw)
 	}
 	IXGBE_WRITE_FLUSH(hw);
 
-#ifndef CONFIG_ARCH_WANT_RELAX_ORDER
+#ifndef CONFIG_SPARC
 	/* Disable relaxed ordering */
 	for (i = 0; i < hw->mac.max_tx_queues; i++) {
 		u32 regval;
-- 
1.8.3.1



^ permalink raw reply related

* [PATCH net v2 0/2] net: ixgbe: Use new flag to disable Relaxed Ordering
From: Ding Tianhong @ 2017-08-17  3:25 UTC (permalink / raw)
  To: davem, jeffrey.t.kirsher, keescook, linux-kernel, sparclinux,
	intel-wired-lan, alexander.duyck, netdev, linuxarm
  Cc: Ding Tianhong

The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added
to indicate that Relaxed Ordering (RO) attributes should not
be used for Transaction Layer Packets (TLPs) targeted toward
the affected Root Ports. When the flag is set, bit 4 in the PCIe
Device Control register is cleared, so PCIe device drivers can
query PCIe configuration space to determine whether they can send
TLPs to the Root Port with the Relaxed Ordering attribute set.

The ixgbe driver can use this to determine whether it can
send TLPs to the Root Port with the Relaxed Ordering attribute set.

v2: Simplify the original patch according to Alex's suggestion:
    remove the new ixgbe flag2 and only check bit 4 in the
    PCIe Device Control register.
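
For readers who want to see the bit test in isolation, here is a small
userspace sketch of the decision the series relies on. The mask value matches
PCI_EXP_DEVCTL_RELAX_EN (0x0010) from the kernel's pci_regs.h; the helper
names relaxed_ordering_enabled() and queue_ctrl_for() are hypothetical and
only mimic the shape of the driver change, they are not the kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 4 of the PCIe Device Control register: Enable Relaxed Ordering. */
#define DEVCTL_RELAX_EN 0x0010u

/* Hypothetical helper: test bit 4 in a raw Device Control value. */
static int relaxed_ordering_enabled(uint16_t devctl)
{
	return (devctl & DEVCTL_RELAX_EN) != 0;
}

/*
 * Hypothetical helper shaped like the per-queue loops in the patch:
 * when relaxed ordering is not enabled, clear the write-relaxed-ordering
 * bits (wro_mask) in the queue control value; otherwise leave it alone.
 */
static uint32_t queue_ctrl_for(uint16_t devctl, uint32_t regval,
			       uint32_t wro_mask)
{
	assert(wro_mask != 0);

	if (!relaxed_ordering_enabled(devctl))
		regval &= ~wro_mask;
	return regval;
}
```

The register and mask values passed in are up to the caller; the sketch only
shows that a single runtime bit replaces the old compile-time #ifndef.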

Ding Tianhong (2):
  Revert commit 1a8b6d76dc5b ("net:add one common config...")
  net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag

 arch/Kconfig                                    |  3 --
 arch/sparc/Kconfig                              |  1 -
 drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c  | 37 ++++++++++++-------------
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 32 +++++++++++----------
 4 files changed, 35 insertions(+), 38 deletions(-)

-- 
1.8.3.1

^ permalink raw reply

* Re: Regression: Bug 196547 - Since 4.12 - bonding module not working with wireless drivers
From: Dan Williams @ 2017-08-17  3:18 UTC (permalink / raw)
  To: Ben Greear, David Miller
  Cc: james-fvV4AYHggTTR7s880joybQ, futur.andy-gM/Ye1E23mwN+BqQ9rBEUg,
	kvalo-sgV2jX0FEOL9JmXXK+q4OQ,
	arend.vanspriel-dY08KVG/lbpWk0Htik3J/w,
	maheshb-hpIqsD4AKlfQT0dZR+AlfA, andy-QlMahl40kYEqcZcGjlUOXw,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <2abaf7ec-947a-6a30-e6d3-4f75605dd50d-my8/4N5VtI7c+919tysfdA@public.gmane.org>

On Wed, 2017-08-16 at 19:36 -0700, Ben Greear wrote:
> On 08/16/2017 07:11 PM, Dan Williams wrote:
> > On Wed, 2017-08-16 at 14:31 -0700, David Miller wrote:
> > > From: Dan Williams <dcbw-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > > Date: Wed, 16 Aug 2017 16:22:41 -0500
> > > 
> > > > My biggest suggestion is that perhaps bonding should grow hysteresis
> > > > for link speeds. Since WiFi can change speed every packet, you probably
> > > > don't want the bond characteristics changing every couple seconds just
> > > > in case your WiFi link is jumping around.  Ethernet won't bounce around
> > > > that much, so the hysteresis would have no effect there.  Or, if people
> > > > are concerned about response time to speed changes on ethernet (where
> > > > you probably do want an instant switch-over) some new flag to indicate
> > > > that certain devices don't have stable speeds over time.
> > > 
> > > Or just report the average of the range the wireless link can hit,
> > > and be done with it.
> > > 
> > > I think you guys are overcomplicating things.
> > 
> > That range can be from 1 to > 800Mb/s.  No, it won't usually be all
> > over that range, but it won't be uncommon to fluctuate by hundreds of
> > Mb/s.  I'm not sure a simple average is really the answer here.  Even
> > doing that would require new knobs to ethtool, since the rate depends
> > heavily on card capabilities and also what AP you're connected to *at
> > that moment*.  If you roam to another AP, then the max speed can
> > certainly change.
> > 
> > You'll probably say "aim for the 75% case" or something like that,
> > which is fine, but then you're depending on your 75% case to be (a)
> > single AP, (b) never move (eg, only bond wifi + ethernet), (c) little
> > radio interference.  I'm not sure I'd buy that.  If I've put words in
> > your mouth, forgive me.
> 
> If you keep ethtool API simple and just return the last (rx-rate +
> tx-rate) / 2, or the rate averaged over the last 100 frames or 10
> seconds, then the caller can do longer term averaging as it sees fit.
> Probably no need for lots of averaging complexity in the kernel.

Yeah, that works too, but I was thinking it was better to present the
actual data through ethtool so that things other than bonding could use
it, and since bonding is the thing that actually cares about the
fluctuation, make it do the more extensive processing.
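
To make "more extensive processing" concrete, this is roughly the kind of
hysteresis the consumer could apply on top of raw samples. It is a standalone
sketch under my own assumptions: the struct, the percentage threshold and the
consecutive-sample rule are all invented here and do not correspond to any
existing bonding code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hysteresis state kept by the consumer of the raw rate. */
struct speed_hyst {
	unsigned int reported_mbps;	/* value the bond currently acts on */
	unsigned int threshold_pct;	/* deviation that counts as "different" */
	unsigned int hold_samples;	/* consecutive deviating samples required */
	unsigned int streak;		/* deviating samples seen in a row */
};

/*
 * Feed one raw sample. The reported speed only changes after hold_samples
 * consecutive samples deviate from it by more than threshold_pct percent;
 * any sample inside the band resets the streak.
 */
static unsigned int speed_hyst_update(struct speed_hyst *h,
				      unsigned int sample_mbps)
{
	unsigned long long limit;
	unsigned int diff;

	assert(h != NULL);

	diff = sample_mbps > h->reported_mbps ?
	       sample_mbps - h->reported_mbps :
	       h->reported_mbps - sample_mbps;
	limit = (unsigned long long)h->reported_mbps * h->threshold_pct / 100;

	if (diff <= limit) {
		h->streak = 0;
	} else if (++h->streak >= h->hold_samples) {
		h->reported_mbps = sample_mbps;
		h->streak = 0;
	}
	return h->reported_mbps;
}
```

With a wired link the samples never leave the band, so the filter has no
visible effect there, which is the property I was after.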

Dan


> rate-ctrl for wifi basically doesn't happen until you transmit or
> receive a fairly steady stream, so it will fluctuate a lot.

^ permalink raw reply

* Re: [PATCH net-next 3/3 v5] drivers: net: ethernet: qualcomm: rmnet: Initial implementation
From: Dan Williams @ 2017-08-17  3:16 UTC (permalink / raw)
  To: Subash Abhinov Kasiviswanathan, netdev, davem, fengguang.wu, jiri,
	stephen, David.Laight, marcel
In-Reply-To: <1502931307-517-4-git-send-email-subashab@codeaurora.org>

On Wed, 2017-08-16 at 18:55 -0600, Subash Abhinov Kasiviswanathan
wrote:
> RmNet driver provides a transport agnostic MAP (multiplexing and
> aggregation protocol) support in embedded module. Module provides
> virtual network devices which can be attached to any IP-mode
> physical device. This will be used to provide all MAP functionality
> on future hardware in a single consistent location.

Some quick review comments, more to come...

> Signed-off-by: Subash Abhinov Kasiviswanathan
> <subashab@codeaurora.org>
> ---
>  Documentation/networking/rmnet.txt                 |  82 ++++
>  drivers/net/ethernet/qualcomm/Kconfig              |   2 +
>  drivers/net/ethernet/qualcomm/Makefile             |   2 +
>  drivers/net/ethernet/qualcomm/rmnet/Kconfig        |  12 +
>  drivers/net/ethernet/qualcomm/rmnet/Makefile       |  14 +
>  drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 467
> +++++++++++++++++++++
>  drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h |  58 +++
>  .../net/ethernet/qualcomm/rmnet/rmnet_handlers.c   | 297
> +++++++++++++
>  .../net/ethernet/qualcomm/rmnet/rmnet_handlers.h   |  26 ++
>  drivers/net/ethernet/qualcomm/rmnet/rmnet_main.c   |  37 ++
>  drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h    |  88 ++++
>  .../ethernet/qualcomm/rmnet/rmnet_map_command.c    | 122 ++++++
>  .../net/ethernet/qualcomm/rmnet/rmnet_map_data.c   | 105 +++++
>  .../net/ethernet/qualcomm/rmnet/rmnet_private.h    |  47 +++
>  drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c    | 267
> ++++++++++++
>  drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h    |  32 ++
>  16 files changed, 1658 insertions(+)
>  create mode 100644 Documentation/networking/rmnet.txt
>  create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Kconfig
>  create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Makefile
>  create mode 100644
> drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
>  create mode 100644
> drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
>  create mode 100644
> drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
>  create mode 100644
> drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
>  create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_main.c
>  create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
>  create mode 100644
> drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
>  create mode 100644
> drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
>  create mode 100644
> drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
>  create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
>  create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h
> 
> diff --git a/Documentation/networking/rmnet.txt
> b/Documentation/networking/rmnet.txt
> new file mode 100644
> index 0000000..6b341ea
> --- /dev/null
> +++ b/Documentation/networking/rmnet.txt
> @@ -0,0 +1,82 @@
> +1. Introduction
> +
> +rmnet driver is used for supporting the Multiplexing and aggregation
> +Protocol (MAP). This protocol is used by all recent chipsets using
> Qualcomm
> +Technologies, Inc. modems.
> +
> +This driver can be used to register onto any physical network device
> in
> +IP mode. Physical transports include USB, HSIC, PCIe and IP
> accelerator.

I'm probably forgetting a bit of the original context.  Say you have
one of these "network devices in IP mode".  What happens with the
device *before* you attach an rmnet child?  Can it pass traffic before
that point at all, or does the modem expect MAP already?

> +Multiplexing allows for creation of logical netdevices (rmnet
> devices) to
> +handle multiple private data networks (PDN) like a default internet,
> tethering,
> +multimedia messaging service (MMS) or IP media subsystem (IMS).
> Hardware sends
> +packets with MAP headers to rmnet. Based on the multiplexer id,
> rmnet
> +routes to the appropriate PDN after removing the MAP header.
> +
> +Aggregation is required to achieve high data rates. This involves hardware
> +sending aggregated bunch of MAP frames. rmnet driver will de-aggregate
> +these MAP frames and send them to appropriate PDN's.
> +
> +2. Packet format
> +
> +a. MAP packet (data / control)
> +
> +MAP header has the same endianness of the IP packet.
> +
> +Packet format -
> +
> +Bit             0             1           2-7      8 - 15           16 - 31
> +Function   Command / Data   Reserved     Pad   Multiplexer ID    Payload length
> +Bit            32 - x
> +Function     Raw  Bytes
> +
> +Command (1)/ Data (0) bit value is to indicate if the packet is a
> MAP command
> +or data packet. Control packet is used for transport level flow
> control. Data
> +packets are standard IP packets.
> +
> +Reserved bits are usually zeroed out and to be ignored by receiver.
> +
> +Padding is number of bytes to be added for 4 byte alignment if
> required by
> +hardware.
> +
> +Multiplexer ID is to indicate the PDN on which data has to be sent.
> +
> +Payload length includes the padding length but does not include MAP
> header
> +length.
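
As a way of checking my reading of the layout, here is how I would decode the
4-byte header in plain C. Two things are assumptions on my part, since the
text does not spell them out: that "bit 0" is the most significant bit of the
first byte, and that the payload length is in network byte order (following
"same endianness of the IP packet"). The names map_hdr and map_hdr_parse are
invented for this sketch and are not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_HDR_LEN 4u

/* Decoded view of the MAP header described in the document. */
struct map_hdr {
	uint8_t  is_command;	/* bit 0: 1 = command, 0 = data */
	uint8_t  pad_len;	/* bits 2-7: padding bytes */
	uint8_t  mux_id;	/* bits 8-15: multiplexer ID */
	uint16_t payload_len;	/* bits 16-31: includes padding, not header */
};

/*
 * Parse a MAP header from buf. Assumes bit 0 is the MSB of buf[0] and a
 * big-endian payload length. Returns 0 on success, -1 if buf is too short.
 */
static int map_hdr_parse(const uint8_t *buf, size_t len, struct map_hdr *out)
{
	assert(buf != NULL && out != NULL);

	if (len < MAP_HDR_LEN)
		return -1;

	out->is_command = (buf[0] >> 7) & 0x1;
	out->pad_len = buf[0] & 0x3f;	/* bit 1 (reserved) is ignored */
	out->mux_id = buf[1];
	out->payload_len = (uint16_t)((buf[2] << 8) | buf[3]);
	return 0;
}
```

If the bit numbering is actually LSB-first, the two shifts on buf[0] change
but the rest stays the same, so it would be good to state that explicitly in
the document.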
> +
> +b. MAP packet (command specific)
> +
> +Bit             0             1           2-7      8 - 15           16 - 31
> +Function   Command         Reserved     Pad   Multiplexer ID    Payload length
> +Bit          32 - 39        40 - 45    46 - 47       48 - 63
> +Function   Command name    Reserved   Command Type   Reserved
> +Bit          64 - 95
> +Function   Transaction ID
> +Bit          96 - 127
> +Function   Command data
> +
> +Command 1 indicates disabling flow while 2 is enabling flow
> +
> +Command types -
> +0 for MAP command request
> +1 is to acknowledge the receipt of a command
> +2 is for unsupported commands
> +3 is for error during processing of commands
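
The command layout can be sketched the same way. Offsets are counted from the
start of the MAP packet (so the 4-byte MAP header is included), and I am again
assuming MSB-first bit numbering and big-endian multi-byte fields, which the
text does not state. The enum and function names are hypothetical; only the
numeric values (1/2 for flow disable/enable, 0-3 for the command types) come
from the document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_CMD_PKT_LEN 16u	/* bits 0-127 */

enum map_cmd_name { MAP_CMD_FLOW_DISABLE = 1, MAP_CMD_FLOW_ENABLE = 2 };

enum map_cmd_type {
	MAP_CMD_TYPE_REQUEST = 0,
	MAP_CMD_TYPE_ACK = 1,
	MAP_CMD_TYPE_UNSUPPORTED = 2,
	MAP_CMD_TYPE_ERROR = 3,
};

struct map_cmd {
	uint8_t  name;			/* bits 32-39 */
	uint8_t  type;			/* bits 46-47 */
	uint32_t transaction_id;	/* bits 64-95 */
	uint32_t data;			/* bits 96-127 */
};

/* Read a 32-bit big-endian value (byte order is an assumption). */
static uint32_t be32_at(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if too short or the command bit is not set. */
static int map_cmd_parse(const uint8_t *pkt, size_t len, struct map_cmd *out)
{
	assert(pkt != NULL && out != NULL);

	if (len < MAP_CMD_PKT_LEN || !(pkt[0] & 0x80))
		return -1;

	out->name = pkt[4];
	out->type = pkt[5] & 0x3;	/* low two bits of byte 5 */
	out->transaction_id = be32_at(pkt + 8);
	out->data = be32_at(pkt + 12);
	return 0;
}

/* Rewrite only the command type in place, e.g. to turn a request into an ACK. */
static void map_cmd_set_type(uint8_t *pkt, enum map_cmd_type type)
{
	pkt[5] = (uint8_t)((pkt[5] & ~0x3) | (type & 0x3));
}
```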
> +
> +c. Aggregation
> +
> +Aggregation is multiple MAP packets (can be data or command)
> delivered to
> +rmnet in a single linear skb. rmnet will process the individual
> +packets and either ACK the MAP command or deliver the IP packet to
> the
> +network stack as needed
> +
> +MAP header|IP Packet|Optional padding|MAP header|IP Packet|Optional padding....
> +MAP header|IP Packet|Optional padding|MAP header|Command Packet|Optional pad...
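
Walking such a buffer then comes down to reading each header's payload length
and stepping over it. The sketch below only records where each MAP frame
starts; it assumes a big-endian length field at bytes 2-3 of the header and
relies on the statement above that the payload length already includes the
padding. map_deaggregate is a made-up name, not the function in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_HDR_LEN 4u

/*
 * Record the start offset of every MAP frame in an aggregated buffer.
 * Returns the number of frames found, or -1 if a frame is truncated or
 * more than max_frames frames are present.
 */
static int map_deaggregate(const uint8_t *buf, size_t len,
			   size_t *offs, size_t max_frames)
{
	size_t off = 0;
	size_t n = 0;

	assert(buf != NULL && offs != NULL);

	while (off < len) {
		size_t payload_len;

		if (len - off < MAP_HDR_LEN)
			return -1;	/* partial header */

		payload_len = ((size_t)buf[off + 2] << 8) | buf[off + 3];
		if (payload_len > len - off - MAP_HDR_LEN)
			return -1;	/* payload runs past the buffer */

		if (n >= max_frames)
			return -1;

		offs[n++] = off;
		off += MAP_HDR_LEN + payload_len;	/* padding is inside payload_len */
	}
	return (int)n;
}
```

The bounds checks matter because the lengths come straight from the modem; a
malformed length must not let the walk run off the end of the skb data.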
> +
> +3. Userspace configuration
> +
> +rmnet userspace configuration is done through netlink library
> librmnetctl
> +and command line utility rmnetcli. Utility is hosted in codeaurora
> forum git.
> +The driver uses rtnl_link_ops for communication.
> +
> +https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/dataservices/tree/rmnetctl
> diff --git a/drivers/net/ethernet/qualcomm/Kconfig
> b/drivers/net/ethernet/qualcomm/Kconfig
> index 877675a..f520071 100644
> --- a/drivers/net/ethernet/qualcomm/Kconfig
> +++ b/drivers/net/ethernet/qualcomm/Kconfig
> @@ -59,4 +59,6 @@ config QCOM_EMAC
>  	  low power, Receive-Side Scaling (RSS), and IEEE 1588-2008
>  	  Precision Clock Synchronization Protocol.
>  
> +source "drivers/net/ethernet/qualcomm/rmnet/Kconfig"
> +
>  endif # NET_VENDOR_QUALCOMM
> diff --git a/drivers/net/ethernet/qualcomm/Makefile
> b/drivers/net/ethernet/qualcomm/Makefile
> index 92fa7c4..c4f38bd 100644
> --- a/drivers/net/ethernet/qualcomm/Makefile
> +++ b/drivers/net/ethernet/qualcomm/Makefile
> @@ -9,3 +9,5 @@ obj-$(CONFIG_QCA7000_UART) += qcauart.o
>  qcauart-objs := qca_uart.o
>  
>  obj-y += emac/
> +
> +obj-$(CONFIG_RMNET) += rmnet/
> \ No newline at end of file
> diff --git a/drivers/net/ethernet/qualcomm/rmnet/Kconfig
> b/drivers/net/ethernet/qualcomm/rmnet/Kconfig
> new file mode 100644
> index 0000000..4948f14
> --- /dev/null
> +++ b/drivers/net/ethernet/qualcomm/rmnet/Kconfig
> @@ -0,0 +1,12 @@
> +#
> +# RMNET MAP driver
> +#
> +
> +menuconfig RMNET
> +	depends on NETDEVICES
> +	bool "RmNet MAP driver"
> +	default n
> +	---help---
> +	  If you say Y here, then the rmnet module will be
> statically
> +	  compiled into the kernel. The rmnet module provides MAP
> +	  functionality for embedded and bridged traffic.
> diff --git a/drivers/net/ethernet/qualcomm/rmnet/Makefile
> b/drivers/net/ethernet/qualcomm/rmnet/Makefile
> new file mode 100644
> index 0000000..2b6c9cf
> --- /dev/null
> +++ b/drivers/net/ethernet/qualcomm/rmnet/Makefile
> @@ -0,0 +1,14 @@
> +#
> +# Makefile for the RMNET module
> +#
> +
> +rmnet-y		 := rmnet_main.o
> +rmnet-y		 += rmnet_config.o
> +rmnet-y		 += rmnet_vnd.o
> +rmnet-y		 += rmnet_handlers.o
> +rmnet-y		 += rmnet_map_data.o
> +rmnet-y		 += rmnet_map_command.o
> +rmnet-y		 += rmnet_stats.o
> +obj-$(CONFIG_RMNET) += rmnet.o
> +
> +CFLAGS_rmnet_main.o := -I$(src)
> diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
> b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
> new file mode 100644
> index 0000000..3a6027c
> --- /dev/null
> +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
> @@ -0,0 +1,467 @@
> +/* Copyright (c) 2013-2017, The Linux Foundation. All rights
> reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License version 2
> and
> + * only version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * RMNET configuration engine
> + *
> + */
> +
> +#include <net/sock.h>
> +#include <linux/netlink.h>
> +#include <linux/netdevice.h>
> +#include "rmnet_config.h"
> +#include "rmnet_handlers.h"
> +#include "rmnet_vnd.h"
> +#include "rmnet_private.h"
> +
> +/* Local Definitions and Declarations */
> +#define RMNET_LOCAL_LOGICAL_ENDPOINT -1
> +
> +struct rmnet_free_vnd_work {
> +	struct work_struct work;
> +	int vnd_id[RMNET_MAX_VND];
> +	int count;
> +	struct net_device *real_dev;
> +};
> +
> +static inline int
> +rmnet_is_real_dev_registered(const struct net_device *real_dev)
> +{
> +	rx_handler_func_t *rx_handler;
> +
> +	rx_handler = rcu_dereference(real_dev->rx_handler);
> +	return (rx_handler == rmnet_rx_handler);
> +}
> +
> +static inline struct rmnet_real_dev_info*
> +__rmnet_get_real_dev_info(const struct net_device *real_dev)
> +{
> +	if (rmnet_is_real_dev_registered(real_dev))
> +		return (struct rmnet_real_dev_info *)
> +			rcu_dereference(real_dev->rx_handler_data);
> +	else
> +		return 0;
> +}
> +
> +static struct rmnet_endpoint*
> +rmnet_get_endpoint(struct net_device *dev, int config_id)
> +{
> +	struct rmnet_real_dev_info *rdinfo;
> +	struct rmnet_endpoint *ep;
> +
> +	if (!rmnet_is_real_dev_registered(dev)) {
> +		ep = rmnet_vnd_get_endpoint(dev);
> +	} else {
> +		rdinfo = __rmnet_get_real_dev_info(dev);
> +
> +		if (!rdinfo)
> +			return NULL;
> +
> +		if (config_id == RMNET_LOCAL_LOGICAL_ENDPOINT)
> +			ep = &rdinfo->local_ep;
> +		else
> +			ep = &rdinfo->muxed_ep[config_id];
> +	}
> +
> +	return ep;
> +}
> +
> +static int rmnet_unregister_real_device(struct net_device *dev)
> +{
> +	struct rmnet_real_dev_info *rdinfo;
> +	struct rmnet_endpoint *ep;
> +	int config_id;
> +
> +	ASSERT_RTNL();
> +
> +	netdev_info(dev, "Removing device %s\n", dev->name);
> +
> +	if (!rmnet_is_real_dev_registered(dev))
> +		return -EINVAL;
> +
> +	config_id = RMNET_LOCAL_LOGICAL_ENDPOINT;
> +	for (; config_id < RMNET_MAX_LOGICAL_EP; config_id++) {
> +		ep = rmnet_get_endpoint(dev, config_id);
> +		if (ep && ep->refcount)
> +			return -EINVAL;
> +	}
> +
> +	rdinfo = __rmnet_get_real_dev_info(dev);
> +	kfree(rdinfo);
> +
> +	netdev_rx_handler_unregister(dev);
> +
> +	dev_put(dev);
> +	return 0;
> +}
> +
> +static int rmnet_set_ingress_data_format(struct net_device *dev, u32
> idf)
> +{
> +	struct rmnet_real_dev_info *rdinfo;
> +
> +	ASSERT_RTNL();
> +
> +	netdev_info(dev, "Ingress format 0x%08X\n", idf);
> +
> +	rdinfo = __rmnet_get_real_dev_info(dev);
> +	if (!rdinfo)
> +		return -EINVAL;
> +
> +	rdinfo->ingress_data_format = idf;
> +
> +	return 0;
> +}
> +
> +static int rmnet_set_egress_data_format(struct net_device *dev, u32
> edf,
> +					u16 agg_size, u16 agg_count)
> +{
> +	struct rmnet_real_dev_info *rdinfo;
> +
> +	ASSERT_RTNL();
> +
> +	netdev_info(dev, "Egress format 0x%08X agg size %d cnt
> %d\n",
> +		    edf, agg_size, agg_count);
> +
> +	rdinfo = __rmnet_get_real_dev_info(dev);
> +	if (!rdinfo)
> +		return -EINVAL;
> +
> +	rdinfo->egress_data_format = edf;
> +
> +	return 0;
> +}
> +
> +static int rmnet_register_real_device(struct net_device *real_dev)
> +{
> +	struct rmnet_real_dev_info *rdinfo;
> +	int rc;
> +
> +	ASSERT_RTNL();
> +
> +	if (rmnet_is_real_dev_registered(real_dev)) {
> +		netdev_info(real_dev, "cannot register with this
> dev\n");
> +		return -EINVAL;
> +	}
> +
> +	rdinfo = kzalloc(sizeof(*rdinfo), GFP_ATOMIC);
> +	if (!rdinfo)
> +		return -ENOMEM;
> +
> +	rdinfo->dev = real_dev;
> +	rc = netdev_rx_handler_register(real_dev, rmnet_rx_handler,
> rdinfo);
> +
> +	if (rc) {
> +		kfree(rdinfo);
> +		return -EBUSY;
> +	}
> +
> +	dev_hold(real_dev);

I could be entirely wrong, but isn't this mostly handled for you if you
use netdev_upper_dev_link() like ipvlan and macvlan do?  That looks
like it provides a couple things: (a) handles the dev_hold() for you
and (b) provides mechanisms to get the upper device if you need it. 
I'm not actually familiar with the "adjacent device" functionality, but
it looked similar in purpose.

> +	return 0;
> +}
> +
> +static int __rmnet_set_endpoint_config(struct net_device *dev, int
> config_id,
> +				       struct rmnet_endpoint *ep)
> +{
> +	struct rmnet_endpoint *dev_ep;
> +
> +	ASSERT_RTNL();
> +
> +	dev_ep = rmnet_get_endpoint(dev, config_id);
> +
> +	if (!dev_ep || dev_ep->refcount)
> +		return -EINVAL;
> +
> +	memcpy(dev_ep, ep, sizeof(struct rmnet_endpoint));
> +	if (config_id == RMNET_LOCAL_LOGICAL_ENDPOINT)
> +		dev_ep->mux_id = 0;
> +	else
> +		dev_ep->mux_id = config_id;

So... if config_id is LOGICAL_ENDPOINT (-1) this sets mux_id to 0.

But if config_id is 0, it *also* sets mux_id to 0?

Can you clarify what mux_id = 0 actually means here?  Can you also talk
a bit about the difference between local_ep and muxed_ep?
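
To make the overlap concrete, the assignment quoted above reduces to the
following when pulled out of the driver. This is a standalone sketch:
mux_id_for() and mux_ids_collide() are names I made up, and the constant
simply mirrors RMNET_LOCAL_LOGICAL_ENDPOINT from the patch.

```c
#include <assert.h>

/* Mirrors RMNET_LOCAL_LOGICAL_ENDPOINT (-1) from the patch. */
#define LOCAL_LOGICAL_ENDPOINT (-1)

/* Same mapping as the if/else in __rmnet_set_endpoint_config(). */
static int mux_id_for(int config_id)
{
	return config_id == LOCAL_LOGICAL_ENDPOINT ? 0 : config_id;
}

/* Two distinct config_ids collide if they end up with the same mux_id. */
static int mux_ids_collide(int config_id_a, int config_id_b)
{
	assert(config_id_a >= LOCAL_LOGICAL_ENDPOINT &&
	       config_id_b >= LOCAL_LOGICAL_ENDPOINT);

	return config_id_a != config_id_b &&
	       mux_id_for(config_id_a) == mux_id_for(config_id_b);
}
```

rmnet_rtnl_validate() does reject a mux_id of 0 coming from netlink, so
config_id 0 may not be reachable through newlink, but nothing in this
function itself prevents the two endpoints from sharing mux_id 0.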

> +	dev_hold(dev_ep->egress_dev);
> +	return 0;
> +}
> +
> +static int __rmnet_unset_endpoint_config(struct net_device *dev, int
> config_id)
> +{
> +	struct rmnet_endpoint *ep = 0;
> +
> +	ASSERT_RTNL();
> +
> +	ep = rmnet_get_endpoint(dev, config_id);
> +
> +	if (!ep || !ep->refcount)
> +		return -EINVAL;
> +
> +	dev_put(ep->egress_dev);
> +	memset(ep, 0, sizeof(struct rmnet_endpoint));
> +
> +	return 0;
> +}
> +
> +static int rmnet_set_endpoint_config(struct net_device *dev,
> +				     int config_id, u8 rmnet_mode,
> +				     struct net_device *egress_dev)
> +{
> +	struct rmnet_endpoint ep;
> +
> +	netdev_info(dev, "id %d mode %d dev %s\n",
> +		    config_id, rmnet_mode, egress_dev->name);
> +
> +	if (config_id < RMNET_LOCAL_LOGICAL_ENDPOINT ||
> +	    config_id >= RMNET_MAX_LOGICAL_EP)
> +		return -EINVAL;
> +
> +	memset(&ep, 0, sizeof(struct rmnet_endpoint));
> +	ep.refcount = 1;
> +	ep.rmnet_mode = rmnet_mode;
> +	ep.egress_dev = egress_dev;
> +
> +	return __rmnet_set_endpoint_config(dev, config_id, &ep);
> +}
> +
> +static int rmnet_unset_endpoint_config(struct net_device *dev, int
> config_id)
> +{
> +	netdev_info(dev, "id %d\n", config_id);
> +
> +	if (config_id < RMNET_LOCAL_LOGICAL_ENDPOINT ||
> +	    config_id >= RMNET_MAX_LOGICAL_EP)
> +		return -EINVAL;
> +
> +	return __rmnet_unset_endpoint_config(dev, config_id);
> +}
> +
> +static int rmnet_free_vnd(struct net_device *real_dev, int
> rmnet_dev_id)
> +{
> +	return rmnet_vnd_free_dev(real_dev, rmnet_dev_id);
> +}
> +
> +static void rmnet_free_vnd_later(struct work_struct *work)
> +{
> +	struct rmnet_free_vnd_work *fwork;
> +	int i;
> +
> +	fwork = container_of(work, struct rmnet_free_vnd_work,
> work);
> +
> +	for (i = 0; i < fwork->count; i++)
> +		rmnet_free_vnd(fwork->real_dev, fwork->vnd_id[i]);
> +	kfree(fwork);
> +}
> +
> +static void rmnet_force_unassociate_device(struct net_device *dev)
> +{
> +	struct rmnet_free_vnd_work *vnd_work;
> +	struct rmnet_real_dev_info *rdinfo;
> +	struct net_device *rmnet_dev;
> +	struct rmnet_endpoint *ep;
> +	int i, j;
> +
> +	ASSERT_RTNL();
> +
> +	if (!rmnet_is_real_dev_registered(dev)) {
> +		netdev_info(dev, "Unassociated device, skipping\n");
> +		return;
> +	}
> +
> +	vnd_work = kzalloc(sizeof(*vnd_work), GFP_KERNEL);
> +	if (!vnd_work)
> +		return;
> +
> +	INIT_WORK(&vnd_work->work, rmnet_free_vnd_later);
> +	vnd_work->real_dev = dev;
> +
> +	/* Check the VNDs for offending mappings */
> +	for (i = 0, j = 0; i < RMNET_MAX_VND && j < RMNET_MAX_VND;
> i++) {
> +		rmnet_dev = rmnet_vnd_get_by_id(dev, i);
> +		if (!rmnet_dev)
> +			continue;
> +
> +		ep = rmnet_vnd_get_endpoint(rmnet_dev);
> +		if (!ep)
> +			continue;
> +
> +		if (ep->refcount && (ep->egress_dev == dev)) {
> +			/* Make sure the device is down before
> clearing any of
> +			 * the mappings. Otherwise we could see a
> potential
> +			 * race condition if packets are actively
> being
> +			 * transmitted.
> +			 */
> +			dev_close(rmnet_dev);
> +			rmnet_unset_endpoint_config(rmnet_dev,
> +						RMNET_LOCAL_LOGICAL_ENDPOINT);
> +			vnd_work->vnd_id[j] = i;
> +			j++;
> +		}
> +	}
> +	if (j > 0) {
> +		vnd_work->count = j;
> +		schedule_work(&vnd_work->work);
> +	} else {
> +		kfree(vnd_work);
> +	}
> +
> +	rdinfo = __rmnet_get_real_dev_info(dev);
> +
> +	if (rdinfo) {
> +		ep = &rdinfo->local_ep;
> +
> +		if (ep && ep->refcount)
> +			rmnet_unset_endpoint_config
> +			(ep->egress_dev,
> RMNET_LOCAL_LOGICAL_ENDPOINT);
> +	}
> +
> +	/* Clear the mappings on the phys ep */
> +	rmnet_unset_endpoint_config(dev,
> RMNET_LOCAL_LOGICAL_ENDPOINT);
> +	for (i = 0; i < RMNET_MAX_LOGICAL_EP; i++)
> +		rmnet_unset_endpoint_config(dev, i);
> +	rmnet_unregister_real_device(dev);
> +}
> +
> +static int rmnet_config_notify_cb(struct notifier_block *nb,
> +				  unsigned long event, void *data)
> +{
> +	struct net_device *dev = netdev_notifier_info_to_dev(data);
> +
> +	if (!dev)
> +		return NOTIFY_DONE;
> +
> +	switch (event) {
> +	case NETDEV_UNREGISTER_FINAL:
> +	case NETDEV_UNREGISTER:
> +		netdev_info(dev, "Kernel unregister\n");
> +		rmnet_force_unassociate_device(dev);
> +		break;
> +
> +	default:
> +		break;
> +	}
> +
> +	return NOTIFY_DONE;
> +}
> +
> +static struct notifier_block rmnet_dev_notifier __read_mostly = {
> +	.notifier_call = rmnet_config_notify_cb,
> +};
> +
> +static int rmnet_newlink(struct net *src_net, struct net_device
> *dev,
> +			 struct nlattr *tb[], struct nlattr *data[],
> +			 struct netlink_ext_ack *extack)
> +{
> +	int ingress_format = RMNET_INGRESS_FORMAT_DEMUXING |
> +			     RMNET_INGRESS_FORMAT_DEAGGREGATION |
> +			     RMNET_INGRESS_FORMAT_MAP;

I don't see anywhere that RMNET_INGRESS_FIX_ETHERNET can get set? 
Also, can't that be autodetected?

> +	int egress_format = RMNET_EGRESS_FORMAT_MUXING |
> +			    RMNET_EGRESS_FORMAT_MAP;
> +	struct net_device *real_dev;
> +	int mode = RMNET_EPMODE_VND;
> +	u16 mux_id;
> +
> +	real_dev = __dev_get_by_index(src_net,
> nla_get_u32(tb[IFLA_LINK]));
> +	if (!real_dev || !dev)
> +		return -ENODEV;
> +
> +	if (!data[IFLA_VLAN_ID])

Also, I wasn't thinking to actually *use* IFLA_VLAN_ID, but I'll let
others weigh in.  It does fit in this case, I think, but maybe creating
an RMNET-specific attribute would be better?

> +		return -EINVAL;
> +
> +	mux_id = nla_get_u16(data[IFLA_VLAN_ID]);
> +
> +	rmnet_register_real_device(real_dev);
> +
> +	if (rmnet_vnd_newlink(real_dev, mux_id, dev))
> +		return -EINVAL;
> +
> +	rmnet_set_egress_data_format(real_dev, egress_format, 0, 0);
> +	rmnet_set_ingress_data_format(real_dev, ingress_format);
> +	rmnet_set_endpoint_config(real_dev, mux_id, mode, dev);
> +	rmnet_set_endpoint_config(dev, mux_id, mode, real_dev);
> +	return 0;
> +}
> +
> +static void rmnet_delink(struct net_device *dev, struct list_head *head)
> +{
> +	struct net_device *real_dev;
> +	struct rmnet_endpoint *ep;
> +	int mux_id;
> +
> +	ep = rmnet_vnd_get_endpoint(dev);
> +	real_dev = ep->egress_dev;
> +	if (ep && ep->refcount) {
> +		mux_id = rmnet_vnd_is_vnd(real_dev, dev);
> +
> +		/* rmnet_vnd_is_vnd() gives mux_id + 1,
> +		 * so subtract 1 to get the correct mux_id
> +		 */
> +		mux_id--;
> +		__rmnet_unset_endpoint_config(real_dev, mux_id);
> +		__rmnet_unset_endpoint_config(dev, mux_id);
> +		rmnet_vnd_remove_ref_dev(real_dev, mux_id);
> +		rmnet_unregister_real_device(real_dev);
> +	}
> +
> +	unregister_netdevice_queue(dev, head);
> +}
> +
> +static int rmnet_rtnl_validate(struct nlattr *tb[], struct nlattr *data[],
> +			       struct netlink_ext_ack *extack)
> +{
> +	u16 mux_id;
> +
> +	if (!data || !data[IFLA_VLAN_ID])
> +		return -EINVAL;
> +
> +	mux_id = nla_get_u16(data[IFLA_VLAN_ID]);
> +	if (!mux_id || mux_id > (RMNET_MAX_LOGICAL_EP - 1))
> +		return -ERANGE;
> +
> +	return 0;
> +}
> +
> +static size_t rmnet_get_size(const struct net_device *dev)
> +{
> +	return nla_total_size(2); /* IFLA_VLAN_ID */
> +}
> +
> +struct rtnl_link_ops rmnet_link_ops __read_mostly = {
> +	.kind		= "rmnet",
> +	.maxtype	= __IFLA_VLAN_MAX,
> +	.priv_size	= sizeof(struct rmnet_priv),
> +	.setup		= rmnet_vnd_setup,
> +	.validate	= rmnet_rtnl_validate,
> +	.newlink	= rmnet_newlink,
> +	.dellink	= rmnet_delink,
> +	.get_size	= rmnet_get_size,
> +};
> +
> +struct rmnet_real_dev_info*
> +rmnet_get_real_dev_info(struct net_device *real_dev)
> +{
> +	return __rmnet_get_real_dev_info(real_dev);
> +}
> +
> +int rmnet_config_init(void)
> +{
> +	int rc;
> +
> +	rc = register_netdevice_notifier(&rmnet_dev_notifier);
> +	if (rc != 0)
> +		return rc;
> +
> +	rc = rtnl_link_register(&rmnet_link_ops);
> +	if (rc != 0) {
> +		unregister_netdevice_notifier(&rmnet_dev_notifier);
> +		return rc;
> +	}
> +	return rc;
> +}
> +
> +void rmnet_config_exit(void)
> +{
> +	unregister_netdevice_notifier(&rmnet_dev_notifier);
> +	rtnl_link_unregister(&rmnet_link_ops);
> +}
> diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
> b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
> new file mode 100644
> index 0000000..809988b
> --- /dev/null
> +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
> @@ -0,0 +1,58 @@
> +/* Copyright (c) 2013-2014, 2016-2017 The Linux Foundation. All
> rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License version 2
> and
> + * only version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * RMNET Data configuration engine
> + *
> + */
> +
> +#include <linux/skbuff.h>
> +
> +#ifndef _RMNET_CONFIG_H_
> +#define _RMNET_CONFIG_H_
> +
> +#define RMNET_MAX_LOGICAL_EP 255
> +#define RMNET_MAX_VND        32
> +
> +/* Information about the next device to deliver the packet to.
> + * Exact usage of this parameter depends on the rmnet_mode.
> + */
> +struct rmnet_endpoint {
> +	u8 refcount;
> +	u8 rmnet_mode;
> +	u8 mux_id;
> +	struct net_device *egress_dev;
> +};
> +
> +/* One instance of this structure is instantiated for each real_dev
> associated
> + * with rmnet.
> + */
> +struct rmnet_real_dev_info {
> +	struct net_device *dev;
> +	struct rmnet_endpoint local_ep;
> +	struct rmnet_endpoint muxed_ep[RMNET_MAX_LOGICAL_EP];
> +	u32 ingress_data_format;
> +	u32 egress_data_format;
> +	struct net_device *rmnet_devices[RMNET_MAX_VND];
> +};
> +
> +extern struct rtnl_link_ops rmnet_link_ops;
> +
> +struct rmnet_priv {
> +	struct rmnet_endpoint local_ep;
> +};
> +
> +struct rmnet_real_dev_info*
> +rmnet_get_real_dev_info(struct net_device *real_dev);
> +
> +int rmnet_config_init(void);
> +void rmnet_config_exit(void);
> +
> +#endif /* _RMNET_CONFIG_H_ */
> diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
> b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
> new file mode 100644
> index 0000000..34386ce4
> --- /dev/null
> +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
> @@ -0,0 +1,297 @@
> +/* Copyright (c) 2013-2017, The Linux Foundation. All rights
> reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License version 2
> and
> + * only version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * RMNET Data ingress/egress handler
> + *
> + */
> +
> +#include <linux/netdevice.h>
> +#include <linux/netdev_features.h>
> +#include "rmnet_private.h"
> +#include "rmnet_config.h"
> +#include "rmnet_vnd.h"
> +#include "rmnet_map.h"
> +#include "rmnet_handlers.h"
> +
> +#define RMNET_IP_VERSION_4 0x40
> +#define RMNET_IP_VERSION_6 0x60
> +
> +/* Helper Functions */
> +
> +static inline void rmnet_set_skb_proto(struct sk_buff *skb)
> +{
> +	switch (skb->data[0] & 0xF0) {
> +	case RMNET_IP_VERSION_4:
> +		skb->protocol = htons(ETH_P_IP);
> +		break;
> +	case RMNET_IP_VERSION_6:
> +		skb->protocol = htons(ETH_P_IPV6);
> +		break;
> +	default:
> +		skb->protocol = htons(ETH_P_MAP);
> +		break;
> +	}
> +}
> +
> +/* Generic handler */
> +
> +static rx_handler_result_t
> +rmnet_bridge_handler(struct sk_buff *skb, struct rmnet_endpoint *ep)
> +{
> +	if (!ep->egress_dev)
> +		kfree_skb(skb);
> +	else
> +		rmnet_egress_handler(skb, ep);
> +
> +	return RX_HANDLER_CONSUMED;
> +}
> +
> +static rx_handler_result_t
> +rmnet_deliver_skb(struct sk_buff *skb, struct rmnet_endpoint *ep)
> +{
> +	switch (ep->rmnet_mode) {
> +	case RMNET_EPMODE_NONE:
> +		return RX_HANDLER_PASS;
> +
> +	case RMNET_EPMODE_BRIDGE:
> +		return rmnet_bridge_handler(skb, ep);
> +
> +	case RMNET_EPMODE_VND:
> +		skb_reset_transport_header(skb);
> +		skb_reset_network_header(skb);
> +		switch (rmnet_vnd_rx_fixup(skb, skb->dev)) {
> +		case RX_HANDLER_CONSUMED:
> +			return RX_HANDLER_CONSUMED;
> +
> +		case RX_HANDLER_PASS:
> +			skb->pkt_type = PACKET_HOST;
> +			skb_set_mac_header(skb, 0);
> +			netif_receive_skb(skb);
> +			return RX_HANDLER_CONSUMED;
> +		}
> +		return RX_HANDLER_PASS;
> +
> +	default:
> +		kfree_skb(skb);
> +		return RX_HANDLER_CONSUMED;
> +	}
> +}
> +
> +static rx_handler_result_t
> +rmnet_ingress_deliver_packet(struct sk_buff *skb,
> +			     struct rmnet_real_dev_info *rdinfo)
> +{
> +	if (!rdinfo) {
> +		kfree_skb(skb);
> +		return RX_HANDLER_CONSUMED;
> +	}
> +
> +	if (!(rdinfo->local_ep.refcount)) {
> +		kfree_skb(skb);
> +		return RX_HANDLER_CONSUMED;
> +	}
> +
> +	skb->dev = rdinfo->local_ep.egress_dev;
> +
> +	return rmnet_deliver_skb(skb, &rdinfo->local_ep);
> +}
> +
> +/* MAP handler */
> +
> +static rx_handler_result_t
> +__rmnet_map_ingress_handler(struct sk_buff *skb,
> +			    struct rmnet_real_dev_info *rdinfo)
> +{
> +	struct rmnet_endpoint *ep;
> +	u8 mux_id;
> +	u16 len;
> +
> +	if (RMNET_MAP_GET_CD_BIT(skb)) {
> +		if (rdinfo->ingress_data_format
> +		    & RMNET_INGRESS_FORMAT_MAP_COMMANDS)
> +			return rmnet_map_command(skb, rdinfo);
> +
> +		kfree_skb(skb);
> +		return RX_HANDLER_CONSUMED;
> +	}
> +
> +	mux_id = RMNET_MAP_GET_MUX_ID(skb);
> +	len = RMNET_MAP_GET_LENGTH(skb) - RMNET_MAP_GET_PAD(skb);
> +
> +	if (mux_id >= RMNET_MAX_LOGICAL_EP) {
> +		kfree_skb(skb);
> +		return RX_HANDLER_CONSUMED;
> +	}
> +
> +	ep = &rdinfo->muxed_ep[mux_id];
> +
> +	if (!ep->refcount) {
> +		kfree_skb(skb);
> +		return RX_HANDLER_CONSUMED;
> +	}
> +
> +	if (rdinfo->ingress_data_format &
> RMNET_INGRESS_FORMAT_DEMUXING)
> +		skb->dev = ep->egress_dev;
> +
> +	/* Subtract MAP header */
> +	skb_pull(skb, sizeof(struct rmnet_map_header));
> +	skb_trim(skb, len);
> +	rmnet_set_skb_proto(skb);
> +	return rmnet_deliver_skb(skb, ep);
> +}
> +
> +static rx_handler_result_t
> +rmnet_map_ingress_handler(struct sk_buff *skb,
> +			  struct rmnet_real_dev_info *rdinfo)
> +{
> +	struct sk_buff *skbn;
> +	int rc;
> +
> +	if (rdinfo->ingress_data_format &
> RMNET_INGRESS_FORMAT_DEAGGREGATION) {
> +		while ((skbn = rmnet_map_deaggregate(skb, rdinfo))
> != NULL)
> +			__rmnet_map_ingress_handler(skbn, rdinfo);
> +
> +		consume_skb(skb);
> +		rc = RX_HANDLER_CONSUMED;
> +	} else {
> +		rc = __rmnet_map_ingress_handler(skb, rdinfo);
> +	}
> +
> +	return rc;
> +}
> +
> +static int rmnet_map_egress_handler(struct sk_buff *skb,
> +				    struct rmnet_real_dev_info
> *rdinfo,
> +				    struct rmnet_endpoint *ep,
> +				    struct net_device *orig_dev)
> +{
> +	int required_headroom, additional_header_len;
> +	struct rmnet_map_header *map_header;
> +
> +	additional_header_len = 0;
> +	required_headroom = sizeof(struct rmnet_map_header);
> +
> +	if (skb_headroom(skb) < required_headroom) {
> +		if (pskb_expand_head(skb, required_headroom, 0,
> GFP_KERNEL))
> +			return RMNET_MAP_CONSUMED;
> +	}
> +
> +	map_header = rmnet_map_add_map_header(skb,
> additional_header_len, 0);
> +	if (!map_header)
> +		return RMNET_MAP_CONSUMED;
> +
> +	if (rdinfo->egress_data_format & RMNET_EGRESS_FORMAT_MUXING)
> {
> +		if (ep->mux_id == 0xff)
> +			map_header->mux_id = 0;
> +		else
> +			map_header->mux_id = ep->mux_id;
> +	}
> +
> +	skb->protocol = htons(ETH_P_MAP);
> +
> +	return RMNET_MAP_SUCCESS;
> +}
> +
> +/* Ingress / Egress Entry Points */
> +
> +/* Processes packet as per ingress data format for receiving device.
> Logical
> + * endpoint is determined from packet inspection. Packet is then
> sent to the
> + * egress device listed in the logical endpoint configuration.
> + */
> +rx_handler_result_t rmnet_rx_handler(struct sk_buff **pskb)
> +{
> +	struct rmnet_real_dev_info *rdinfo;
> +	struct sk_buff *skb = *pskb;
> +	struct net_device *dev;
> +	int rc;
> +
> +	if (!skb)
> +		return RX_HANDLER_CONSUMED;
> +
> +	dev = skb->dev;
> +	rdinfo = rmnet_get_real_dev_info(dev);
> +
> +	/* Sometimes devices operate in ethernet mode even though there is no
> +	 * ethernet header. This causes the skb->protocol to contain a bogus
> +	 * value and the skb->data pointer to be off by 14 bytes. Fix it if
> +	 * configured to do so
> +	 */
> +	if (rdinfo->ingress_data_format & RMNET_INGRESS_FIX_ETHERNET) {
> +		skb_push(skb, RMNET_ETHERNET_HEADER_LENGTH);

Just use ETH_HLEN instead of RMNET_ETHERNET_HEADER_LENGTH.

But really, I can't see where FIX_ETHERNET ever gets set.  And as
above, can't this be automatically detected?  Can you describe what the
issue is here in more detail?

I know for qmi_wwan.c we had to fix up a number of firmware bugs, but
all that is done automatically.
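
As a strawman for what "automatic" might mean here, one possible heuristic
is to look at the IP version nibble at the start of the frame, the same
test rmnet_set_skb_proto() already does. This is only a userspace sketch
under that assumption: guess_frame_type() and the enum are made-up names,
and I don't know whether the nibble alone is reliable enough on this
hardware (a MAC address can start with 0x4x or 0x6x too).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification result, for illustration only */
enum frame_guess {
	FRAME_RAW_IPV4,
	FRAME_RAW_IPV6,
	FRAME_OTHER,
};

/* Guess the frame type from the high nibble of the first byte,
 * which holds the version field in both IPv4 and IPv6 headers.
 */
static enum frame_guess guess_frame_type(const uint8_t *data, size_t len)
{
	if (len == 0)
		return FRAME_OTHER;

	switch (data[0] & 0xF0) {
	case 0x40:
		return FRAME_RAW_IPV4;
	case 0x60:
		return FRAME_RAW_IPV6;
	default:
		return FRAME_OTHER;
	}
}
```

A driver doing this for real would still need a fallback for frames that
match neither case, which is part of why I'm asking what the underlying
issue is.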

> +		rmnet_set_skb_proto(skb);
> +	}
> +
> +	if (rdinfo->ingress_data_format & RMNET_INGRESS_FORMAT_MAP)
> {
> +		rc = rmnet_map_ingress_handler(skb, rdinfo);
> +	} else {
> +		switch (ntohs(skb->protocol)) {
> +		case ETH_P_MAP:
> +			if (rdinfo->local_ep.rmnet_mode ==
> +				RMNET_EPMODE_BRIDGE) {
> +				rc =
> rmnet_ingress_deliver_packet(skb, rdinfo);
> +			} else {
> +				kfree_skb(skb);
> +				rc = RX_HANDLER_CONSUMED;
> +			}
> +			break;
> +
> +		case ETH_P_ARP:
> +		case ETH_P_IP:
> +		case ETH_P_IPV6:
> +			rc = rmnet_ingress_deliver_packet(skb,
> rdinfo);
> +			break;
> +
> +		default:
> +			rc = RX_HANDLER_PASS;
> +		}
> +	}
> +
> +	return rc;
> +}
> +
> +/* Modifies packet as per logical endpoint configuration and egress
> data format
> + * for egress device configured in logical endpoint. Packet is then
> transmitted
> + * on the egress device.
> + */
> +void rmnet_egress_handler(struct sk_buff *skb,
> +			  struct rmnet_endpoint *ep)
> +{
> +	struct rmnet_real_dev_info *rdinfo;
> +	struct net_device *orig_dev;
> +
> +	orig_dev = skb->dev;
> +	skb->dev = ep->egress_dev;
> +
> +	rdinfo = rmnet_get_real_dev_info(skb->dev);
> +	if (!rdinfo) {
> +		kfree_skb(skb);
> +		return;
> +	}
> +
> +	if (rdinfo->egress_data_format & RMNET_EGRESS_FORMAT_MAP) {
> +		switch (rmnet_map_egress_handler(skb, rdinfo, ep,
> orig_dev)) {
> +		case RMNET_MAP_CONSUMED:
> +			return;
> +
> +		case RMNET_MAP_SUCCESS:
> +			break;
> +
> +		default:
> +			kfree_skb(skb);
> +			return;
> +		}
> +	}
> +
> +	if (ep->rmnet_mode == RMNET_EPMODE_VND)
> +		rmnet_vnd_tx_fixup(skb, orig_dev);
> +
> +	dev_queue_xmit(skb);
> +}
> diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
> b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
> new file mode 100644
> index 0000000..f2638cf
> --- /dev/null
> +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
> @@ -0,0 +1,26 @@
> +/* Copyright (c) 2013, 2016-2017 The Linux Foundation. All rights
> reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License version 2
> and
> + * only version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * RMNET Data ingress/egress handler
> + *
> + */
> +
> +#ifndef _RMNET_HANDLERS_H_
> +#define _RMNET_HANDLERS_H_
> +
> +#include "rmnet_config.h"
> +
> +void rmnet_egress_handler(struct sk_buff *skb,
> +			  struct rmnet_endpoint *ep);
> +
> +rx_handler_result_t rmnet_rx_handler(struct sk_buff **pskb);
> +
> +#endif /* _RMNET_HANDLERS_H_ */
> diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_main.c
> b/drivers/net/ethernet/qualcomm/rmnet/rmnet_main.c
> new file mode 100644
> index 0000000..80c3920
> --- /dev/null
> +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_main.c
> @@ -0,0 +1,37 @@
> +/* Copyright (c) 2013-2014, 2016-2017 The Linux Foundation. All
> rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License version 2
> and
> + * only version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + *
> + * RMNET Data generic framework
> + *
> + */
> +
> +#include <linux/module.h>
> +#include "rmnet_private.h"
> +#include "rmnet_config.h"
> +#include "rmnet_vnd.h"
> +
> +/* Startup/Shutdown */
> +
> +static int __init rmnet_init(void)
> +{
> +	rmnet_config_init();
> +	return 0;
> +}
> +
> +static void __exit rmnet_exit(void)
> +{
> +	rmnet_config_exit();
> +}
> +
> +module_init(rmnet_init)
> +module_exit(rmnet_exit)
> +MODULE_LICENSE("GPL v2");
> diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
> b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
> new file mode 100644
> index 0000000..9696145
> --- /dev/null
> +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
> @@ -0,0 +1,88 @@
> +/* Copyright (c) 2013-2017, The Linux Foundation. All rights
> reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License version 2
> and
> + * only version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#ifndef _RMNET_MAP_H_
> +#define _RMNET_MAP_H_
> +
> +struct rmnet_map_control_command {
> +	u8  command_name;
> +	u8  cmd_type:2;
> +	u8  reserved:6;
> +	u16 reserved2;
> +	u32 transaction_id;
> +	union {
> +		u8  data[65528];

Um....  that seems really, really odd.  Typically this would go below
the flow_control struct, and actually be:

u8 data[0];

To indicate that the struct member should exist and that you can use
it, but that it has no specific size (since the size will be determined
by the skb size or by a protocol field instead).
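
For reference, the general pattern looks roughly like the userspace sketch
below. struct cmd_sketch and cmd_alloc() are hypothetical names with a
simplified layout (no bitfields, no union), so treat it as an illustration
of the idiom and not as a proposed replacement for the struct in the patch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative command header ending in a flexible array member.
 * Field names loosely follow the patch; the layout is simplified.
 */
struct cmd_sketch {
	uint8_t  command_name;
	uint8_t  cmd_type;
	uint16_t reserved2;
	uint32_t transaction_id;
	uint8_t  data[];	/* C99 spelling of the GNU "data[0]" idiom */
};

/* Allocate a header plus len payload bytes. The payload size comes
 * from the caller (e.g. a length field), not from the struct itself.
 */
static struct cmd_sketch *cmd_alloc(const void *payload, size_t len)
{
	struct cmd_sketch *cmd = calloc(1, sizeof(*cmd) + len);

	if (!cmd)
		return NULL;
	if (len)
		memcpy(cmd->data, payload, len);
	return cmd;
}
```

The point is that sizeof() of the struct covers only the fixed header, so
nothing ever reserves 64K for a payload that is usually a few bytes.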

That's all for now...

Dan


> +		struct {
> +			u16 ip_family:2;
> +			u16 reserved:14;
> +			u16 flow_control_seq_num;
> +			u32 qos_id;
> +		} flow_control;
> +	};
> +}  __aligned(1);
> +
> +enum rmnet_map_results {
> +	RMNET_MAP_SUCCESS,
> +	RMNET_MAP_CONSUMED,
> +	RMNET_MAP_GENERAL_FAILURE,
> +	RMNET_MAP_NOT_ENABLED,
> +	RMNET_MAP_FAILED_AGGREGATION,
> +	RMNET_MAP_FAILED_MUX
> +};
> +
> +enum rmnet_map_commands {
> +	RMNET_MAP_COMMAND_NONE,
> +	RMNET_MAP_COMMAND_FLOW_DISABLE,
> +	RMNET_MAP_COMMAND_FLOW_ENABLE,
> +	/* These should always be the last 2 elements */
> +	RMNET_MAP_COMMAND_UNKNOWN,
> +	RMNET_MAP_COMMAND_ENUM_LENGTH
> +};
> +
> +struct rmnet_map_header {
> +	u8  pad_len:6;
> +	u8  reserved_bit:1;
> +	u8  cd_bit:1;
> +	u8  mux_id;
> +	u16 pkt_len;
> +}  __aligned(1);
> +
> +#define RMNET_MAP_GET_MUX_ID(Y) (((struct rmnet_map_header *) \
> +				 (Y)->data)->mux_id)
> +#define RMNET_MAP_GET_CD_BIT(Y) (((struct rmnet_map_header *) \
> +				(Y)->data)->cd_bit)
> +#define RMNET_MAP_GET_PAD(Y) (((struct rmnet_map_header *) \
> +				(Y)->data)->pad_len)
> +#define RMNET_MAP_GET_CMD_START(Y) ((struct
> rmnet_map_control_command *) \
> +				    ((Y)->data + \
> +				      sizeof(struct
> rmnet_map_header)))
> +#define RMNET_MAP_GET_LENGTH(Y) (ntohs(((struct rmnet_map_header *)
> \
> +					(Y)->data)->pkt_len))
> +
> +#define RMNET_MAP_COMMAND_REQUEST     0
> +#define RMNET_MAP_COMMAND_ACK         1
> +#define RMNET_MAP_COMMAND_UNSUPPORTED 2
> +#define RMNET_MAP_COMMAND_INVALID     3
> +
> +#define RMNET_MAP_NO_PAD_BYTES        0
> +#define RMNET_MAP_ADD_PAD_BYTES       1
> +
> +u8 rmnet_map_demultiplex(struct sk_buff *skb);
> +struct sk_buff *rmnet_map_deaggregate(struct sk_buff *skb,
> +				      struct rmnet_real_dev_info
> *rdinfo);
> +
> +struct rmnet_map_header *rmnet_map_add_map_header(struct sk_buff
> *skb,
> +						  int hdrlen, int
> pad);
> +rx_handler_result_t rmnet_map_command(struct sk_buff *skb,
> +				      struct rmnet_real_dev_info
> *rdinfo);
> +
> +#endif /* _RMNET_MAP_H_ */
> diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
> b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
> new file mode 100644
> index 0000000..c0af5b8
> --- /dev/null
> +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
> @@ -0,0 +1,122 @@
> +/* Copyright (c) 2013-2017, The Linux Foundation. All rights
> reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License version 2
> and
> + * only version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <linux/netdevice.h>
> +#include "rmnet_config.h"
> +#include "rmnet_map.h"
> +#include "rmnet_private.h"
> +#include "rmnet_vnd.h"
> +
> +static u8 rmnet_map_do_flow_control(struct sk_buff *skb,
> +				    struct rmnet_real_dev_info
> *rdinfo,
> +				    int enable)
> +{
> +	struct rmnet_map_control_command *cmd;
> +	struct rmnet_endpoint *ep;
> +	struct net_device *vnd;
> +	u16 ip_family;
> +	u16 fc_seq;
> +	u32 qos_id;
> +	u8 mux_id;
> +	int r;
> +
> +	if (unlikely(!skb || !rdinfo))
> +		return RX_HANDLER_CONSUMED;
> +
> +	mux_id = RMNET_MAP_GET_MUX_ID(skb);
> +	cmd = RMNET_MAP_GET_CMD_START(skb);
> +
> +	if (mux_id >= RMNET_MAX_LOGICAL_EP) {
> +		kfree_skb(skb);
> +		return RX_HANDLER_CONSUMED;
> +	}
> +
> +	ep = &rdinfo->muxed_ep[mux_id];
> +
> +	if (!ep->refcount) {
> +		kfree_skb(skb);
> +		return RX_HANDLER_CONSUMED;
> +	}
> +
> +	vnd = ep->egress_dev;
> +
> +	ip_family = cmd->flow_control.ip_family;
> +	fc_seq = ntohs(cmd->flow_control.flow_control_seq_num);
> +	qos_id = ntohl(cmd->flow_control.qos_id);
> +
> +	/* Ignore the ip family and pass the sequence number for
> both v4 and v6
> +	 * sequence. User space does not support creating dedicated
> flows for
> +	 * the 2 protocols
> +	 */
> +	r = rmnet_vnd_do_flow_control(rdinfo, vnd, enable);
> +	if (r) {
> +		kfree_skb(skb);
> +		return RMNET_MAP_COMMAND_UNSUPPORTED;
> +	} else {
> +		return RMNET_MAP_COMMAND_ACK;
> +	}
> +}
> +
> +static void rmnet_map_send_ack(struct sk_buff *skb,
> +			       unsigned char type,
> +			       struct rmnet_real_dev_info *rdinfo)
> +{
> +	struct rmnet_map_control_command *cmd;
> +	int xmit_status;
> +
> +	if (unlikely(!skb))
> +		return;
> +
> +	skb->protocol = htons(ETH_P_MAP);
> +
> +	cmd = RMNET_MAP_GET_CMD_START(skb);
> +	cmd->cmd_type = type & 0x03;
> +
> +	netif_tx_lock(skb->dev);
> +	xmit_status = skb->dev->netdev_ops->ndo_start_xmit(skb, skb-
> >dev);
> +	netif_tx_unlock(skb->dev);
> +}
> +
> +/* Process MAP command frame and send N/ACK message as appropriate.
> Message cmd
> + * name is decoded here and appropriate handler is called.
> + */
> +rx_handler_result_t rmnet_map_command(struct sk_buff *skb,
> +				      struct rmnet_real_dev_info
> *rdinfo)
> +{
> +	struct rmnet_map_control_command *cmd;
> +	unsigned char command_name;
> +	unsigned char rc = 0;
> +
> +	if (unlikely(!skb))
> +		return RX_HANDLER_CONSUMED;
> +
> +	cmd = RMNET_MAP_GET_CMD_START(skb);
> +	command_name = cmd->command_name;
> +
> +	switch (command_name) {
> +	case RMNET_MAP_COMMAND_FLOW_ENABLE:
> +		rc = rmnet_map_do_flow_control(skb, rdinfo, 1);
> +		break;
> +
> +	case RMNET_MAP_COMMAND_FLOW_DISABLE:
> +		rc = rmnet_map_do_flow_control(skb, rdinfo, 0);
> +		break;
> +
> +	default:
> +		rc = RMNET_MAP_COMMAND_UNSUPPORTED;
> +		kfree_skb(skb);
> +		break;
> +	}
> +	if (rc == RMNET_MAP_COMMAND_ACK)
> +		rmnet_map_send_ack(skb, rc, rdinfo);
> +	return RX_HANDLER_CONSUMED;
> +}
> diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
> b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
> new file mode 100644
> index 0000000..6d16c6ac
> --- /dev/null
> +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
> @@ -0,0 +1,105 @@
> +/* Copyright (c) 2013-2017, The Linux Foundation. All rights
> reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License version 2
> and
> + * only version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * RMNET Data MAP protocol
> + *
> + */
> +
> +#include <linux/netdevice.h>
> +#include "rmnet_config.h"
> +#include "rmnet_map.h"
> +#include "rmnet_private.h"
> +
> +#define RMNET_MAP_DEAGGR_SPACING  64
> +#define RMNET_MAP_DEAGGR_HEADROOM (RMNET_MAP_DEAGGR_SPACING / 2)
> +
> +/* Adds MAP header to front of skb->data
> + * Padding is calculated and set appropriately in MAP header. Mux ID
> is
> + * initialized to 0.
> + */
> +struct rmnet_map_header *rmnet_map_add_map_header(struct sk_buff
> *skb,
> +						  int hdrlen, int
> pad)
> +{
> +	struct rmnet_map_header *map_header;
> +	u32 padding, map_datalen;
> +	u8 *padbytes;
> +
> +	if (skb_headroom(skb) < sizeof(struct rmnet_map_header))
> +		return 0;
> +
> +	map_datalen = skb->len - hdrlen;
> +	map_header = (struct rmnet_map_header *)
> +			skb_push(skb, sizeof(struct
> rmnet_map_header));
> +	memset(map_header, 0, sizeof(struct rmnet_map_header));
> +
> +	if (pad == RMNET_MAP_NO_PAD_BYTES) {
> +		map_header->pkt_len = htons(map_datalen);
> +		return map_header;
> +	}
> +
> +	padding = ALIGN(map_datalen, 4) - map_datalen;
> +
> +	if (padding == 0)
> +		goto done;
> +
> +	if (skb_tailroom(skb) < padding)
> +		return 0;
> +
> +	padbytes = (u8 *)skb_put(skb, padding);
> +	memset(padbytes, 0, padding);
> +
> +done:
> +	map_header->pkt_len = htons(map_datalen + padding);
> +	map_header->pad_len = padding & 0x3F;
> +
> +	return map_header;
> +}
> +
> +/* Deaggregates a single packet
> + * A whole new buffer is allocated for each portion of an aggregated
> frame.
> + * Caller should keep calling deaggregate() on the source skb until
> 0 is
> + * returned, indicating that there are no more packets to
> deaggregate. Caller
> + * is responsible for freeing the original skb.
> + */
> +struct sk_buff *rmnet_map_deaggregate(struct sk_buff *skb,
> +				      struct rmnet_real_dev_info
> *rdinfo)
> +{
> +	struct rmnet_map_header *maph;
> +	struct sk_buff *skbn;
> +	u32 packet_len;
> +
> +	if (skb->len == 0)
> +		return 0;
> +
> +	maph = (struct rmnet_map_header *)skb->data;
> +	packet_len = ntohs(maph->pkt_len) + sizeof(struct
> rmnet_map_header);
> +
> +	if (((int)skb->len - (int)packet_len) < 0)
> +		return 0;
> +
> +	skbn = alloc_skb(packet_len + RMNET_MAP_DEAGGR_SPACING,
> GFP_ATOMIC);
> +	if (!skbn)
> +		return 0;
> +
> +	skbn->dev = skb->dev;
> +	skb_reserve(skbn, RMNET_MAP_DEAGGR_HEADROOM);
> +	skb_put(skbn, packet_len);
> +	memcpy(skbn->data, skb->data, packet_len);
> +	skb_pull(skb, packet_len);
> +
> +	/* Some hardware can send us empty frames. Catch them */
> +	if (ntohs(maph->pkt_len) == 0) {
> +		kfree_skb(skb);
> +		return 0;
> +	}
> +
> +	return skbn;
> +}
> diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
> b/drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
> new file mode 100644
> index 0000000..48e7614
> --- /dev/null
> +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
> @@ -0,0 +1,47 @@
> +/* Copyright (c) 2013-2014, 2016-2017 The Linux Foundation. All
> rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License version 2
> and
> + * only version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#ifndef _RMNET_PRIVATE_H_
> +#define _RMNET_PRIVATE_H_
> +
> +#define RMNET_MAX_VND              32
> +#define RMNET_MAX_PACKET_SIZE      16384
> +#define RMNET_DFLT_PACKET_SIZE     1500
> +#define RMNET_DEV_NAME_STR         "rmnet"

This isn't used anywhere.

> +#define RMNET_NEEDED_HEADROOM      16
> +#define RMNET_TX_QUEUE_LEN         1000
> +#define RMNET_ETHERNET_HEADER_LENGTH    14
> +
> +/* Constants */
> +#define RMNET_EGRESS_FORMAT__RESERVED__         BIT(0)
> +#define RMNET_EGRESS_FORMAT_MAP                 BIT(1)
> +#define RMNET_EGRESS_FORMAT_AGGREGATION         BIT(2)
> +#define RMNET_EGRESS_FORMAT_MUXING              BIT(3)
> +#define RMNET_EGRESS_FORMAT_MAP_CKSUMV3         BIT(4)
> +#define RMNET_EGRESS_FORMAT_MAP_CKSUMV4         BIT(5)
> +
> +#define RMNET_INGRESS_FIX_ETHERNET              BIT(0)
> +#define RMNET_INGRESS_FORMAT_MAP                BIT(1)
> +#define RMNET_INGRESS_FORMAT_DEAGGREGATION      BIT(2)
> +#define RMNET_INGRESS_FORMAT_DEMUXING           BIT(3)
> +#define RMNET_INGRESS_FORMAT_MAP_COMMANDS       BIT(4)
> +#define RMNET_INGRESS_FORMAT_MAP_CKSUMV3        BIT(5)
> +#define RMNET_INGRESS_FORMAT_MAP_CKSUMV4        BIT(6)
> +
> +/* Pass the frame up the stack with no modifications to skb->dev */
> +#define RMNET_EPMODE_NONE (0)
> +/* Replace skb->dev to a virtual rmnet device and pass up the stack
> */
> +#define RMNET_EPMODE_VND (1)
> +/* Pass the frame directly to another device with dev_queue_xmit()
> */
> +#define RMNET_EPMODE_BRIDGE (2)
> +
> +#endif /* _RMNET_PRIVATE_H_ */
> diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
> b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
> new file mode 100644
> index 0000000..b9ec070
> --- /dev/null
> +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
> @@ -0,0 +1,267 @@
> +/* Copyright (c) 2013-2017, The Linux Foundation. All rights
> reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License version 2
> and
> + * only version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + *
> + * RMNET Data virtual network driver
> + *
> + */
> +
> +#include <linux/etherdevice.h>
> +#include <linux/if_arp.h>
> +#include <net/pkt_sched.h>
> +#include "rmnet_config.h"
> +#include "rmnet_handlers.h"
> +#include "rmnet_private.h"
> +#include "rmnet_map.h"
> +#include "rmnet_vnd.h"
> +
> +/* RX/TX Fixup */
> +
> +int rmnet_vnd_rx_fixup(struct sk_buff *skb, struct net_device *dev)
> +{
> +	if (unlikely(!dev || !skb))
> +		return RX_HANDLER_CONSUMED;
> +
> +	dev->stats.rx_packets++;
> +	dev->stats.rx_bytes += skb->len;
> +
> +	return RX_HANDLER_PASS;
> +}
> +
> +int rmnet_vnd_tx_fixup(struct sk_buff *skb, struct net_device *dev)
> +{
> +	struct rmnet_priv *priv;
> +
> +	priv = netdev_priv(dev);
> +
> +	if (unlikely(!dev || !skb))
> +		return RX_HANDLER_CONSUMED;
> +
> +	dev->stats.tx_packets++;
> +	dev->stats.tx_bytes += skb->len;
> +
> +	return RX_HANDLER_PASS;
> +}
> +
> +/* Network Device Operations */
> +
> +static netdev_tx_t rmnet_vnd_start_xmit(struct sk_buff *skb,
> +					struct net_device *dev)
> +{
> +	struct rmnet_priv *priv;
> +
> +	priv = netdev_priv(dev);
> +	if (priv->local_ep.egress_dev) {
> +		rmnet_egress_handler(skb, &priv->local_ep);
> +	} else {
> +		dev->stats.tx_dropped++;
> +		kfree_skb(skb);
> +	}
> +	return NETDEV_TX_OK;
> +}
> +
> +static int rmnet_vnd_change_mtu(struct net_device *rmnet_dev, int
> new_mtu)
> +{
> +	if (new_mtu < 0 || new_mtu > RMNET_MAX_PACKET_SIZE)
> +		return -EINVAL;
> +
> +	rmnet_dev->mtu = new_mtu;
> +	return 0;
> +}
> +
> +static const struct net_device_ops rmnet_vnd_ops = {
> +	.ndo_start_xmit = rmnet_vnd_start_xmit,
> +	.ndo_change_mtu = rmnet_vnd_change_mtu,
> +};
> +
> +/* Called by kernel whenever a new rmnet<n> device is created. Sets
> MTU,
> + * flags, ARP type, needed headroom, etc...
> + */
> +void rmnet_vnd_setup(struct net_device *rmnet_dev)
> +{
> +	struct rmnet_priv *priv;
> +
> +	/* Clear out private data */
> +	priv = netdev_priv(rmnet_dev);
> +	memset(priv, 0, sizeof(struct rmnet_priv));
> +
> +	netdev_info(rmnet_dev, "Setting up device %s\n", rmnet_dev-
> >name);
> +
> +	rmnet_dev->netdev_ops = &rmnet_vnd_ops;
> +	rmnet_dev->mtu = RMNET_DFLT_PACKET_SIZE;
> +	rmnet_dev->needed_headroom = RMNET_NEEDED_HEADROOM;
> +	random_ether_addr(rmnet_dev->dev_addr);
> +	rmnet_dev->tx_queue_len = RMNET_TX_QUEUE_LEN;
> +
> +	/* Raw IP mode */
> +	rmnet_dev->header_ops = 0;  /* No header */
> +	rmnet_dev->type = ARPHRD_RAWIP;
> +	rmnet_dev->hard_header_len = 0;
> +	rmnet_dev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST);
> +
> +	rmnet_dev->needs_free_netdev = true;
> +}
> +
> +/* Exposed API */
> +
> +int rmnet_vnd_newlink(struct net_device *real_dev, int id,
> +		      struct net_device *rmnet_dev)
> +{
> +	struct rmnet_real_dev_info *rdinfo;
> +	int rc;
> +
> +	rdinfo = rmnet_get_real_dev_info(real_dev);
> +
> +	if (rdinfo->rmnet_devices[id])
> +		return -EINVAL;
> +
> +	rc = register_netdevice(rmnet_dev);
> +	if (!rc) {
> +		rdinfo->rmnet_devices[id] = rmnet_dev;
> +		rmnet_dev->rtnl_link_ops = &rmnet_link_ops;
> +	}
> +
> +	return rc;
> +}
> +
> +/* Unregisters the virtual network device node and frees it.
> + * unregister_netdev locks the rtnl mutex, so the mutex must not be locked
> + * by the caller of the function. unregister_netdev enqueues the request to
> + * unregister the device into a TODO queue. The requests in the TODO queue
> + * are only done after rtnl mutex is unlocked, therefore free_netdev has to
> + * be called after unlocking rtnl mutex.
> + */
> +int rmnet_vnd_free_dev(struct net_device *real_dev, int id)
> +{
> +	struct rmnet_real_dev_info *rdinfo;
> +	struct net_device *rmnet_dev;
> +	struct rmnet_endpoint *ep;
> +
> +	rdinfo = rmnet_get_real_dev_info(real_dev);
> +
> +	rtnl_lock();
> +	if (id < 0 || id >= RMNET_MAX_VND || !rdinfo->rmnet_devices[id]) {
> +		rtnl_unlock();
> +		return -EINVAL;
> +	}
> +
> +	ep = rmnet_vnd_get_endpoint(rdinfo->rmnet_devices[id]);
> +	if (ep && ep->refcount) {
> +		rtnl_unlock();
> +		return -EINVAL;
> +	}
> +
> +	rmnet_dev = rdinfo->rmnet_devices[id];
> +	rdinfo->rmnet_devices[id] = 0;
> +	rtnl_unlock();
> +
> +	if (rmnet_dev) {
> +		unregister_netdev(rmnet_dev);
> +		free_netdev(rmnet_dev);
> +		return 0;
> +	} else {
> +		return -EINVAL;
> +	}
> +}
> +
> +int rmnet_vnd_remove_ref_dev(struct net_device *real_dev, int id)
> +{
> +	struct rmnet_real_dev_info *rdinfo;
> +	struct rmnet_endpoint *ep;
> +
> +	rdinfo = rmnet_get_real_dev_info(real_dev);
> +
> +	if (id < 0 || id >= RMNET_MAX_VND || !rdinfo->rmnet_devices[id])
> +		return -EINVAL;
> +
> +	ep = rmnet_vnd_get_endpoint(rdinfo->rmnet_devices[id]);
> +	if (ep && ep->refcount)
> +		return -EBUSY;
> +
> +	rdinfo->rmnet_devices[id] = 0;
> +	return 0;
> +}
> +
> +/* Searches through list of known RmNet virtual devices. This function is O(n)
> + * and should not be used in the data path.
> + *
> + * To get the real id, subtract 1 from this result.
> + */
> +int rmnet_vnd_is_vnd(struct net_device *real_dev, struct net_device *rmnet_dev)
> +{
> +	/* This is not an efficient search, but, this will only be called in
> +	 * a configuration context, and the list is small.
> +	 */
> +	struct rmnet_real_dev_info *rdinfo;
> +	int i;
> +
> +	rdinfo = rmnet_get_real_dev_info(real_dev);
> +
> +	if (!rmnet_dev)
> +		return 0;
> +
> +	for (i = 0; i < RMNET_MAX_VND; i++)
> +		if (rmnet_dev == rdinfo->rmnet_devices[i])
> +			return i + 1;
> +
> +	return 0;
> +}
> +
> +/* Gets the logical endpoint configuration for a RmNet virtual network device
> + * node. Caller should confirm that the device is a RmNet VND before calling.
> + */
> +struct rmnet_endpoint *rmnet_vnd_get_endpoint(struct net_device *rmnet_dev)
> +{
> +	struct rmnet_priv *priv;
> +
> +	if (!rmnet_dev)
> +		return 0;
> +
> +	priv = netdev_priv(rmnet_dev);
> +	if (!priv)
> +		return 0;
> +
> +	return &priv->local_ep;
> +}
> +
> +int rmnet_vnd_do_flow_control(struct rmnet_real_dev_info *rdinfo,
> +			      struct net_device *rmnet_dev, int enable)
> +{
> +	struct rmnet_priv *priv;
> +
> +	priv = netdev_priv(rmnet_dev);
> +	if (unlikely(!priv))
> +		return -EINVAL;
> +
> +	netdev_info(rmnet_dev, "Setting VND TX queue state to %d\n", enable);
> +	/* Although we expect similar number of enable/disable
> +	 * commands, optimize for the disable. That is more
> +	 * latency sensitive than enable
> +	 */
> +	if (unlikely(enable))
> +		netif_wake_queue(rmnet_dev);
> +	else
> +		netif_stop_queue(rmnet_dev);
> +
> +	return 0;
> +}
> +
> +struct net_device *rmnet_vnd_get_by_id(struct net_device *real_dev, int id)
> +{
> +	struct rmnet_real_dev_info *rdinfo;
> +
> +	rdinfo = rmnet_get_real_dev_info(real_dev);
> +
> +	if (id < 0 || id >= RMNET_MAX_VND)
> +		return 0;
> +
> +	return rdinfo->rmnet_devices[id];
> +}
> diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h
> new file mode 100644
> index 0000000..cf5aac8
> --- /dev/null
> +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h
> @@ -0,0 +1,32 @@
> +/* Copyright (c) 2013-2017, The Linux Foundation. All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 and
> + * only version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * RMNET Data Virtual Network Device APIs
> + *
> + */
> +
> +#ifndef _RMNET_VND_H_
> +#define _RMNET_VND_H_
> +
> +int rmnet_vnd_do_flow_control(struct rmnet_real_dev_info *rdinfo,
> +			      struct net_device *dev, int enable);
> +struct rmnet_endpoint *rmnet_vnd_get_endpoint(struct net_device *dev);
> +int rmnet_vnd_free_dev(struct net_device *real_dev, int id);
> +int rmnet_vnd_remove_ref_dev(struct net_device *real_dev, int id);
> +int rmnet_vnd_rx_fixup(struct sk_buff *skb, struct net_device *dev);
> +int rmnet_vnd_tx_fixup(struct sk_buff *skb, struct net_device *dev);
> +int rmnet_vnd_is_vnd(struct net_device *real_dev, struct net_device *dev);
> +struct net_device *rmnet_vnd_get_by_id(struct net_device *real_dev, int id);
> +void rmnet_vnd_setup(struct net_device *dev);
> +int rmnet_vnd_newlink(struct net_device *real_dev, int id,
> +		      struct net_device *new_device);
> +
> +#endif /* _RMNET_VND_H_ */
> -- 
> 1.9.1
> 
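
For reference, the return convention documented on rmnet_vnd_is_vnd()
(slot index plus one, with 0 meaning "not found") can be sketched in user
space roughly as below. This is only an illustration under stated
assumptions: MAX_VND, struct fake_dev, struct fake_real_dev_info and both
helper names are hypothetical stand-ins, not the driver's types, and the
real value of RMNET_MAX_VND is not shown in the quoted patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for RMNET_MAX_VND; the real value is not quoted. */
#define MAX_VND 4

/* Hypothetical stand-in for struct net_device. */
struct fake_dev {
	int unused;
};

/* Hypothetical stand-in for struct rmnet_real_dev_info. */
struct fake_real_dev_info {
	struct fake_dev *devices[MAX_VND];
};

/* Linear search over the slot table.
 * Returns slot index + 1 when dev is found, 0 otherwise, so that the
 * result can be used directly as a truth value.
 */
static int fake_is_vnd(const struct fake_real_dev_info *info,
		       const struct fake_dev *dev)
{
	int i;

	assert(info != NULL);
	if (!dev)
		return 0;

	for (i = 0; i < MAX_VND; i++)
		if (info->devices[i] == dev)
			return i + 1;

	return 0;
}

/* Converts the 1-based result back to a slot id, or -1 if not found. */
static int fake_vnd_id(const struct fake_real_dev_info *info,
		       const struct fake_dev *dev)
{
	int pos = fake_is_vnd(info, dev);

	return pos ? pos - 1 : -1;
}
```

The off-by-one encoding lets one int carry both "is this a VND?" and
"which slot?", at the cost that every caller wanting the slot has to
remember to subtract 1.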
