Netdev List
* Re: [RESEND PATCH] Allow passing tid or pid in SCM_CREDENTIALS without CAP_SYS_ADMIN
From: David Miller @ 2017-08-29 23:02 UTC
  To: prakash.sangappa; +Cc: linux-kernel, netdev, ebiederm, drepper
In-Reply-To: <1503965540-30393-1-git-send-email-prakash.sangappa@oracle.com>

From: Prakash Sangappa <prakash.sangappa@oracle.com>
Date: Mon, 28 Aug 2017 17:12:20 -0700

> Currently, passing the tid (gettid(2)) of a thread in struct ucred in an
> SCM_CREDENTIALS message requires the CAP_SYS_ADMIN capability; otherwise
> it fails with an EPERM error. Some applications work with the thread id
> (tid) of a thread, so it would help to allow a tid in the SCM_CREDENTIALS
> message. Basically, either the tgid (pid of the process) or the tid of
> the thread should be allowed without the need for the CAP_SYS_ADMIN
> capability.
> 
> SCM_CREDENTIALS will be used to determine the global id of a process or
> a thread running inside a pid namespace.
> 
> This patch adds the necessary check to accept a tid in the
> SCM_CREDENTIALS struct ucred.
> 
> Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>

I'm pretty sure that by the descriptions in previous changes to this
function, what you are proposing is basically a minor form of PID
spoofing which we only want someone with CAP_SYS_ADMIN over the
PID namespace to be able to do.

Sorry, I'm not applying this.


* Re: [PATCH ] net: frag: print frag_mem_limit value in sockstat proc file
From: David Miller @ 2017-08-29 23:04 UTC
  To: liujian56; +Cc: kuznet, yoshfuji, brouer, fw, netdev
In-Reply-To: <1504011934-17308-1-git-send-email-liujian56@huawei.com>

From: <liujian56@huawei.com>
Date: Tue, 29 Aug 2017 21:05:34 +0800

> From: liujian <liujian56@huawei.com>
> 
> Since commit 6d7b857d5 ("net: use lib/percpu_counter API for fragmentation
> mem accounting"), frag_mem_limit and sum_frag_mem_limit have different
> values if there are multiple NIC RX CPUs.
> Print the frag_mem_limit value as well, so that we get more debug info.
> 
> Signed-off-by: liujian <liujian56@huawei.com>

Sorry, I don't think it is justified to change the deprecated procfs
output just for this.

Use tracepoints and other similar facilities to debug situations
like this if you like.


* Re: [PATCH] net: dsa: make some structures const
From: David Miller @ 2017-08-29 23:05 UTC
  To: bhumirks
  Cc: julia.lawall, andrew, vivien.didelot, f.fainelli, netdev,
	linux-kernel
In-Reply-To: <1504025272-8304-1-git-send-email-bhumirks@gmail.com>

From: Bhumika Goyal <bhumirks@gmail.com>
Date: Tue, 29 Aug 2017 22:17:52 +0530

> Make these const as they are not modified anywhere.
> 
> Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>

Applied.


* Re: [PATCH net] sch_htb: fix crash on init failure
From: David Miller @ 2017-08-29 23:06 UTC
  To: nikolay; +Cc: netdev, edumazet, jhs, xiyou.wangcong, jiri, roopa
In-Reply-To: <1504029487-7085-1-git-send-email-nikolay@cumulusnetworks.com>


I expect that you will resubmit all of these similar fixes as a patch
series after you have sorted everything out.

Correct?


* Re: [PATCH net] nfp: double free on error in probe
From: David Miller @ 2017-08-29 23:07 UTC
  To: dan.carpenter
  Cc: jakub.kicinski, simon.horman, oss-drivers, netdev,
	kernel-janitors
In-Reply-To: <20170829190855.fabfktv2dg57x3gs@mwanda>

From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Tue, 29 Aug 2017 22:15:16 +0300

> Both the nfp_net_pf_app_start() and the nfp_net_pci_probe() functions
> call nfp_net_pf_app_stop_ctrl(pf) so there is a double free.  The free
> should be done from the probe function because it's allocated there so
> I have removed the call from nfp_net_pf_app_start().
> 
> Fixes: 02082701b974 ("nfp: create control vNICs and wire up rx/tx")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

Applied.


* Re: [PATCH V2 net-next] liquidio: show NIC's U-Boot version in a dev_info() message
From: David Miller @ 2017-08-29 23:09 UTC
  To: felix.manlunas
  Cc: netdev, raghu.vatsavayi, derek.chickles, satananda.burla,
	weilin.chang
In-Reply-To: <20170829191957.GA12243@felix-thinkpad.cavium.com>

From: Felix Manlunas <felix.manlunas@cavium.com>
Date: Tue, 29 Aug 2017 12:19:57 -0700

> From: Weilin Chang <weilin.chang@cavium.com>
> 
> Signed-off-by: Weilin Chang <weilin.chang@cavium.com>
> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
> ---
> Patch Change Log:
>   V1 -> V2:
>     * Move octeon_get_uboot_version() to a proper place to avoid forward
>       declaration.
>     * Remove complicated for-loops that search for substrings; replace them
>       with calls to strstr().
>     * Don't add unnecessary fields to struct octeon_device.

Applied.


* Re: [PATCH net-next v2] net: bcmgenet: Use correct I/O accessors
From: David Miller @ 2017-08-29 23:09 UTC
  To: f.fainelli; +Cc: netdev, opendmb, jaedon.shin
In-Reply-To: <1504034731-31613-1-git-send-email-f.fainelli@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Tue, 29 Aug 2017 12:25:31 -0700

> The GENET driver currently uses __raw_{read,write}l which means
> native I/O endian. This works correctly for an ARM LE kernel (default)
> but fails miserably on an ARM BE (BE8) kernel where registers are kept
> little endian, so replace uses with {read,write}l_relaxed here which is
> what we want because this is all performance sensitive code.
> 
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>

Applied.
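
As a rough userspace illustration of the endianness issue described in the
quoted commit message, the sketch below contrasts a native-endian load with
an explicit little-endian load. The function names raw_read32() and
le_read32() are hypothetical stand-ins for this example; they are not the
kernel accessors, which additionally deal with MMIO semantics.

```c
#include <stdint.h>
#include <string.h>

/* Native-endian load: a stand-in for what __raw_readl() does. The result
 * depends on the byte order of the CPU running the code. */
uint32_t raw_read32(const void *reg)
{
	uint32_t v;

	memcpy(&v, reg, sizeof(v));
	return v;
}

/* Explicit little-endian load: a stand-in for readl_relaxed() on a device
 * whose registers are little endian. The result is the same on LE and BE
 * hosts because the bytes are assembled by position. */
uint32_t le_read32(const void *reg)
{
	const uint8_t *b = reg;

	return (uint32_t)b[0] |
	       (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 |
	       (uint32_t)b[3] << 24;
}
```

For a register holding the bytes 78 56 34 12, le_read32() yields 0x12345678
everywhere, while raw_read32() yields 0x12345678 on a little-endian host and
0x78563412 on a big-endian one, which is the failure mode the patch
describes for BE8 kernels.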


* Re: [PATCH] net: remove dmaengine.h inclusion from netdevice.h
From: David Miller @ 2017-08-29 23:10 UTC
  To: dave.jiang; +Cc: netdev
In-Reply-To: <150403787157.72820.10644938115323539701.stgit@djiang5-desk3.ch.intel.com>

From: Dave Jiang <dave.jiang@intel.com>
Date: Tue, 29 Aug 2017 13:17:51 -0700

> Since the removal of NET_DMA, dmaengine.h header file shouldn't be needed
> by netdevice.h anymore.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

Applied to net-next, but it would have been really great for you to
have provided a proper "Fixes: " tag referencing the NET_DMA
removal change.

Thanks.


* Re: [PATCH net-next] neigh: increase queue_len_bytes to match wmem_default
From: David Miller @ 2017-08-29 23:11 UTC
  To: eric.dumazet; +Cc: f.fainelli, netdev
In-Reply-To: <1504044961.11498.100.camel@edumazet-glaptop3.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 29 Aug 2017 15:16:01 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> Florian reported UDP xmit drops that could be root caused to the
> too small neigh limit.
> 
> Current limit is 64 KB, meaning that even a single UDP socket would hit
> it, since its default sk_sndbuf comes from net.core.wmem_default
> (~212992 bytes on 64bit arches).
> 
> Once ARP/ND resolution is in progress, we should allow a few more
> packets to be queued, at least for one producer.
> 
> Once neigh arp_queue is filled, a rogue socket should hit its sk_sndbuf
> limit and either block in sendmsg() or return -EAGAIN.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Florian Fainelli <f.fainelli@gmail.com>

Applied, thanks for following up on this.
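
The arithmetic behind the quoted commit message can be sketched with a small
simulation. This is a simplified model, not the kernel's neighbour code: it
assumes a single socket, a fixed per-packet truesize (2304 bytes is an
arbitrary value chosen for illustration), and it simply stops at the first
packet that would exceed the neighbour queue limit. The names queue_sim and
simulate_unresolved() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct queue_sim {
	size_t queued_pkts;	/* packets accepted by the neighbour queue */
	bool dropped;		/* true if the queue limit was hit first */
};

/* One socket keeps sending packets of `truesize` bytes while the neighbour
 * is unresolved. Sending stops either when the socket's send buffer is
 * exhausted (sendmsg() would block or return -EAGAIN) or when the next
 * packet would exceed the neighbour queue limit (a drop). */
struct queue_sim simulate_unresolved(size_t neigh_limit, size_t sndbuf,
				     size_t truesize)
{
	struct queue_sim r = { 0, false };
	size_t bytes = 0;

	while (bytes + truesize <= sndbuf) {
		if (bytes + truesize > neigh_limit) {
			r.dropped = true;
			break;
		}
		bytes += truesize;
		r.queued_pkts++;
	}
	return r;
}
```

With the old 64 KB limit (65536 bytes) and a 212992-byte send buffer, this
model drops after 28 packets. With the limit raised to 212992, the socket
runs out of send buffer after 92 packets without any drop, which matches the
behaviour the commit message aims for.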


* Re: [PATCH net] sch_htb: fix crash on init failure
From: Nikolay Aleksandrov @ 2017-08-29 23:11 UTC
  To: David Miller; +Cc: netdev, edumazet, jhs, xiyou.wangcong, jiri, roopa
In-Reply-To: <20170829.160625.1623439754611810483.davem@davemloft.net>

On 30.08.2017 02:06, David Miller wrote:
> 
> I expect that you will resubmit all of these similar fixes as a patch
> series after you have sorted everything out.
> 
> Correct?
> 

Yes, I will. There are a few more places that need fixing, I'll resubmit
them all as a set.


* Re: [PATCH] drivers: net: xgene: Correct probe sequence handling
From: David Miller @ 2017-08-29 23:13 UTC
  To: isubramanian; +Cc: netdev, linux-arm-kernel, dnelson, patches, qnguyen
In-Reply-To: <1504046592-24779-1-git-send-email-isubramanian@apm.com>

From: Iyappan Subramanian <isubramanian@apm.com>
Date: Tue, 29 Aug 2017 15:43:12 -0700

> From: Quan Nguyen <qnguyen@apm.com>
> 
> The phy is connected at an early stage of probe but is not properly
> disconnected if an error occurs.  This patch fixes the issue.
> 
> Also changing the return type of xgene_enet_check_phy_handle(),
> since this function always returns success.
> 
> Signed-off-by: Quan Nguyen <qnguyen@apm.com>
> Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>

Applied, thank you.


* Re: [PATCH net-next] neigh: increase queue_len_bytes to match wmem_default
From: David Miller @ 2017-08-29 23:17 UTC
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1504048528.11498.109.camel@edumazet-glaptop3.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 29 Aug 2017 16:15:28 -0700

> On Tue, 2017-08-29 at 15:16 -0700, Eric Dumazet wrote:
>> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>> index 568ccfd6dd371d88136ffabe5cfcc36f099786b6..7616cd76f6f6a62f395da897baef2c66c0098193 100644
>> --- a/net/ipv4/tcp_input.c
>> +++ b/net/ipv4/tcp_input.c
>> @@ -6086,9 +6086,9 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
>>         struct tcp_sock *tp = tcp_sk(sk);
>>         struct net *net = sock_net(sk);
>>         struct sock *fastopen_sk = NULL;
>> -       struct dst_entry *dst = NULL;
>>         struct request_sock *req;
>>         bool want_cookie = false;
>> +       struct dst_entry *dst;
>>         struct flowi fl;
>>  
>>         /* TW buckets are converted to open req
> 
> This part was meant to belong to a separate patch :/
> 
> No big deal, this was also one of your suggestions, David.

Yeah, no big deal.  But thanks for pointing it out.


* Re: [PATCH net-next] neigh: increase queue_len_bytes to match wmem_default
From: Eric Dumazet @ 2017-08-29 23:15 UTC
  To: David Miller; +Cc: netdev
In-Reply-To: <1504044961.11498.100.camel@edumazet-glaptop3.roam.corp.google.com>

On Tue, 2017-08-29 at 15:16 -0700, Eric Dumazet wrote:
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 568ccfd6dd371d88136ffabe5cfcc36f099786b6..7616cd76f6f6a62f395da897baef2c66c0098193 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -6086,9 +6086,9 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
>         struct tcp_sock *tp = tcp_sk(sk);
>         struct net *net = sock_net(sk);
>         struct sock *fastopen_sk = NULL;
> -       struct dst_entry *dst = NULL;
>         struct request_sock *req;
>         bool want_cookie = false;
> +       struct dst_entry *dst;
>         struct flowi fl;
>  
>         /* TW buckets are converted to open req

This part was meant to belong to a separate patch :/

No big deal, this was also one of your suggestions, David.


* [PATCH v2 net-next 0/6] flow_dissector: Protocol specific flow dissector offload
From: Tom Herbert @ 2017-08-29 23:27 UTC
  To: davem; +Cc: netdev, Tom Herbert

This patch set adds a new offload type to perform flow dissection for
specific protocols (either by EtherType or by IP protocol). This is
primarily useful to crack open UDP encapsulations (like VXLAN, GUE) for
the purposes of parsing the encapsulated packet.

Items in this patch set:
- Constify skb argument to UDP lookup functions
- Create new protocol case in __skb_flow_dissect for ETH_P_TEB. This is based
  on the code in the GRE dissect function and the special handling in
  GRE can now be removed (it sets protocol to ETH_P_TEB and returns so
  goto proto_again is done)
- Add infrastructure for protocol specific flow dissection offload
- Add infrastructure to perform UDP flow dissection. Uses the same model as
  GRO, where a flow_dissect callback can be associated with a UDP
  socket
- Use the infrastructure to support flow dissection of VXLAN and GUE

Tested:

Forced RPS to call flow dissection for VXLAN, FOU, and GUE. Observed
that inner packet was being properly dissected.

v2: Add Signed-off-by tags

Tom Herbert (6):
  flow_dissector: Move ETH_P_TEB processing to main switch
  udp: Constify skb argument in lookup functions
  flow_dissector: Add protocol specific flow dissection offload
  udp: flow dissector offload
  fou: Support flow dissection
  vxlan: support flow dissect

 drivers/net/vxlan.c          |  50 ++++++++++++
 include/linux/netdevice.h    |   7 ++
 include/linux/udp.h          |   8 ++
 include/net/flow_dissector.h |   9 +++
 include/net/ip.h             |   2 +-
 include/net/sock_reuseport.h |   2 +-
 include/net/udp.h            |  19 +++--
 include/net/udp_tunnel.h     |   8 ++
 net/core/dev.c               |  14 ++++
 net/core/flow_dissector.c    | 176 +++++++++++++++++++++++++++++--------------
 net/core/sock_reuseport.c    |   5 +-
 net/ipv4/fou.c               |  63 ++++++++++++++++
 net/ipv4/route.c             |   4 +-
 net/ipv4/udp.c               |  11 +--
 net/ipv4/udp_offload.c       |  45 +++++++++++
 net/ipv4/udp_tunnel.c        |   1 +
 net/ipv6/udp.c               |  10 +--
 net/ipv6/udp_offload.c       |  13 ++++
 18 files changed, 369 insertions(+), 78 deletions(-)

-- 
2.11.0


* [PATCH v2 net-next 1/6] flow_dissector: Move ETH_P_TEB processing to main switch
From: Tom Herbert @ 2017-08-29 23:27 UTC
  To: davem; +Cc: netdev, Tom Herbert
In-Reply-To: <20170829232711.1465-1-tom@quantonium.net>

Support for processing TEB is currently in GRE flow dissection as a
special case. This can be moved to be a case in the main proto switch in
__skb_flow_dissect.

Signed-off-by: Tom Herbert <tom@quantonium.net>
---
 net/core/flow_dissector.c | 44 +++++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 21 deletions(-)
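
The idea in the commit message above, treating Transparent Ethernet Bridging
as an ordinary protocol that strips one Ethernet header and re-enters the
protocol switch, can be sketched in userspace roughly as follows. This is an
illustration under assumptions: the helper name skip_teb() and the local
constants are hypothetical, protocol values are kept in host byte order for
readability (the kernel code compares against htons() values), and the
alignment-related hlen capping from the patch is left out.

```c
#include <stddef.h>
#include <stdint.h>

enum {
	ETH_HDR_LEN = 14,	/* dst MAC (6) + src MAC (6) + EtherType (2) */
	PROTO_TEB   = 0x6558,	/* Transparent Ethernet Bridging */
	PROTO_IPV4  = 0x0800,
};

/* While the current protocol is TEB, read the inner Ethernet header at
 * *nhoff, take its EtherType as the new protocol and advance *nhoff past
 * it. Looping mirrors the "goto proto_again" in the patch. Returns 0 on
 * success, -1 if the inner header is truncated. */
int skip_teb(const uint8_t *data, size_t hlen, size_t *nhoff, uint16_t *proto)
{
	while (*proto == PROTO_TEB) {
		const uint8_t *eth;

		if (*nhoff + ETH_HDR_LEN > hlen)
			return -1;
		eth = data + *nhoff;
		/* EtherType is the last two bytes, in network byte order. */
		*proto = (uint16_t)((eth[12] << 8) | eth[13]);
		*nhoff += ETH_HDR_LEN;
	}
	return 0;
}
```

Given a buffer that starts with an inner Ethernet header carrying EtherType
0x0800, calling skip_teb() with *proto set to PROTO_TEB and *nhoff set to 0
leaves *proto at PROTO_IPV4 and *nhoff at 14, ready for the IPv4 case.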

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index e2eaa1ff948d..12302acdb073 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -288,27 +288,8 @@ __skb_flow_dissect_gre(const struct sk_buff *skb,
 	if (hdr->flags & GRE_SEQ)
 		offset += sizeof(((struct pptp_gre_header *) 0)->seq);
 
-	if (gre_ver == 0) {
-		if (*p_proto == htons(ETH_P_TEB)) {
-			const struct ethhdr *eth;
-			struct ethhdr _eth;
-
-			eth = __skb_header_pointer(skb, *p_nhoff + offset,
-						   sizeof(_eth),
-						   data, *p_hlen, &_eth);
-			if (!eth)
-				return FLOW_DISSECT_RET_OUT_BAD;
-			*p_proto = eth->h_proto;
-			offset += sizeof(*eth);
-
-			/* Cap headers that we access via pointers at the
-			 * end of the Ethernet header as our maximum alignment
-			 * at that point is only 2 bytes.
-			 */
-			if (NET_IP_ALIGN)
-				*p_hlen = *p_nhoff + offset;
-		}
-	} else { /* version 1, must be PPTP */
+	/* version 1, must be PPTP */
+	if (gre_ver == 1) {
 		u8 _ppp_hdr[PPP_HDRLEN];
 		u8 *ppp_hdr;
 
@@ -573,6 +554,27 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 
 		break;
 	}
+	case htons(ETH_P_TEB): {
+		const struct ethhdr *eth;
+		struct ethhdr _eth;
+
+		eth = __skb_header_pointer(skb, nhoff, sizeof(_eth),
+					   data, hlen, &_eth);
+		if (!eth)
+			goto out_bad;
+
+		proto = eth->h_proto;
+		nhoff += sizeof(*eth);
+
+		/* Cap headers that we access via pointers at the
+		 * end of the Ethernet header as our maximum alignment
+		 * at that point is only 2 bytes.
+		 */
+		if (NET_IP_ALIGN)
+			hlen = nhoff;
+
+		goto proto_again;
+	}
 	case htons(ETH_P_8021AD):
 	case htons(ETH_P_8021Q): {
 		const struct vlan_hdr *vlan;
-- 
2.11.0


* [PATCH v2 net-next 2/6] udp: Constify skb argument in lookup functions
From: Tom Herbert @ 2017-08-29 23:27 UTC
  To: davem; +Cc: netdev, Tom Herbert
In-Reply-To: <20170829232711.1465-1-tom@quantonium.net>

For UDP socket lookup functions, and associated functions that take an
sk_buff as argument, declare the skb argument as constant.

One caveat is that reuseport_select_sock can be called from the UDP
lookup functions with an skb argument. This function temporarily
modifies the sk_buff data pointer (in run_bpf via a pull/push sequence).
To resolve the compiler warning, I added a local sk_buff pointer that is
not const and is assigned from the skb argument with an explicit cast.

Signed-off-by: Tom Herbert <tom@quantonium.net>
---
 include/net/ip.h             |  2 +-
 include/net/sock_reuseport.h |  2 +-
 include/net/udp.h            | 11 ++++++-----
 net/core/sock_reuseport.c    |  5 +++--
 net/ipv4/udp.c               | 11 ++++++-----
 net/ipv6/udp.c               | 10 +++++-----
 6 files changed, 22 insertions(+), 19 deletions(-)
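
The const-override pattern mentioned in the caveat above can be sketched in
plain C as follows. The struct toy_buf and the function
first_payload_byte() are hypothetical stand-ins, not kernel code. Note that
casting away const like this is only well defined when the underlying object
was not itself declared const, which is the assumption here, just as it is
for the sk_buff in the patch.

```c
#include <stddef.h>

struct toy_buf {
	const unsigned char *data;	/* current start of the packet view */
	size_t len;			/* bytes remaining from data */
};

/* The public signature promises not to change the buffer, but the body
 * needs to move the data pointer temporarily: pull the header, look at the
 * payload, then push the header back so the caller sees no change.
 * Returns the first payload byte, or -1 if there is no payload. */
int first_payload_byte(const struct toy_buf *_buf, size_t hdr_len)
{
	struct toy_buf *buf = (struct toy_buf *)_buf; /* Override const */
	int val;

	if (hdr_len >= buf->len)
		return -1;

	buf->data += hdr_len;		/* pull */
	buf->len -= hdr_len;
	val = buf->data[0];
	buf->data -= hdr_len;		/* push: restore the original view */
	buf->len += hdr_len;

	return val;
}
```

The trade-off is that the compiler can no longer enforce the "logically
const" promise inside the function, so the restore step has to be kept
correct by hand.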

diff --git a/include/net/ip.h b/include/net/ip.h
index 9896f46cbbf1..8c0d84ffc659 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -79,7 +79,7 @@ struct ipcm_cookie {
 #define PKTINFO_SKB_CB(skb) ((struct in_pktinfo *)((skb)->cb))
 
 /* return enslaved device index if relevant */
-static inline int inet_sdif(struct sk_buff *skb)
+static inline int inet_sdif(const struct sk_buff *skb)
 {
 #if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
 	if (skb && ipv4_l3mdev_skb(IPCB(skb)->flags))
diff --git a/include/net/sock_reuseport.h b/include/net/sock_reuseport.h
index aecd30308d50..d25352a848d9 100644
--- a/include/net/sock_reuseport.h
+++ b/include/net/sock_reuseport.h
@@ -20,7 +20,7 @@ extern int reuseport_add_sock(struct sock *sk, struct sock *sk2);
 extern void reuseport_detach_sock(struct sock *sk);
 extern struct sock *reuseport_select_sock(struct sock *sk,
 					  u32 hash,
-					  struct sk_buff *skb,
+					  const struct sk_buff *skb,
 					  int hdr_len);
 extern struct bpf_prog *reuseport_attach_prog(struct sock *sk,
 					      struct bpf_prog *prog);
diff --git a/include/net/udp.h b/include/net/udp.h
index 4e5f23fec35e..f3d1de6f0983 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -167,7 +167,7 @@ static inline void udp_csum_pull_header(struct sk_buff *skb)
 	UDP_SKB_CB(skb)->cscov -= sizeof(struct udphdr);
 }
 
-typedef struct sock *(*udp_lookup_t)(struct sk_buff *skb, __be16 sport,
+typedef struct sock *(*udp_lookup_t)(const struct sk_buff *skb, __be16 sport,
 				     __be16 dport);
 
 struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
@@ -288,8 +288,9 @@ struct sock *udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport,
 			     __be32 daddr, __be16 dport, int dif);
 struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport,
 			       __be32 daddr, __be16 dport, int dif, int sdif,
-			       struct udp_table *tbl, struct sk_buff *skb);
-struct sock *udp4_lib_lookup_skb(struct sk_buff *skb,
+			       struct udp_table *tbl,
+			       const struct sk_buff *skb);
+struct sock *udp4_lib_lookup_skb(const struct sk_buff *skb,
 				 __be16 sport, __be16 dport);
 struct sock *udp6_lib_lookup(struct net *net,
 			     const struct in6_addr *saddr, __be16 sport,
@@ -299,8 +300,8 @@ struct sock *__udp6_lib_lookup(struct net *net,
 			       const struct in6_addr *saddr, __be16 sport,
 			       const struct in6_addr *daddr, __be16 dport,
 			       int dif, int sdif, struct udp_table *tbl,
-			       struct sk_buff *skb);
-struct sock *udp6_lib_lookup_skb(struct sk_buff *skb,
+			       const struct sk_buff *skb);
+struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
 				 __be16 sport, __be16 dport);
 
 /* UDP uses skb->dev_scratch to cache as much information as possible and avoid
diff --git a/net/core/sock_reuseport.c b/net/core/sock_reuseport.c
index eed1ebf7f29d..a17f13b33189 100644
--- a/net/core/sock_reuseport.c
+++ b/net/core/sock_reuseport.c
@@ -164,9 +164,10 @@ void reuseport_detach_sock(struct sock *sk)
 EXPORT_SYMBOL(reuseport_detach_sock);
 
 static struct sock *run_bpf(struct sock_reuseport *reuse, u16 socks,
-			    struct bpf_prog *prog, struct sk_buff *skb,
+			    struct bpf_prog *prog, const struct sk_buff *_skb,
 			    int hdr_len)
 {
+	struct sk_buff *skb = (struct sk_buff *)_skb; /* Override const */
 	struct sk_buff *nskb = NULL;
 	u32 index;
 
@@ -205,7 +206,7 @@ static struct sock *run_bpf(struct sock_reuseport *reuse, u16 socks,
  */
 struct sock *reuseport_select_sock(struct sock *sk,
 				   u32 hash,
-				   struct sk_buff *skb,
+				   const struct sk_buff *skb,
 				   int hdr_len)
 {
 	struct sock_reuseport *reuse;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index bf6c406bf5e7..a851026ef28b 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -135,7 +135,8 @@ EXPORT_SYMBOL(udp_memory_allocated);
 #define PORTS_PER_CHAIN (MAX_UDP_PORTS / UDP_HTABLE_SIZE_MIN)
 
 /* IPCB reference means this can not be used from early demux */
-static bool udp_lib_exact_dif_match(struct net *net, struct sk_buff *skb)
+static bool udp_lib_exact_dif_match(struct net *net,
+				    const struct sk_buff *skb)
 {
 #if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
 	if (!net->ipv4.sysctl_udp_l3mdev_accept &&
@@ -445,7 +446,7 @@ static struct sock *udp4_lib_lookup2(struct net *net,
 				     __be32 daddr, unsigned int hnum,
 				     int dif, int sdif, bool exact_dif,
 				     struct udp_hslot *hslot2,
-				     struct sk_buff *skb)
+				     const struct sk_buff *skb)
 {
 	struct sock *sk, *result;
 	int score, badness, matches = 0, reuseport = 0;
@@ -484,7 +485,7 @@ static struct sock *udp4_lib_lookup2(struct net *net,
  */
 struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr,
 		__be16 sport, __be32 daddr, __be16 dport, int dif,
-		int sdif, struct udp_table *udptable, struct sk_buff *skb)
+		int sdif, struct udp_table *udptable, const struct sk_buff *skb)
 {
 	struct sock *sk, *result;
 	unsigned short hnum = ntohs(dport);
@@ -552,7 +553,7 @@ struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr,
 }
 EXPORT_SYMBOL_GPL(__udp4_lib_lookup);
 
-static inline struct sock *__udp4_lib_lookup_skb(struct sk_buff *skb,
+static inline struct sock *__udp4_lib_lookup_skb(const struct sk_buff *skb,
 						 __be16 sport, __be16 dport,
 						 struct udp_table *udptable)
 {
@@ -563,7 +564,7 @@ static inline struct sock *__udp4_lib_lookup_skb(struct sk_buff *skb,
 				 inet_sdif(skb), udptable, skb);
 }
 
-struct sock *udp4_lib_lookup_skb(struct sk_buff *skb,
+struct sock *udp4_lib_lookup_skb(const struct sk_buff *skb,
 				 __be16 sport, __be16 dport)
 {
 	return __udp4_lib_lookup_skb(skb, sport, dport, &udp_table);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 976f30391356..e9aa4db3ba53 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -56,7 +56,7 @@
 #include <trace/events/skb.h>
 #include "udp_impl.h"
 
-static bool udp6_lib_exact_dif_match(struct net *net, struct sk_buff *skb)
+static bool udp6_lib_exact_dif_match(struct net *net, const struct sk_buff *skb)
 {
 #if defined(CONFIG_NET_L3_MASTER_DEV)
 	if (!net->ipv4.sysctl_udp_l3mdev_accept &&
@@ -181,7 +181,7 @@ static struct sock *udp6_lib_lookup2(struct net *net,
 		const struct in6_addr *saddr, __be16 sport,
 		const struct in6_addr *daddr, unsigned int hnum,
 		int dif, int sdif, bool exact_dif,
-		struct udp_hslot *hslot2, struct sk_buff *skb)
+		struct udp_hslot *hslot2, const struct sk_buff *skb)
 {
 	struct sock *sk, *result;
 	int score, badness, matches = 0, reuseport = 0;
@@ -221,7 +221,7 @@ struct sock *__udp6_lib_lookup(struct net *net,
 			       const struct in6_addr *saddr, __be16 sport,
 			       const struct in6_addr *daddr, __be16 dport,
 			       int dif, int sdif, struct udp_table *udptable,
-			       struct sk_buff *skb)
+			       const struct sk_buff *skb)
 {
 	struct sock *sk, *result;
 	unsigned short hnum = ntohs(dport);
@@ -290,7 +290,7 @@ struct sock *__udp6_lib_lookup(struct net *net,
 }
 EXPORT_SYMBOL_GPL(__udp6_lib_lookup);
 
-static struct sock *__udp6_lib_lookup_skb(struct sk_buff *skb,
+static struct sock *__udp6_lib_lookup_skb(const struct sk_buff *skb,
 					  __be16 sport, __be16 dport,
 					  struct udp_table *udptable)
 {
@@ -301,7 +301,7 @@ static struct sock *__udp6_lib_lookup_skb(struct sk_buff *skb,
 				 inet6_sdif(skb), udptable, skb);
 }
 
-struct sock *udp6_lib_lookup_skb(struct sk_buff *skb,
+struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
 				 __be16 sport, __be16 dport)
 {
 	const struct ipv6hdr *iph = ipv6_hdr(skb);
-- 
2.11.0


* [PATCH v2 net-next 3/6] flow_dissector: Add protocol specific flow dissection offload
From: Tom Herbert @ 2017-08-29 23:27 UTC
  To: davem; +Cc: netdev, Tom Herbert
In-Reply-To: <20170829232711.1465-1-tom@quantonium.net>

Add offload capability for performing protocol specific flow dissection
(either by EtherType or IP protocol).

Specifically:

- Add flow_dissect to offload callbacks
- Move flow_dissect_ret enum to flow_dissector.h, clean up names and add a
  couple of values
- Create GOTO_BY_RESULT macro to use in the main flow dissector switch to
  simplify handling of functions that return flow_dissect_ret enum
- In __skb_flow_dissect, add default case for switch(proto) as well as
  switch(ip_proto) that looks up and calls protocol specific flow
  dissection

Signed-off-by: Tom Herbert <tom@quantonium.net>
---
 include/linux/netdevice.h    |   7 +++
 include/net/flow_dissector.h |   9 +++
 net/core/dev.c               |  14 +++++
 net/core/flow_dissector.c    | 132 +++++++++++++++++++++++++++++++------------
 net/ipv4/route.c             |   4 +-
 5 files changed, 128 insertions(+), 38 deletions(-)
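
The control-flow pattern behind the GOTO_BY_RESULT macro described in the
commit message can be tried out in userspace with a toy dissector. The sketch
below reuses the enum names and the macro shape from the patch, but the
one-byte layer tags (TOY_*), toy_skip() and toy_dissect() are hypothetical
and exist only to show how a handler's return value selects the label to
jump to.

```c
#include <stddef.h>

enum flow_dissect_ret {
	FLOW_DISSECT_RET_OUT_GOOD,
	FLOW_DISSECT_RET_OUT_BAD,
	FLOW_DISSECT_RET_PROTO_AGAIN,
	FLOW_DISSECT_RET_IPPROTO_AGAIN,
};

/* Toy layer tags, one byte per layer (hypothetical, not real protocols). */
enum { TOY_TUNNEL = 1, TOY_IP = 2, TOY_EXT = 3, TOY_L4 = 4 };

#define GOTO_BY_RESULT(ret) do {			\
	switch (ret) {					\
	case FLOW_DISSECT_RET_OUT_GOOD:			\
		goto out_good;				\
	case FLOW_DISSECT_RET_PROTO_AGAIN:		\
		goto proto_again;			\
	case FLOW_DISSECT_RET_IPPROTO_AGAIN:		\
		goto ip_proto_again;			\
	case FLOW_DISSECT_RET_OUT_BAD:			\
	default:					\
		goto out_bad;				\
	}						\
} while (0)

/* Consume one wrapper byte and ask the caller to re-enter at `again`,
 * or report a bad packet if nothing follows the wrapper. */
enum flow_dissect_ret toy_skip(size_t len, size_t *off,
			       enum flow_dissect_ret again)
{
	(*off)++;
	return *off < len ? again : FLOW_DISSECT_RET_OUT_BAD;
}

/* Returns the number of layers visited, or -1 for a bad packet. */
int toy_dissect(const unsigned char *pkt, size_t len)
{
	size_t off = 0;
	int depth = 0;

proto_again:
	if (off >= len)
		goto out_bad;
	depth++;
	switch (pkt[off]) {
	case TOY_TUNNEL:	/* encapsulation: another L3 tag follows */
		GOTO_BY_RESULT(toy_skip(len, &off,
					FLOW_DISSECT_RET_PROTO_AGAIN));
	case TOY_IP:
		off++;
		break;
	default:
		goto out_bad;
	}

ip_proto_again:
	if (off >= len)
		goto out_bad;
	depth++;
	switch (pkt[off]) {
	case TOY_EXT:		/* extension header: another L4 tag follows */
		GOTO_BY_RESULT(toy_skip(len, &off,
					FLOW_DISSECT_RET_IPPROTO_AGAIN));
	case TOY_L4:
		goto out_good;
	default:
		goto out_bad;
	}

out_good:
	return depth;
out_bad:
	return -1;
}
```

Because every arm of the macro ends in a goto, a case that invokes it never
falls through into the next case, which is what lets the patch drop the
repeated open-coded switch statements.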

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c5475b37a631..90ccb434e127 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2208,6 +2208,12 @@ struct offload_callbacks {
 	struct sk_buff		**(*gro_receive)(struct sk_buff **head,
 						 struct sk_buff *skb);
 	int			(*gro_complete)(struct sk_buff *skb, int nhoff);
+	enum flow_dissect_ret (*flow_dissect)(const struct sk_buff *skb,
+			struct flow_dissector_key_control *key_control,
+			struct flow_dissector *flow_dissector,
+			void *target_container, void *data,
+			__be16 *p_proto, u8 *p_ip_proto, int *p_nhoff,
+			int *p_hlen, unsigned int flags);
 };
 
 struct packet_offload {
@@ -3253,6 +3259,7 @@ struct sk_buff *napi_get_frags(struct napi_struct *napi);
 gro_result_t napi_gro_frags(struct napi_struct *napi);
 struct packet_offload *gro_find_receive_by_type(__be16 type);
 struct packet_offload *gro_find_complete_by_type(__be16 type);
+struct packet_offload *flow_dissect_find_by_type(__be16 type);
 
 static inline void napi_free_frags(struct napi_struct *napi)
 {
diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index e2663e900b0a..ad75bbfd1c9c 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -19,6 +19,14 @@ struct flow_dissector_key_control {
 #define FLOW_DIS_FIRST_FRAG	BIT(1)
 #define FLOW_DIS_ENCAPSULATION	BIT(2)
 
+enum flow_dissect_ret {
+	FLOW_DISSECT_RET_OUT_GOOD,
+	FLOW_DISSECT_RET_OUT_BAD,
+	FLOW_DISSECT_RET_PROTO_AGAIN,
+	FLOW_DISSECT_RET_IPPROTO_AGAIN,
+	FLOW_DISSECT_RET_CONTINUE,
+};
+
 /**
  * struct flow_dissector_key_basic:
  * @thoff: Transport header offset
@@ -205,6 +213,7 @@ enum flow_dissector_key_id {
 #define FLOW_DISSECTOR_F_STOP_AT_L3		BIT(1)
 #define FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL	BIT(2)
 #define FLOW_DISSECTOR_F_STOP_AT_ENCAP		BIT(3)
+#define FLOW_DISSECTOR_F_STOP_AT_L4		BIT(4)
 
 struct flow_dissector_key {
 	enum flow_dissector_key_id key_id;
diff --git a/net/core/dev.c b/net/core/dev.c
index 270b54754821..22ea8daa930c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4860,6 +4860,20 @@ struct packet_offload *gro_find_receive_by_type(__be16 type)
 }
 EXPORT_SYMBOL(gro_find_receive_by_type);
 
+struct packet_offload *flow_dissect_find_by_type(__be16 type)
+{
+	struct list_head *offload_head = &offload_base;
+	struct packet_offload *ptype;
+
+	list_for_each_entry_rcu(ptype, offload_head, list) {
+		if (ptype->type != type || !ptype->callbacks.flow_dissect)
+			continue;
+		return ptype;
+	}
+	return NULL;
+}
+EXPORT_SYMBOL(flow_dissect_find_by_type);
+
 struct packet_offload *gro_find_complete_by_type(__be16 type)
 {
 	struct list_head *offload_head = &offload_base;
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 12302acdb073..6a2cf240069a 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -9,6 +9,7 @@
 #include <net/ipv6.h>
 #include <net/gre.h>
 #include <net/pptp.h>
+#include <net/protocol.h>
 #include <linux/igmp.h>
 #include <linux/icmp.h>
 #include <linux/sctp.h>
@@ -115,12 +116,6 @@ __be32 __skb_flow_get_ports(const struct sk_buff *skb, int thoff, u8 ip_proto,
 }
 EXPORT_SYMBOL(__skb_flow_get_ports);
 
-enum flow_dissect_ret {
-	FLOW_DISSECT_RET_OUT_GOOD,
-	FLOW_DISSECT_RET_OUT_BAD,
-	FLOW_DISSECT_RET_OUT_PROTO_AGAIN,
-};
-
 static enum flow_dissect_ret
 __skb_flow_dissect_mpls(const struct sk_buff *skb,
 			struct flow_dissector *flow_dissector,
@@ -322,7 +317,7 @@ __skb_flow_dissect_gre(const struct sk_buff *skb,
 	if (flags & FLOW_DISSECTOR_F_STOP_AT_ENCAP)
 		return FLOW_DISSECT_RET_OUT_GOOD;
 
-	return FLOW_DISSECT_RET_OUT_PROTO_AGAIN;
+	return FLOW_DISSECT_RET_PROTO_AGAIN;
 }
 
 static void
@@ -383,6 +378,27 @@ __skb_flow_dissect_ipv6(const struct sk_buff *skb,
 	key_ip->ttl = iph->hop_limit;
 }
 
+#define GOTO_BY_RESULT(ret) do {				\
+	switch (ret) {						\
+	case FLOW_DISSECT_RET_OUT_GOOD:				\
+		goto out_good;					\
+	case FLOW_DISSECT_RET_PROTO_AGAIN:			\
+		goto proto_again;				\
+	case FLOW_DISSECT_RET_IPPROTO_AGAIN:			\
+		goto ip_proto_again;				\
+	case FLOW_DISSECT_RET_OUT_BAD:				\
+	default:						\
+		goto out_bad;					\
+	}							\
+} while (0)
+
+#define GOTO_OR_CONT_BY_RESULT(ret) do {			\
+	enum flow_dissect_ret __ret = (ret);			\
+								\
+	if (__ret != FLOW_DISSECT_RET_CONTINUE)			\
+		GOTO_BY_RESULT(__ret);				\
+} while (0)
+
 /**
  * __skb_flow_dissect - extract the flow_keys struct and return it
  * @skb: sk_buff to extract the flow from, can be NULL if the rest are specified
@@ -659,15 +675,10 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 	case htons(ETH_P_MPLS_UC):
 	case htons(ETH_P_MPLS_MC):
 mpls:
-		switch (__skb_flow_dissect_mpls(skb, flow_dissector,
-						target_container, data,
-						nhoff, hlen)) {
-		case FLOW_DISSECT_RET_OUT_GOOD:
-			goto out_good;
-		case FLOW_DISSECT_RET_OUT_BAD:
-		default:
-			goto out_bad;
-		}
+		GOTO_BY_RESULT(__skb_flow_dissect_mpls(skb, flow_dissector,
+						       target_container, data,
+						       nhoff, hlen));
+
 	case htons(ETH_P_FCOE):
 		if ((hlen - nhoff) < FCOE_HEADER_LEN)
 			goto out_bad;
@@ -677,32 +688,44 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 
 	case htons(ETH_P_ARP):
 	case htons(ETH_P_RARP):
-		switch (__skb_flow_dissect_arp(skb, flow_dissector,
-					       target_container, data,
-					       nhoff, hlen)) {
-		case FLOW_DISSECT_RET_OUT_GOOD:
-			goto out_good;
-		case FLOW_DISSECT_RET_OUT_BAD:
-		default:
-			goto out_bad;
+		GOTO_BY_RESULT(__skb_flow_dissect_arp(skb, flow_dissector,
+						      target_container, data,
+						      nhoff, hlen));
+
+	default: {
+		struct packet_offload *ptype;
+		enum flow_dissect_ret ret;
+
+		rcu_read_lock();
+
+		ptype = flow_dissect_find_by_type(proto);
+
+		if (ptype) {
+			ret = ptype->callbacks.flow_dissect(skb, key_control,
+						flow_dissector,
+						target_container,
+						data, &proto, &ip_proto, &nhoff,
+						&hlen, flags);
+			rcu_read_unlock();
+
+			GOTO_BY_RESULT(ret);
+		} else {
+			rcu_read_unlock();
 		}
-	default:
+
 		goto out_bad;
 	}
+	}
 
 ip_proto_again:
 	switch (ip_proto) {
 	case IPPROTO_GRE:
-		switch (__skb_flow_dissect_gre(skb, key_control, flow_dissector,
-					       target_container, data,
-					       &proto, &nhoff, &hlen, flags)) {
-		case FLOW_DISSECT_RET_OUT_GOOD:
-			goto out_good;
-		case FLOW_DISSECT_RET_OUT_BAD:
-			goto out_bad;
-		case FLOW_DISSECT_RET_OUT_PROTO_AGAIN:
-			goto proto_again;
-		}
+		GOTO_BY_RESULT(__skb_flow_dissect_gre(skb, key_control,
+						      flow_dissector,
+						      target_container, data,
+						      &proto, &nhoff, &hlen,
+						      flags));
+
 	case NEXTHDR_HOP:
 	case NEXTHDR_ROUTING:
 	case NEXTHDR_DEST: {
@@ -768,9 +791,43 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 		__skb_flow_dissect_tcp(skb, flow_dissector, target_container,
 				       data, nhoff, hlen);
 		break;
-	default:
+	default: {
+		const struct net_offload *ops = NULL;
+
+		if (flags & FLOW_DISSECTOR_F_STOP_AT_L4)
+			break;
+
+		rcu_read_lock();
+
+		switch (proto) {
+		case htons(ETH_P_IP):
+			ops = rcu_dereference(inet_offloads[ip_proto]);
+			break;
+		case htons(ETH_P_IPV6):
+			ops = rcu_dereference(inet6_offloads[ip_proto]);
+			break;
+		default:
+			break;
+		}
+
+		if (ops && ops->callbacks.flow_dissect) {
+			enum flow_dissect_ret ret;
+
+			ret = ops->callbacks.flow_dissect(skb, key_control,
+						flow_dissector,
+						target_container,
+						data, &proto, &ip_proto, &nhoff,
+						&hlen, flags);
+			rcu_read_unlock();
+
+			GOTO_OR_CONT_BY_RESULT(ret);
+		} else {
+			rcu_read_unlock();
+		}
+
 		break;
 	}
+	}
 
 	if (dissector_uses_key(flow_dissector,
 			       FLOW_DISSECTOR_KEY_PORTS)) {
@@ -935,7 +992,8 @@ static inline u32 ___skb_get_hash(const struct sk_buff *skb,
 				  struct flow_keys *keys, u32 keyval)
 {
 	skb_flow_dissect_flow_keys(skb, keys,
-				   FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL);
+				   FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL |
+				   FLOW_DISSECTOR_F_STOP_AT_L4);
 
 	return __flow_hash_from_keys(keys, keyval);
 }
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 94d4cd2d5ea4..85f12b8e0b7f 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1811,7 +1811,8 @@ int fib_multipath_hash(const struct fib_info *fi, const struct flowi4 *fl4,
 	case 1:
 		/* skb is currently provided only when forwarding */
 		if (skb) {
-			unsigned int flag = FLOW_DISSECTOR_F_STOP_AT_ENCAP;
+			unsigned int flag = FLOW_DISSECTOR_F_STOP_AT_ENCAP |
+					    FLOW_DISSECTOR_F_STOP_AT_L4;
 			struct flow_keys keys;
 
 			/* short-circuit if we already have L4 hash present */
-- 
2.11.0

^ permalink raw reply related
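
The hunks above replace repeated switch statements with macros that jump according to a dissector's return code. The macro definitions are not part of this portion of the thread, so the following is only a userspace sketch of the idea, assuming a strict variant that treats anything unexpected as a failure and a lenient variant that falls through on CONTINUE. The enum names echo FLOW_DISSECT_RET_* from the patch; the label enum and the function names are hypothetical.

```c
#include <assert.h>

/* Return codes named after the FLOW_DISSECT_RET_* values in the patch. */
enum sketch_ret {
	SK_RET_OUT_GOOD,
	SK_RET_OUT_BAD,
	SK_RET_PROTO_AGAIN,
	SK_RET_IPPROTO_AGAIN,
	SK_RET_CONTINUE,
};

/* Hypothetical stand-ins for the goto labels in __skb_flow_dissect(). */
enum sketch_label {
	SK_OUT_GOOD,
	SK_OUT_BAD,
	SK_PROTO_AGAIN,
	SK_IP_PROTO_AGAIN,
	SK_FALL_THROUGH,	/* keep going after the switch */
};

/* Strict variant (assumed): anything that is not an explicit jump is bad. */
enum sketch_label sketch_goto_by_result(enum sketch_ret ret)
{
	switch (ret) {
	case SK_RET_OUT_GOOD:
		return SK_OUT_GOOD;
	case SK_RET_PROTO_AGAIN:
		return SK_PROTO_AGAIN;
	case SK_RET_IPPROTO_AGAIN:
		return SK_IP_PROTO_AGAIN;
	case SK_RET_OUT_BAD:
	default:
		return SK_OUT_BAD;
	}
}

/* Lenient variant (assumed): CONTINUE falls through instead of failing. */
enum sketch_label sketch_goto_or_cont_by_result(enum sketch_ret ret)
{
	if (ret == SK_RET_CONTINUE)
		return SK_FALL_THROUGH;
	return sketch_goto_by_result(ret);
}

/* Small self-check showing the difference between the two variants. */
void sketch_goto_selfcheck(void)
{
	assert(sketch_goto_by_result(SK_RET_CONTINUE) == SK_OUT_BAD);
	assert(sketch_goto_or_cont_by_result(SK_RET_CONTINUE) == SK_FALL_THROUGH);
}
```

Returning a label value instead of jumping keeps the sketch testable outside the kernel; the real code jumps directly to out_good, out_bad, proto_again and so on.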

* [PATCH v2 net-next 4/6] udp: flow dissector offload
From: Tom Herbert @ 2017-08-29 23:27 UTC (permalink / raw)
  To: davem; +Cc: netdev, Tom Herbert
In-Reply-To: <20170829232711.1465-1-tom@quantonium.net>

Add support for UDP-specific flow dissection. This is primarily
intended for dissecting packets that are encapsulated in UDP.

This patch adds a flow_dissect offload for UDP4 and UDP6. The backend
function performs a socket lookup and calls the flow_dissect function
if a socket is found.

Signed-off-by: Tom Herbert <tom@quantonium.net>
---
 include/linux/udp.h      |  8 ++++++++
 include/net/udp.h        |  8 ++++++++
 include/net/udp_tunnel.h |  8 ++++++++
 net/ipv4/udp_offload.c   | 45 +++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/udp_tunnel.c    |  1 +
 net/ipv6/udp_offload.c   | 13 +++++++++++++
 6 files changed, 83 insertions(+)

diff --git a/include/linux/udp.h b/include/linux/udp.h
index eaea63bc79bb..2e90b189ef6a 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -79,6 +79,14 @@ struct udp_sock {
 	int			(*gro_complete)(struct sock *sk,
 						struct sk_buff *skb,
 						int nhoff);
+	/* Flow dissector function for a UDP socket */
+	enum flow_dissect_ret (*flow_dissect)(struct sock *sk,
+			const struct sk_buff *skb,
+			struct flow_dissector_key_control *key_control,
+			struct flow_dissector *flow_dissector,
+			void *target_container, void *data,
+			__be16 *p_proto, u8 *p_ip_proto, int *p_nhoff,
+			int *p_hlen, unsigned int flags);
 
 	/* udp_recvmsg try to use this before splicing sk_receive_queue */
 	struct sk_buff_head	reader_queue ____cacheline_aligned_in_smp;
diff --git a/include/net/udp.h b/include/net/udp.h
index f3d1de6f0983..499e4faf8b14 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -174,6 +174,14 @@ struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
 				 struct udphdr *uh, udp_lookup_t lookup);
 int udp_gro_complete(struct sk_buff *skb, int nhoff, udp_lookup_t lookup);
 
+enum flow_dissect_ret udp_flow_dissect(const struct sk_buff *skb,
+			udp_lookup_t lookup,
+			struct flow_dissector_key_control *key_control,
+			struct flow_dissector *flow_dissector,
+			void *target_container, void *data,
+			__be16 *p_proto, u8 *p_ip_proto, int *p_nhoff,
+			int *p_hlen, unsigned int flags);
+
 static inline struct udphdr *udp_gro_udphdr(struct sk_buff *skb)
 {
 	struct udphdr *uh;
diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
index 10cce0dd4450..b7102e0f41a9 100644
--- a/include/net/udp_tunnel.h
+++ b/include/net/udp_tunnel.h
@@ -69,6 +69,13 @@ typedef struct sk_buff **(*udp_tunnel_gro_receive_t)(struct sock *sk,
 						     struct sk_buff *skb);
 typedef int (*udp_tunnel_gro_complete_t)(struct sock *sk, struct sk_buff *skb,
 					 int nhoff);
+typedef enum flow_dissect_ret (*udp_tunnel_flow_dissect_t)(struct sock *sk,
+			const struct sk_buff *skb,
+			struct flow_dissector_key_control *key_control,
+			struct flow_dissector *flow_dissector,
+			void *target_container, void *data,
+			__be16 *p_proto, u8 *p_ip_proto, int *p_nhoff,
+			int *p_hlen, unsigned int flags);
 
 struct udp_tunnel_sock_cfg {
 	void *sk_user_data;     /* user data used by encap_rcv call back */
@@ -78,6 +85,7 @@ struct udp_tunnel_sock_cfg {
 	udp_tunnel_encap_destroy_t encap_destroy;
 	udp_tunnel_gro_receive_t gro_receive;
 	udp_tunnel_gro_complete_t gro_complete;
+	udp_tunnel_flow_dissect_t flow_dissect;
 };
 
 /* Setup the given (UDP) sock to receive UDP encapsulated packets */
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 97658bfc1b58..7f0a7ed4a6f7 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -328,11 +328,56 @@ static int udp4_gro_complete(struct sk_buff *skb, int nhoff)
 	return udp_gro_complete(skb, nhoff, udp4_lib_lookup_skb);
 }
 
+enum flow_dissect_ret udp_flow_dissect(const struct sk_buff *skb,
+			udp_lookup_t lookup,
+			struct flow_dissector_key_control *key_control,
+			struct flow_dissector *flow_dissector,
+			void *target_container, void *data,
+			__be16 *p_proto, u8 *p_ip_proto, int *p_nhoff,
+			int *p_hlen, unsigned int flags)
+{
+	enum flow_dissect_ret ret = FLOW_DISSECT_RET_CONTINUE;
+	struct udphdr *uh, _uh;
+	struct sock *sk;
+
+	uh = __skb_header_pointer(skb, *p_nhoff, sizeof(_uh), data,
+				  *p_hlen, &_uh);
+	if (!uh)
+		return FLOW_DISSECT_RET_OUT_BAD;
+
+	rcu_read_lock();
+
+	sk = (*lookup)(skb, uh->source, uh->dest);
+
+	if (sk && udp_sk(sk)->flow_dissect)
+		ret = udp_sk(sk)->flow_dissect(sk, skb, key_control,
+					       flow_dissector, target_container,
+					       data, p_proto, p_ip_proto,
+					       p_nhoff, p_hlen, flags);
+	rcu_read_unlock();
+
+	return ret;
+}
+EXPORT_SYMBOL(udp_flow_dissect);
+
+static enum flow_dissect_ret udp4_flow_dissect(const struct sk_buff *skb,
+			struct flow_dissector_key_control *key_control,
+			struct flow_dissector *flow_dissector,
+			void *target_container, void *data,
+			__be16 *p_proto, u8 *p_ip_proto, int *p_nhoff,
+			int *p_hlen, unsigned int flags)
+{
+	return udp_flow_dissect(skb, udp4_lib_lookup_skb, key_control,
+				flow_dissector, target_container, data,
+				p_proto, p_ip_proto, p_nhoff, p_hlen, flags);
+}
+
 static const struct net_offload udpv4_offload = {
 	.callbacks = {
 		.gso_segment = udp4_tunnel_segment,
 		.gro_receive  =	udp4_gro_receive,
 		.gro_complete =	udp4_gro_complete,
+		.flow_dissect = udp4_flow_dissect,
 	},
 };
 
diff --git a/net/ipv4/udp_tunnel.c b/net/ipv4/udp_tunnel.c
index 6539ff15e9a3..a4eec2a044d2 100644
--- a/net/ipv4/udp_tunnel.c
+++ b/net/ipv4/udp_tunnel.c
@@ -71,6 +71,7 @@ void setup_udp_tunnel_sock(struct net *net, struct socket *sock,
 	udp_sk(sk)->encap_destroy = cfg->encap_destroy;
 	udp_sk(sk)->gro_receive = cfg->gro_receive;
 	udp_sk(sk)->gro_complete = cfg->gro_complete;
+	udp_sk(sk)->flow_dissect = cfg->flow_dissect;
 
 	udp_tunnel_encap_enable(sock);
 }
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index 455fd4e39333..99ade504eaf7 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -73,11 +73,24 @@ static int udp6_gro_complete(struct sk_buff *skb, int nhoff)
 	return udp_gro_complete(skb, nhoff, udp6_lib_lookup_skb);
 }
 
+static enum flow_dissect_ret udp6_flow_dissect(const struct sk_buff *skb,
+			struct flow_dissector_key_control *key_control,
+			struct flow_dissector *flow_dissector,
+			void *target_container, void *data,
+			__be16 *p_proto, u8 *p_ip_proto, int *p_nhoff,
+			int *p_hlen, unsigned int flags)
+{
+	return udp_flow_dissect(skb, udp6_lib_lookup_skb, key_control,
+				flow_dissector, target_container, data,
+				p_proto, p_ip_proto, p_nhoff, p_hlen, flags);
+}
+
 static const struct net_offload udpv6_offload = {
 	.callbacks = {
 		.gso_segment	=	udp6_tunnel_segment,
 		.gro_receive	=	udp6_gro_receive,
 		.gro_complete	=	udp6_gro_complete,
+		.flow_dissect	=	udp6_flow_dissect,
 	},
 };
 
-- 
2.11.0

^ permalink raw reply related
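
The patch above looks up the UDP socket for a packet and, if that socket registered a flow_dissect callback, hands dissection over to it; otherwise it reports CONTINUE. A minimal userspace model of that dispatch might look like the following. The socket table, port numbers, and all udpm_* names are hypothetical and stand in for the kernel's socket lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum udpm_ret { UDPM_OUT_BAD, UDPM_IPPROTO_AGAIN, UDPM_CONTINUE };

/* Model of a UDP socket that may carry a per-socket dissect callback. */
struct udpm_sock {
	uint16_t port;	/* bound destination port, host order */
	enum udpm_ret (*flow_dissect)(const struct udpm_sock *sk, int *nhoff);
};

/* Hypothetical tunnel callback: skip the 8-byte UDP header and re-dissect. */
static enum udpm_ret udpm_tunnel_dissect(const struct udpm_sock *sk, int *nhoff)
{
	(void)sk;
	*nhoff += 8;
	return UDPM_IPPROTO_AGAIN;
}

/* One tunnel socket with a callback, one plain socket without. */
static const struct udpm_sock udpm_socks[] = {
	{ .port = 5555, .flow_dissect = udpm_tunnel_dissect },
	{ .port = 53,   .flow_dissect = NULL },
};

static const struct udpm_sock *udpm_lookup(uint16_t dport)
{
	for (size_t i = 0; i < sizeof(udpm_socks) / sizeof(udpm_socks[0]); i++)
		if (udpm_socks[i].port == dport)
			return &udpm_socks[i];
	return NULL;
}

/* Dispatch: only a socket with a callback changes the dissection state. */
enum udpm_ret udpm_flow_dissect(uint16_t dport, int *nhoff)
{
	const struct udpm_sock *sk = udpm_lookup(dport);

	assert(nhoff != NULL);
	if (sk && sk->flow_dissect)
		return sk->flow_dissect(sk, nhoff);
	return UDPM_CONTINUE;
}
```

A packet to port 5555 advances the offset and asks for another pass; a packet to port 53 or to an unbound port leaves the offset untouched.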

* [PATCH v2 net-next 5/6] fou: Support flow dissection
From: Tom Herbert @ 2017-08-29 23:27 UTC (permalink / raw)
  To: davem; +Cc: netdev, Tom Herbert
In-Reply-To: <20170829232711.1465-1-tom@quantonium.net>

Populate offload flow_dissect callback appropriately for fou and gue.

Signed-off-by: Tom Herbert <tom@quantonium.net>
---
 net/ipv4/fou.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
index 1540db65241a..a831dd49fb28 100644
--- a/net/ipv4/fou.c
+++ b/net/ipv4/fou.c
@@ -282,6 +282,20 @@ static int fou_gro_complete(struct sock *sk, struct sk_buff *skb,
 	return err;
 }
 
+static enum flow_dissect_ret fou_flow_dissect(struct sock *sk,
+			const struct sk_buff *skb,
+			struct flow_dissector_key_control *key_control,
+			struct flow_dissector *flow_dissector,
+			void *target_container, void *data,
+			__be16 *p_proto, u8 *p_ip_proto, int *p_nhoff,
+			int *p_hlen, unsigned int flags)
+{
+	*p_ip_proto = fou_from_sock(sk)->protocol;
+	*p_nhoff += sizeof(struct udphdr);
+
+	return FLOW_DISSECT_RET_IPPROTO_AGAIN;
+}
+
 static struct guehdr *gue_gro_remcsum(struct sk_buff *skb, unsigned int off,
 				      struct guehdr *guehdr, void *data,
 				      size_t hdrlen, struct gro_remcsum *grc,
@@ -500,6 +514,53 @@ static int gue_gro_complete(struct sock *sk, struct sk_buff *skb, int nhoff)
 	return err;
 }
 
+static enum flow_dissect_ret gue_flow_dissect(struct sock *sk,
+			const struct sk_buff *skb,
+			struct flow_dissector_key_control *key_control,
+			struct flow_dissector *flow_dissector,
+			void *target_container, void *data,
+			__be16 *p_proto, u8 *p_ip_proto, int *p_nhoff,
+			int *p_hlen, unsigned int flags)
+{
+	struct guehdr *guehdr, _guehdr;
+
+	guehdr = __skb_header_pointer(skb, *p_nhoff + sizeof(struct udphdr),
+				      sizeof(_guehdr), data, *p_hlen, &_guehdr);
+	if (!guehdr)
+		return FLOW_DISSECT_RET_OUT_BAD;
+
+	switch (guehdr->version) {
+	case 0:
+		if (unlikely(guehdr->control))
+			return FLOW_DISSECT_RET_CONTINUE;
+
+		*p_ip_proto = guehdr->proto_ctype;
+		*p_nhoff += sizeof(struct udphdr) +
+		    sizeof(*guehdr) + (guehdr->hlen << 2);
+
+		break;
+	case 1:
+		switch (((struct iphdr *)guehdr)->version) {
+		case 4:
+			*p_ip_proto = IPPROTO_IPIP;
+			break;
+		case 6:
+			*p_ip_proto = IPPROTO_IPV6;
+			break;
+		default:
+			return FLOW_DISSECT_RET_CONTINUE;
+		}
+
+		*p_nhoff += sizeof(struct udphdr);
+
+		break;
+	default:
+		return FLOW_DISSECT_RET_CONTINUE;
+	}
+
+	return FLOW_DISSECT_RET_IPPROTO_AGAIN;
+}
+
 static int fou_add_to_port_list(struct net *net, struct fou *fou)
 {
 	struct fou_net *fn = net_generic(net, fou_net_id);
@@ -570,12 +631,14 @@ static int fou_create(struct net *net, struct fou_cfg *cfg,
 		tunnel_cfg.encap_rcv = fou_udp_recv;
 		tunnel_cfg.gro_receive = fou_gro_receive;
 		tunnel_cfg.gro_complete = fou_gro_complete;
+		tunnel_cfg.flow_dissect = fou_flow_dissect;
 		fou->protocol = cfg->protocol;
 		break;
 	case FOU_ENCAP_GUE:
 		tunnel_cfg.encap_rcv = gue_udp_recv;
 		tunnel_cfg.gro_receive = gue_gro_receive;
 		tunnel_cfg.gro_complete = gue_gro_complete;
+		tunnel_cfg.flow_dissect = gue_flow_dissect;
 		break;
 	default:
 		err = -EINVAL;
-- 
2.11.0

^ permalink raw reply related
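
The offset arithmetic in gue_flow_dissect() above can be tried out in userspace. The sketch below reads the GUE fields from raw bytes with masks instead of the kernel's struct guehdr bitfields, assuming the usual layout of the first byte (2-bit version, 1-bit control flag, 5-bit header length in 32-bit words). All gue_sketch_* and SKETCH_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>	/* IPPROTO_IPIP, IPPROTO_IPV6 */

#define SKETCH_UDP_HLEN	8	/* sizeof(struct udphdr) */
#define SKETCH_GUE_HLEN	4	/* base GUE header without optional fields */

enum gue_sketch_ret { GUE_SK_IPPROTO_AGAIN, GUE_SK_CONTINUE, GUE_SK_OUT_BAD };

/*
 * gue points at the first byte after the UDP header and len is the number
 * of readable bytes there. On GUE_SK_IPPROTO_AGAIN, *nhoff and *ip_proto
 * describe the inner header.
 */
enum gue_sketch_ret gue_sketch_dissect(const uint8_t *gue, size_t len,
				       int *nhoff, uint8_t *ip_proto)
{
	uint8_t version, control, hlen;

	assert(gue && nhoff && ip_proto);
	if (len < SKETCH_GUE_HLEN)
		return GUE_SK_OUT_BAD;

	version = gue[0] >> 6;		/* top two bits */
	control = (gue[0] >> 5) & 1;	/* control-message flag */
	hlen = gue[0] & 0x1f;		/* optional fields, in 32-bit words */

	switch (version) {
	case 0:
		if (control)
			return GUE_SK_CONTINUE;
		*ip_proto = gue[1];
		*nhoff += SKETCH_UDP_HLEN + SKETCH_GUE_HLEN + (hlen << 2);
		return GUE_SK_IPPROTO_AGAIN;
	case 1:
		/* Version 1: the payload is a bare IP header. */
		switch (gue[0] >> 4) {
		case 4:
			*ip_proto = IPPROTO_IPIP;
			break;
		case 6:
			*ip_proto = IPPROTO_IPV6;
			break;
		default:
			return GUE_SK_CONTINUE;
		}
		*nhoff += SKETCH_UDP_HLEN;
		return GUE_SK_IPPROTO_AGAIN;
	default:
		return GUE_SK_CONTINUE;
	}
}
```

For a version 0 header with hlen = 2, the inner header starts 8 + 4 + 8 = 20 bytes past the start of the UDP header; for version 1 it starts right after the UDP header.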

* [PATCH v2 net-next 6/6] vxlan: support flow dissect
From: Tom Herbert @ 2017-08-29 23:27 UTC (permalink / raw)
  To: davem; +Cc: netdev, Tom Herbert
In-Reply-To: <20170829232711.1465-1-tom@quantonium.net>

Populate offload flow_dissect callback appropriately for VXLAN and
VXLAN-GPE.

Signed-off-by: Tom Herbert <tom@quantonium.net>
---
 drivers/net/vxlan.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index ae3a1da703c2..41e50de40af4 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1336,6 +1336,55 @@ static bool vxlan_ecn_decapsulate(struct vxlan_sock *vs, void *oiph,
 	return err <= 1;
 }
 
+static enum flow_dissect_ret vxlan_flow_dissect(struct sock *sk,
+			const struct sk_buff *skb,
+			struct flow_dissector_key_control *key_control,
+			struct flow_dissector *flow_dissector,
+			void *target_container, void *data,
+			__be16 *p_proto, u8 *p_ip_proto, int *p_nhoff,
+			int *p_hlen, unsigned int flags)
+{
+	__be16 protocol = htons(ETH_P_TEB);
+	struct vxlanhdr *vhdr, _vhdr;
+	struct vxlan_sock *vs;
+
+	vhdr = __skb_header_pointer(skb, *p_nhoff + sizeof(struct udphdr),
+				    sizeof(_vhdr), data, *p_hlen, &_vhdr);
+	if (!vhdr)
+		return FLOW_DISSECT_RET_OUT_BAD;
+
+	vs = rcu_dereference_sk_user_data(sk);
+	if (!vs)
+		return FLOW_DISSECT_RET_OUT_BAD;
+
+	if (vs->flags & VXLAN_F_GPE) {
+		struct vxlanhdr_gpe *gpe = (struct vxlanhdr_gpe *)vhdr;
+
+		/* Need to have Next Protocol set for interfaces in GPE mode. */
+		if (gpe->version != 0 || !gpe->np_applied || gpe->oam_flag)
+			return FLOW_DISSECT_RET_CONTINUE;
+
+		switch (gpe->next_protocol) {
+		case VXLAN_GPE_NP_IPV4:
+			protocol = htons(ETH_P_IP);
+			break;
+		case VXLAN_GPE_NP_IPV6:
+			protocol = htons(ETH_P_IPV6);
+			break;
+		case VXLAN_GPE_NP_ETHERNET:
+			protocol = htons(ETH_P_TEB);
+			break;
+		default:
+			return FLOW_DISSECT_RET_CONTINUE;
+		}
+	}
+
+	*p_nhoff += sizeof(struct udphdr) + sizeof(_vhdr);
+	*p_proto = protocol;
+
+	return FLOW_DISSECT_RET_PROTO_AGAIN;
+}
+
 /* Callback from net/ipv4/udp.c to receive packets */
 static int vxlan_rcv(struct sock *sk, struct sk_buff *skb)
 {
@@ -2864,6 +2913,7 @@ static struct vxlan_sock *vxlan_socket_create(struct net *net, bool ipv6,
 	tunnel_cfg.encap_destroy = NULL;
 	tunnel_cfg.gro_receive = vxlan_gro_receive;
 	tunnel_cfg.gro_complete = vxlan_gro_complete;
+	tunnel_cfg.flow_dissect = vxlan_flow_dissect;
 
 	setup_udp_tunnel_sock(net, sock, &tunnel_cfg);
 
-- 
2.11.0

^ permalink raw reply related
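
The protocol selection in vxlan_flow_dissect() above can be modelled on raw header bytes. The sketch below assumes the VXLAN-GPE layout in which the first byte carries the version bits (0x30), the Next Protocol flag (0x04) and the OAM flag (0x01), and the fourth byte carries the Next Protocol value. The numeric constants are spelled out locally and every *_SK_* and gpe_sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Next Protocol values (VXLAN_GPE_NP_* in the kernel). */
#define GPE_SK_NP_IPV4		0x01
#define GPE_SK_NP_IPV6		0x02
#define GPE_SK_NP_ETHERNET	0x03

/* EtherType values in host byte order. */
#define SK_ETH_P_IP	0x0800
#define SK_ETH_P_IPV6	0x86DD
#define SK_ETH_P_TEB	0x6558

/* Flag bits in the first header byte (assumed layout, see above). */
#define GPE_SK_F_OAM	0x01
#define GPE_SK_F_NP	0x04
#define GPE_SK_VER_MASK	0x30

/*
 * Returns 1 and stores the inner EtherType when dissection can go on,
 * 0 when the dissector should stop here (the CONTINUE case in the patch).
 */
int gpe_sketch_ethertype(const uint8_t hdr[8], int gpe_mode,
			 uint16_t *ethertype)
{
	assert(hdr && ethertype);

	if (!gpe_mode) {
		/* Plain VXLAN always carries an Ethernet frame. */
		*ethertype = SK_ETH_P_TEB;
		return 1;
	}

	/* GPE mode needs version 0, Next Protocol set, and no OAM flag. */
	if ((hdr[0] & GPE_SK_VER_MASK) || !(hdr[0] & GPE_SK_F_NP) ||
	    (hdr[0] & GPE_SK_F_OAM))
		return 0;

	switch (hdr[3]) {
	case GPE_SK_NP_IPV4:
		*ethertype = SK_ETH_P_IP;
		return 1;
	case GPE_SK_NP_IPV6:
		*ethertype = SK_ETH_P_IPV6;
		return 1;
	case GPE_SK_NP_ETHERNET:
		*ethertype = SK_ETH_P_TEB;
		return 1;
	default:
		return 0;
	}
}
```

On success the caller would also advance the offset past the 8-byte UDP header and the 8-byte VXLAN header, as the patch does.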

* Re: [RESEND PATCH] Allow passing tid or pid in SCM_CREDENTIALS without CAP_SYS_ADMIN
From: prakash.sangappa @ 2017-08-29 23:59 UTC (permalink / raw)
  To: David Miller; +Cc: linux-kernel, netdev, ebiederm, drepper
In-Reply-To: <20170829.160232.1901318933754673000.davem@davemloft.net>



On 08/29/2017 04:02 PM, David Miller wrote:
> From: Prakash Sangappa <prakash.sangappa@oracle.com>
> Date: Mon, 28 Aug 2017 17:12:20 -0700
>
>> Currently passing tid(gettid(2)) of a thread in struct ucred in
>> SCM_CREDENTIALS message requires CAP_SYS_ADMIN capability otherwise
>> it fails with EPERM error. Some applications deal with thread id
>> of a thread(tid) and so it would help to allow tid in SCM_CREDENTIALS
>> message. Basically, either tgid(pid of the process) or the tid of
>> the thread should be allowed without the need for CAP_SYS_ADMIN capability.
>>
>> SCM_CREDENTIALS will be used to determine the global id of a process or
>> a thread running inside a pid namespace.
>>
>> This patch adds necessary check to accept tid in SCM_CREDENTIALS
>> struct ucred.
>>
>> Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
> I'm pretty sure that by the descriptions in previous changes to this
> function, what you are proposing is basically a minor form of PID
> spoofing which we only want someone with CAP_SYS_ADMIN over the
> PID namespace to be able to do.

The fix only allows passing the tid of the calling thread itself, not
that of any other thread or process. Why would this be considered
pid spoofing?

This change would let the recipient of the message easily identify a
thread of a multi-threaded process running inside a pid namespace.


>
> Sorry, I'm not applying this.

^ permalink raw reply

* Re: [RESEND PATCH] Allow passing tid or pid in SCM_CREDENTIALS without CAP_SYS_ADMIN
From: Eric W. Biederman @ 2017-08-30  0:10 UTC (permalink / raw)
  To: prakash.sangappa; +Cc: David Miller, linux-kernel, netdev, drepper
In-Reply-To: <c9ea9bec-6b0e-70d6-3f74-9b483358edd2@oracle.com>

"prakash.sangappa" <prakash.sangappa@oracle.com> writes:

> On 08/29/2017 04:02 PM, David Miller wrote:
>> From: Prakash Sangappa <prakash.sangappa@oracle.com>
>> Date: Mon, 28 Aug 2017 17:12:20 -0700
>>
>>> Currently passing tid(gettid(2)) of a thread in struct ucred in
>>> SCM_CREDENTIALS message requires CAP_SYS_ADMIN capability otherwise
>>> it fails with EPERM error. Some applications deal with thread id
>>> of a thread(tid) and so it would help to allow tid in SCM_CREDENTIALS
>>> message. Basically, either tgid(pid of the process) or the tid of
>>> the thread should be allowed without the need for CAP_SYS_ADMIN capability.
>>>
>>> SCM_CREDENTIALS will be used to determine the global id of a process or
>>> a thread running inside a pid namespace.
>>>
>>> This patch adds necessary check to accept tid in SCM_CREDENTIALS
>>> struct ucred.
>>>
>>> Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
>> I'm pretty sure that by the descriptions in previous changes to this
>> function, what you are proposing is basically a minor form of PID
>> spoofing which we only want someone with CAP_SYS_ADMIN over the
>> PID namespace to be able to do.
>
> The fix only allows passing the tid of the calling thread itself, not
> that of any other thread or process. Why would this be considered
> pid spoofing?
>
> This change would let the recipient of the message easily identify a
> thread of a multi-threaded process running inside a pid namespace.

I think a more practical problem is that this change alters what is
passed in SCM_CREDENTIALS from the pid of a process to the tid of a
thread.  That could be confusing and that confusion could be exploited.

It is definitely confusing because in some instances a value can be both
a tgid and a tid.

I definitely think this needs to be talked about in terms of changing
what is passed in that field and what the consequences could be.

I suspect you are ok, as nothing allows passing a tid today.  But I
don't see any analysis of why passing a tid instead of a tgid will not
confuse the receiving application and, through that confusion,
introduce a security hole.

Eric

^ permalink raw reply
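
The ambiguity mentioned above, that one number can be both a tgid and a tid, is easy to observe: for the thread-group leader the two IDs are the same value, so a receiver that sees only the number cannot tell which one the sender meant. A minimal sketch using the raw gettid system call follows; the tid_sketch_* names are hypothetical.

```c
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the caller, as returned by gettid(2). */
pid_t tid_sketch_gettid(void)
{
	return (pid_t)syscall(SYS_gettid);
}

/*
 * Returns 1 when the caller's tid is numerically equal to the process
 * tgid. That holds for the thread-group leader; other threads of the
 * same process get 0.
 */
int tid_sketch_is_ambiguous(void)
{
	pid_t tid = tid_sketch_gettid();

	assert(tid > 0);
	return tid == getpid();
}
```

Called from the initial thread of a program, tid_sketch_is_ambiguous() returns 1; called from any thread created later, it returns 0.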

* Re: Fwd: DA850-evm MAC Address is random
From: Adam Ford @ 2017-08-30  0:49 UTC (permalink / raw)
  To: Sekhar Nori; +Cc: Tony Lindgren, Grygorii Strashko, linux-omap, netdev
In-Reply-To: <CAHCN7x+paAdW7cXrALhtLZZiJbkE6v_8_qRXBec7hxLJPvSMbw@mail.gmail.com>

On Tue, Aug 29, 2017 at 10:20 AM, Adam Ford <aford173@gmail.com> wrote:
> On Tue, Aug 29, 2017 at 10:16 AM, Sekhar Nori <nsekhar@ti.com> wrote:
>> On Tuesday 29 August 2017 05:32 PM, Adam Ford wrote:
>>> On Tue, Aug 29, 2017 at 6:42 AM, Sekhar Nori <nsekhar@ti.com> wrote:
>>>> On Tuesday 29 August 2017 03:53 PM, Adam Ford wrote:
>>>>> On Tue, Aug 29, 2017 at 3:23 AM, Sekhar Nori <nsekhar@ti.com> wrote:
>>>>>> On Tuesday 29 August 2017 02:42 AM, Tony Lindgren wrote:
>>>>>>> * Adam Ford <aford173@gmail.com> [170828 13:33]:
>>>>>>>> On Mon, Aug 28, 2017 at 1:54 PM, Grygorii Strashko
>>>>>>>> <grygorii.strashko@ti.com> wrote:
>>>>>>>>> Cc: Sekhar
>>>>>>>>>
>>>>>>>>> On 08/28/2017 10:32 AM, Adam Ford wrote:
>>>>>>>>>>
>>>>>>>>>> The davinci_emac MAC address seems to attempt a call to
>>>>>>>>>> ti_cm_get_macid in cpsw-common.c but it returns the message
>>>>>>>>>> 'davinci_emac davinci_emac.1: incompatible machine/device type for
>>>>>>>>>> reading mac address ' and then generates a random MAC address.
>>>>>>>>>>
>>>>>>>>>> The function appears to look up various boards using
>>>>>>>>>> 'of_machine_is_compatible' and supports dm8148, am33xx, am3517, dm816,
>>>>>>>>>> am4372 and dra7.  I don't see the ti,davinci-dm6467-emac which is
>>>>>>>>>> what's shown in the da850 device tree.
>>>>>>>>>>
>>>>>>>>>> Is there a patch somewhere for supporting the da850-evm?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Not sure if MAC address can be read from Control module.
>>>>>>>>> May be Sekhar can say more?
>>>>>>>>
>>>>>>>> My understanding is that the MAC address is programmed by Logic PD
>>>>>>>> into the SPI flash.  The Bootloader reads this from either SPI or its
>>>>>>>> env variables.  Looking at the partition info listed in the
>>>>>>>> da850-evm.dts file, it appears as if they've reserved space for it.
>>>>>>>> Unfortunately, I don't see any code that reads it out.  I was hoping
>>>>>>
>>>>>> This code is present in U-Boot sources at
>>>>>> board/davinci/da8xxevm/da850evm.c. See the function get_mac_addr() and
>>>>>> its usage in misc_init_r().
>>>>>>
>>>>>>>> there might be a way to just pass cmdline parameter from the
>>>>>>>> bootloader to the kernel to accept the MAC address.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> If not, is there a way to pass the MAC address from U-Boot to the
>>>>>>>>>> driver so it doesn't generate a random MAC?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> "local-mac-address" dt prop
>>>>>>>>
>>>>>>>> The downside here, is that we'd have to have the Bootloader modify the
>>>>>>>> device tree.
>>>>>>>
>>>>>>> That piece of code exists somewhere in u-boot already. Note how
>>>>>>
>>>>>> Yes, it is fdt_fixup_ethernet() and its usage is in common/image-fdt.c.
>>>>>>
>>>>>>> we are populating the mac address for USB Ethernet drivers in
>>>>>>> u-boot and then the Ethernet driver code parses it. See commit
>>>>>>> 055d31de7158 ("ARM: omap3: beagleboard-xm: dt: Add ethernet to
>>>>>>> the device tree") for some more information.
>>>>>>>
>>>>>>> I think u-boot needs the ethernet alias for finding the interface.
>>>>>>
>>>>>> That's exactly what was missing. I have sent a patch for fixing that and
>>>>>> copied you there.
>>>>>
>>>>> Thanks for doing that.
>>>>>
>>>>>>
>>>>>> Adam, if I can get your Tested-by, I will make an attempt to send it for
>>>>>> v4.13 itself.
>>>>>
>>>>> I will test it.  Do I need to run some instruction or do something
>>>>> special in U-Boot to pass this in the proper place for the kernel to
>>>>> pull it?  Tony's patch reference showed
>>>>> command for fdt set, but I am not sure I fully understand the
>>>>> parameters that went along with that.
>>>>
>>>> Nope, just applying the patch and booting with the new dtb should
>>>> result in the random mac address going away.
>>>
>>> Unfortunately, I am not seeing any change with the patch (at least
>>> with Kernel 4.12.9 from stable).
>>>
>>> netconsole: network logging started
>>> davinci_emac davinci_emac.1: incompatible machine/device type for
>>> reading mac address
>>> davinci_emac davinci_emac.1: using random MAC addr: ee:74:c3:3a:15:be
>>>
>>> Looking at the source for cpsw-common.c function ti_cm_get_macid()
>>> doesn't have a case for the  ti,davinci-dm6467-emac so I wonder if
>>> there might be more to it.
>>
>> Hmm, it did solve the issue for me when I tried latest -next. And
>> reverting the patch brought back the random mac address usage. Could you
>> try latest mainline or -next?
>>
>> Meanwhile let me see what's going on with the observations you have.
>
> I will try again with -next this afternoon and see what I can find.
> Can you tell me which U-Boot version you're using? I want to match
> your setup. I want to see if something is missing during the hand-off
> between the Bootloader and Linux.
>

I wonder if U-Boot isn't pushing something to Linux, because it doesn't
appear to be running some of the da850-specific code even when I run
linux-next.  Can you tell me what version of U-Boot you're using?
Other than using davinci_all_defconfig, did you change the
configuration at all?

[    1.411107] netconsole: network logging started
[    1.416237] davinci_emac davinci_emac.1: incompatible machine/device type for reading mac address
[    1.424496] davinci_emac davinci_emac.1: using random MAC addr: be:e2:84:ed:3c:87

I also confirmed the SPI-flash has the MAC programmed:

# hexdump /dev/mtd5ro
0000000 0800 04ee 8e32 ffff ffff ffff ffff ffff
0000010 ffff ffff ffff ffff ffff ffff ffff ffff
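
The default hexdump format prints 16-bit words in host byte order, so on a little-endian host the two bytes of each word appear swapped relative to their order in flash. Assuming the first six bytes of the partition hold the address, the words 0800 04ee 8e32 decode as shown by the sketch below. The mac_sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Convert three 16-bit words, as printed by hexdump on a little-endian
 * host, back into the six bytes in their on-flash order.
 */
void mac_sketch_from_words(const uint16_t w[3], uint8_t mac[6])
{
	assert(w && mac);
	for (int i = 0; i < 3; i++) {
		mac[2 * i] = (uint8_t)(w[i] & 0xff);	/* low byte comes first */
		mac[2 * i + 1] = (uint8_t)(w[i] >> 8);
	}
}

/* Format six bytes as the usual colon-separated string (17 chars + NUL). */
void mac_sketch_format(const uint8_t mac[6], char out[18])
{
	snprintf(out, 18, "%02x:%02x:%02x:%02x:%02x:%02x",
		 mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

Feeding in { 0x0800, 0x04ee, 0x8e32 } yields 00:08:ee:04:32:8e, which differs from both random addresses in the logs in this thread.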



Here is my full log:

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 4.13.0-rc7-next-20170829 (aford@ubuntu16) (gcc version 4.8.3 20140320 (prerelease) (Sourcery CodeBench Lite 2014.05-29)) #1 PREEMPT Tue Aug 29 19:11:06 CDT 2017
[    0.000000] CPU: ARM926EJ-S [41069265] revision 5 (ARMv5TEJ), cr=0005317f
[    0.000000] CPU: VIVT data cache, VIVT instruction cache
[    0.000000] OF: fdt: Machine model: DA850/AM1808/OMAP-L138 EVM
[    0.000000] Memory policy: Data cache writethrough
[    0.000000] DaVinci da850/omap-l138 variant 0x0
[    0.000000] On node 0 totalpages: 8192
[    0.000000] free_area_init_node: node 0, pgdat c05dfec8, node_mem_map c1fb9000
[    0.000000]   DMA zone: 64 pages used for memmap
[    0.000000]   DMA zone: 0 pages reserved
[    0.000000]   DMA zone: 8192 pages, LIFO batch:0
[    0.000000] random: fast init done
[    0.000000] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[    0.000000] pcpu-alloc: [0] 0
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 8128
[    0.000000] Kernel command line: mem=32M console=ttyS2,115200n8 root=/dev/mmcblk0p2 rw wait noinitrd
[    0.000000] PID hash table entries: 128 (order: -3, 512 bytes)
[    0.000000] Dentry cache hash table entries: 4096 (order: 2, 16384 bytes)
[    0.000000] Inode-cache hash table entries: 2048 (order: 1, 8192 bytes)
[    0.000000] Memory: 26196K/32768K available (4412K kernel code, 330K rwdata, 1012K rodata, 224K init, 146K bss, 6572K reserved, 0K cma-reserved)
[    0.000000] Virtual kernel memory layout:
                   vector  : 0xffff0000 - 0xffff1000   (   4 kB)
                   fixmap  : 0xffc00000 - 0xfff00000   (3072 kB)
                   vmalloc : 0xc2800000 - 0xff800000   ( 976 MB)
                   lowmem  : 0xc0000000 - 0xc2000000   (  32 MB)
                   modules : 0xbf000000 - 0xc0000000   (  16 MB)
                     .text : 0xc0008000 - 0xc04574e8   (4414 kB)
                     .init : 0xc0556000 - 0xc058e000   ( 224 kB)
                     .data : 0xc058e000 - 0xc05e09e0   ( 331 kB)
                      .bss : 0xc05e4ebc - 0xc06097b4   ( 147 kB)
[    0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] Preemptible hierarchical RCU implementation.
[    0.000000]  Tasks RCU enabled.
[    0.000000] NR_IRQS: 245
[    0.000000] clocksource: timer0_1: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
[    0.000000] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
[    0.000480] Console: colour dummy device 80x30
[    0.000623] Calibrating delay loop... 148.88 BogoMIPS (lpj=744448)
[    0.070151] pid_max: default: 32768 minimum: 301
[    0.070730] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
[    0.070802] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
[    0.073212] CPU: Testing write buffer coherency: ok
[    0.076384] Setting up static identity map for 0xc0008400 - 0xc0008458
[    0.077078] Hierarchical SRCU implementation.
[    0.081656] devtmpfs: initialized
[    0.102577] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    0.102678] futex hash table entries: 256 (order: -1, 3072 bytes)
[    0.103368] pinctrl core: initialized pinctrl subsystem
[    0.107090] NET: Registered protocol family 16
[    0.108769] DMA: preallocated 256 KiB pool for atomic coherent allocations
[    0.112006] cpuidle: using governor menu
[    0.142220] mux: initialized RTC_ALARM
[    0.142267] mux: Setting register RTC_ALARM
[    0.142313] mux:    PINMUX0 (0x00000000) = 0x44080000 -> 0x24080000
[    0.173799] edma 1c00000.edma: memcpy is disabled
[    0.187509] edma 1c00000.edma: TI EDMA DMA engine driver
[    0.188706] edma 1e30000.edma: memcpy is disabled
[    0.203238] edma 1e30000.edma: TI EDMA DMA engine driver
[    0.208604] i2c_davinci i2c_davinci.1: could not find pctldev for node /soc@1c00000/pinmux@14120/pinmux_i2c0_pins, deferring probe
[    0.213319] clocksource: Switched to clocksource timer0_1
[    0.272942] NET: Registered protocol family 2
[    0.276271] TCP established hash table entries: 1024 (order: 0, 4096 bytes)
[    0.276399] TCP bind hash table entries: 1024 (order: 0, 4096 bytes)
[    0.276478] TCP: Hash tables configured (established 1024 bind 1024)
[    0.276920] UDP hash table entries: 256 (order: 0, 4096 bytes)
[    0.277028] UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
[    0.277860] NET: Registered protocol family 1
[    0.279709] RPC: Registered named UNIX socket transport module.
[    0.279757] RPC: Registered udp transport module.
[    0.279779] RPC: Registered tcp transport module.
[    0.279800] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.286296] workingset: timestamp_bits=30 max_order=13 bucket_order=0
[    0.334423] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
[    0.334488] io scheduler noop registered (default)
[    0.334519] io scheduler mq-deadline registered
[    0.334544] io scheduler kyber registered
[    0.337441] pinctrl-single 1c14120.pinmux: 160 pins at pa fec14120 size 80
[    0.724136] Serial: 8250/16550 driver, 3 ports, IRQ sharing disabled
[    0.731781] serial8250.0: ttyS0 at MMIO 0x1c42000 (irq = 25, base_baud = 9375000) is a TI DA8xx/66AK2x
[    0.736191] serial8250.1: ttyS1 at MMIO 0x1d0c000 (irq = 53, base_baud = 8250000) is a TI DA8xx/66AK2x
[    0.740320] serial8250.2: ttyS2 at MMIO 0x1d0d000 (irq = 61, base_baud = 8250000) is a TI DA8xx/66AK2x
[    1.104485] console [ttyS2] enabled
[    1.112179] brd: module loaded
[    1.115635] libphy: Fixed MDIO Bus: probed
[    1.173556] davinci_mdio davinci_mdio.0: davinci mdio revision 1.5, bus freq 2200000
[    1.180102] davinci_mdio davinci_mdio.0: detected phy mask fffffffe
[    1.188281] libphy: davinci_mdio.0: probed
[    1.191175] davinci_mdio davinci_mdio.0: phy[0]: device davinci_mdio.0:00, driver SMSC LAN8710/LAN8720
[    1.200474] i2c /dev entries driver
[    1.203043] IR NEC protocol handler initialized
[    1.206722] IR RC5(x/sz) protocol handler initialized
[    1.210530] IR RC6 protocol handler initialized
[    1.214207] IR JVC protocol handler initialized
[    1.217493] IR Sony protocol handler initialized
[    1.220843] IR SANYO protocol handler initialized
[    1.224603] IR Sharp protocol handler initialized
[    1.228062] IR MCE Keyboard/mouse protocol handler initialized
[    1.232635] IR XMP protocol handler initialized
[    1.293829] davinci_mmc da830-mmc.0: Using DMA, 4-bit mode
[    1.307464] NET: Registered protocol family 10
[    1.317055] Segment Routing with IPv6
[    1.319797] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[    1.330958] NET: Registered protocol family 17
[    1.362382] mmc0: host does not support reading read-only switch, assuming write-enable
[    1.374564] pca953x 0-0020: 0-0020 supply vcc not found, using dummy regulator
[    1.381327] mmc0: new high speed SDHC card at address b368
[    1.387395] mmcblk0: mmc0:b368 00000 3.75 GiB
[    1.392025] pca953x 0-0020: failed reading register
[    1.398285]  mmcblk0: p1 p2
[    1.403791] pca953x: probe of 0-0020 failed with error -121
[    1.408691] console [netcon0] enabled
[    1.411107] netconsole: network logging started
[    1.416237] davinci_emac davinci_emac.1: incompatible machine/device type for reading mac address
[    1.424496] davinci_emac davinci_emac.1: using random MAC addr: be:e2:84:ed:3c:87
[    1.435043] hctosys: unable to open rtc device (rtc0)
[    1.439533] vbat: disabling
[    1.455953] EXT4-fs (mmcblk0p2): couldn't mount as ext3 due to feature incompatibilities
[    1.495292] EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null)
[    1.502362] VFS: Mounted root (ext4 filesystem) on device 179:2.
[    1.509400] devtmpfs: mounted
[    1.512507] Freeing unused kernel memory: 224K
[    1.515909] This architecture does not have kernel memory protection.
[    1.942277] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
[    2.381303] udevd[54]: starting version 3.2.2
[    2.611166] udevd[55]: starting eudev-3.2.2
[    4.393796] omap_rtc 1c23000.rtc: rtc core: registered 1c23000.rtc as rtc0
[    4.760433] davinci-wdt davinci-wdt: heartbeat 60 sec
[    5.631862] spi_davinci spi_davinci.1: Controller at 0xfef0e000
[    5.715839] vpif vpif: vpif probe success
[    5.805009] Linux video capture interface: v2.00
[    5.876891] asoc-simple-card sound: tlv320aic3x-hifi <-> davinci-mcasp.0 mapping ok
[    6.281740] adv7343 0-002a: chip found @ 0x54 (DaVinci I2C adapter)
[    6.304545] adv7343 0-002a: Error initializing
[    6.307892] adv7343: probe of 0-002a failed with error -121
[    6.312763] vpif_display vpif_display: Error registering v4l2 subdevice
[    6.503036] tvp514x 0-005d: Write: retry ... 0
[    6.543977] tvp514x 0-005d: Write: retry ... 1
[    6.574113] tvp514x 0-005d: Write: retry ... 2
[    6.603935] tvp514x 0-005d: Write: retry ... 3
[    6.633983] tvp514x 0-005d: Write: retry ... 4
[    6.663965] tvp514x 0-005d: Write: retry ... 5
[    6.693978] tvp514x 0-005d: tvp514x 0-005d decoder driver registered !!
[    6.704907] vpif_capture vpif_capture: registered sub device tvp514x-0
[    6.813679] tvp514x 0-005c: Write: retry ... 0
[    6.844005] tvp514x 0-005c: Write: retry ... 1
[    6.874221] tvp514x 0-005c: Write: retry ... 2
[    6.903956] tvp514x 0-005c: Write: retry ... 3
[    6.934023] tvp514x 0-005c: Write: retry ... 4
[    6.963950] tvp514x 0-005c: Write: retry ... 5
[    6.993967] tvp514x 0-005c: tvp514x 0-005c decoder driver registered !!
[    7.013865] vpif_capture vpif_capture: registered sub device tvp514x-1
[    7.052179] tvp514x 0-005d: Write: retry ... 0
[    7.093976] tvp514x 0-005d: Write: retry ... 1
[    7.123965] tvp514x 0-005d: Write: retry ... 2
[    7.153916] tvp514x 0-005d: Write: retry ... 3
[    7.184160] tvp514x 0-005d: Write: retry ... 4
[    7.219318] tvp514x 0-005d: Write: retry ... 5
[    7.243986] vpif_capture vpif_capture: Failed to set input
[   12.944674] m25p80 spi0.0: m25p64 (8192 Kbytes)
[   12.973994] nand: No NAND device found
[   13.062551] 6 ofpart partitions found on MTD device spi0.0
[   13.067365] Creating 6 MTD partitions on "spi0.0":
[   13.070964] 0x000000000000-0x000000010000 : "U-Boot-SPL"
[   13.084491] 0x000000010000-0x000000090000 : "U-Boot"
[   13.095174] 0x000000090000-0x0000000a0000 : "U-Boot-Env"
[   13.106238] 0x0000000a0000-0x000000320000 : "Kernel"
[   13.116910] 0x000000320000-0x000000720000 : "Filesystem"
[   13.127963] 0x0000007f0000-0x000000800000 : "MAC-Address"

>
>>
>> Thanks,
>> Sekhar


* [PATCH net-next 0/3 v10] Add support for rmnet driver
From: Subash Abhinov Kasiviswanathan @ 2017-08-30  0:47 UTC (permalink / raw)
  To: netdev, davem, fengguang.wu, dcbw, jiri, stephen, David.Laight,
	marcel, andrew
  Cc: Subash Abhinov Kasiviswanathan

Hi David

I have updated the locking scheme as follows -

The shared resource which needs to be protected is real_dev->rx_handler_data.

For the writer path, this is protected by rtnl_lock(). The writer paths are
rmnet_newlink(), rmnet_dellink() and rmnet_force_unassociate_device(). These
paths are already called with rtnl_lock() held. There is also an
ASSERT_RTNL() to ensure that we are calling with rtnl held. To
dereference here, we need to use rtnl_dereference(). Dev list writing
needs to happen with rtnl_lock() held for netdev_master_upper_dev_link().

For the reader path, real_dev->rx_handler_data is read in the TX / RX
path. We only need rcu_read_lock() for these scenarios. In these cases,
rcu_read_lock() is held in __dev_queue_xmit() and
netif_receive_skb_internal(), so readers need to use rcu_dereference_rtnl()
to get the relevant information. For dev list reading, we again acquire
rcu_read_lock() in rmnet_dellink() for netdev_master_upper_dev_get_rcu().
We also use unregister_netdevice_many() to free all rmnet devices in
rmnet_force_unassociate_device() so we don't lose the rtnl_lock() and free in
the same context.

I have also added this as a comment in rmnet_config.c.
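
The writer/reader split described above can be modelled roughly in
userspace. This is only a sketch under loud assumptions: a pthread mutex
stands in for rtnl_lock(), and a C11 release/acquire pointer stands in for
the RCU-protected rx_handler_data. It does not model RCU grace periods or
freeing, and every name in it (port_model, config_lock, model_publish,
model_read_mux_id) is hypothetical, not taken from the rmnet driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device data behind rx_handler_data. */
struct port_model {
	int mux_id;
};

/* Stand-in for rtnl_lock(): serializes all writers. */
static pthread_mutex_t config_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for real_dev->rx_handler_data; starts out NULL. */
static _Atomic(struct port_model *) handler_data;

/* Writer path: update the shared pointer only while holding the lock. */
static void model_publish(struct port_model *port)
{
	pthread_mutex_lock(&config_lock);
	/* Release store: readers that see the pointer also see its contents. */
	atomic_store_explicit(&handler_data, port, memory_order_release);
	pthread_mutex_unlock(&config_lock);
}

/* Reader path: no lock taken, only an acquire load of the pointer. */
static int model_read_mux_id(void)
{
	struct port_model *port =
		atomic_load_explicit(&handler_data, memory_order_acquire);

	/* A reader may run before any writer has published anything. */
	return port ? port->mux_id : -1;
}
```

The point of the split is that the data path never blocks on the
configuration lock; only configuration changes are serialized.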

--
v1: Same as the RFC patch with some minor fixes for issues reported by
kbuild test robot.

v1->v2: Change datatypes and remove config IOCTL as mentioned by David.
Also fix checkpatch issues and remove some unused code.

v2->v3: Move location to drivers/net and rename to rmnet. Change the
userspace - netlink communication from custom netlink to rtnl_link_ops.
Refactor some code. Use a fixed config for ingress and egress.

v3->v4: Move location to drivers/net/ethernet/qualcomm/.
Fix comments from Stephen and Jiri -
Split the ether and arp type changes into separate patches.
Remove debug and custom logging and switch to standard netdevice log.
Remove module parameters. Refactor and change some code style issues.

v4->v5: Rename some structs and variables. Move the initializer
before the for loop start. Put the arp type in correct sequence.

v5->v6: Fix comments from Dan -
Use the upper link API. As a result, remove all the refcounting logic.
Device refcount is explicitly held on real_dev on rx_handler
registration only. Modify the flow control struct. Remove the unused
ethernet mode handling.

v6->v7: Fix comments from David - Add newline to end of Makefile. Remove
inline from .c files. Move the module init/exit to rmnet config. Fix an
error reported by kbuild test robot for an unused file.

v7->v8: Use a smaller value for ETH_P_MAP as mentioned by David. Change
netdev_info to netdev_dbg as mentioned by Andrew. Fix comments from
Stephen regarding netdev_priv and sparse related errors of using 0 as NULL.

v8->v9: Fix comments from David - Remove the CFLAG rule. Change the way
rmnet devices are freed. Instead of using a workqueue to unregister devices
individually, go through the list and free all devices within the rtnl_lock().

v9->v10: Fix the locking scheme as mentioned by David. Change comment near
MAP type definition as mentioned by Dan. Refactor some code.

Subash Abhinov Kasiviswanathan (3):
  net: ether: Add support for multiplexing and aggregation type
  net: arp: Add support for raw IP device
  drivers: net: ethernet: qualcomm: rmnet: Initial implementation

 Documentation/networking/rmnet.txt                 |  82 ++++
 drivers/net/ethernet/qualcomm/Kconfig              |   2 +
 drivers/net/ethernet/qualcomm/Makefile             |   2 +
 drivers/net/ethernet/qualcomm/rmnet/Kconfig        |  12 +
 drivers/net/ethernet/qualcomm/rmnet/Makefile       |  10 +
 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 419 +++++++++++++++++++++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h |  56 +++
 .../net/ethernet/qualcomm/rmnet/rmnet_handlers.c   | 271 +++++++++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_handlers.h   |  26 ++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h    |  88 +++++
 .../ethernet/qualcomm/rmnet/rmnet_map_command.c    | 107 ++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_map_data.c   | 105 ++++++
 .../net/ethernet/qualcomm/rmnet/rmnet_private.h    |  45 +++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c    | 170 +++++++++
 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h    |  29 ++
 include/uapi/linux/if_arp.h                        |   1 +
 include/uapi/linux/if_ether.h                      |   3 +
 17 files changed, 1428 insertions(+)
 create mode 100644 Documentation/networking/rmnet.txt
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Kconfig
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/Makefile
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_private.h
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
 create mode 100644 drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h

-- 
1.9.1


* [PATCH net-next 1/3 v10] net: ether: Add support for multiplexing and aggregation type
From: Subash Abhinov Kasiviswanathan @ 2017-08-30  0:47 UTC (permalink / raw)
  To: netdev, davem, fengguang.wu, dcbw, jiri, stephen, David.Laight,
	marcel, andrew
  Cc: Subash Abhinov Kasiviswanathan
In-Reply-To: <1504054078-10173-1-git-send-email-subashab@codeaurora.org>

Define the Qualcomm multiplexing and aggregation (MAP) ether type 0x00F9.
This is needed for receiving data in the MAP protocol like RMNET. This is
not an officially registered ID.
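
As a rough userspace illustration of how a 16-bit protocol value like
this is matched, the sketch below compares a network-byte-order protocol
field against 0x00F9. The helper names is_map_proto() and
is_below_dix_range() are hypothetical and not part of this patch. The
second helper only checks that the value sits below 0x0600, the range the
header reserves for non-DIX types.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Value proposed by this patch; defined locally so the sketch builds
 * against headers that do not carry it.
 */
#ifndef ETH_P_MAP
#define ETH_P_MAP 0x00F9
#endif

/* Values from 0x0600 upward are DIX EtherTypes on the wire. */
#define MODEL_ETH_P_802_3_MIN 0x0600

/* Hypothetical helper: proto_be is in network byte order. */
static int is_map_proto(uint16_t proto_be)
{
	return ntohs(proto_be) == ETH_P_MAP;
}

/* Hypothetical helper: true if the value is in the non-DIX range. */
static int is_below_dix_range(uint16_t proto_host)
{
	return proto_host < MODEL_ETH_P_802_3_MIN;
}
```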

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
 include/uapi/linux/if_ether.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/uapi/linux/if_ether.h b/include/uapi/linux/if_ether.h
index 5bc9bfd..30526db 100644
--- a/include/uapi/linux/if_ether.h
+++ b/include/uapi/linux/if_ether.h
@@ -137,6 +137,9 @@
 #define ETH_P_IEEE802154 0x00F6		/* IEEE802.15.4 frame		*/
 #define ETH_P_CAIF	0x00F7		/* ST-Ericsson CAIF protocol	*/
 #define ETH_P_XDSA	0x00F8		/* Multiplexed DSA protocol	*/
+#define ETH_P_MAP	0x00F9		/* Qualcomm multiplexing and
+					 * aggregation protocol
+					 */
 
 /*
  *	This is an Ethernet frame header.
-- 
1.9.1
