Netdev List
 help / color / mirror / Atom feed
* [PATCH iproute2 v2 1/3] link_iptnl: Kill code duplication
From: Serhey Popovych @ 2018-01-02 21:27 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1514928479-30313-1-git-send-email-serhe.popovych@gmail.com>

Both sit and ipip "mode" parameter handling nearly the same.
Except for sit we have "ip6ip" mode: check it only when
configuring sit.

Note that there is no need strcmp(lu->id, "ipip"): if it is
not sit it is "ipip" because we have only these two link util
defined in module.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
---
 ip/link_iptnl.c |   27 +++++----------------------
 1 file changed, 5 insertions(+), 22 deletions(-)

diff --git a/ip/link_iptnl.c b/ip/link_iptnl.c
index 8a8f5dd..d4d935b 100644
--- a/ip/link_iptnl.c
+++ b/ip/link_iptnl.c
@@ -230,11 +230,11 @@ get_failed:
 		} else if (strcmp(lu->id, "sit") == 0 &&
 			   strcmp(*argv, "isatap") == 0) {
 			iflags |= SIT_ISATAP;
-		} else if (strcmp(lu->id, "sit") == 0 &&
-			   strcmp(*argv, "mode") == 0) {
+		} else if (strcmp(*argv, "mode") == 0) {
 			NEXT_ARG();
-			if (strcmp(*argv, "ipv6/ipv4") == 0 ||
-			    strcmp(*argv, "ip6ip") == 0)
+			if (strcmp(lu->id, "sit") == 0 &&
+			    (strcmp(*argv, "ipv6/ipv4") == 0 ||
+			     strcmp(*argv, "ip6ip") == 0))
 				proto = IPPROTO_IPV6;
 			else if (strcmp(*argv, "ipv4/ipv4") == 0 ||
 				 strcmp(*argv, "ipip") == 0 ||
@@ -248,21 +248,6 @@ get_failed:
 				proto = 0;
 			else
 				invarg("Cannot guess tunnel mode.", *argv);
-		} else if (strcmp(lu->id, "ipip") == 0 &&
-			   strcmp(*argv, "mode") == 0) {
-			NEXT_ARG();
-			if (strcmp(*argv, "ipv4/ipv4") == 0 ||
-				 strcmp(*argv, "ipip") == 0 ||
-				 strcmp(*argv, "ip4ip4") == 0)
-				proto = IPPROTO_IPIP;
-			else if (strcmp(*argv, "mpls/ipv4") == 0 ||
-				   strcmp(*argv, "mplsip") == 0)
-				proto = IPPROTO_MPLS;
-			else if (strcmp(*argv, "any/ipv4") == 0 ||
-				 strcmp(*argv, "any") == 0)
-				proto = 0;
-			else
-				invarg("Cannot guess tunnel mode.", *argv);
 		} else if (strcmp(*argv, "noencap") == 0) {
 			encaptype = TUNNEL_ENCAP_NONE;
 		} else if (strcmp(*argv, "encap") == 0) {
@@ -337,6 +322,7 @@ get_failed:
 		exit(-1);
 	}
 
+	addattr8(n, 1024, IFLA_IPTUN_PROTO, proto);
 	if (metadata) {
 		addattr_l(n, 1024, IFLA_IPTUN_COLLECT_METADATA, NULL, 0);
 		return 0;
@@ -355,9 +341,6 @@ get_failed:
 	addattr16(n, 1024, IFLA_IPTUN_ENCAP_SPORT, htons(encapsport));
 	addattr16(n, 1024, IFLA_IPTUN_ENCAP_DPORT, htons(encapdport));
 
-	if (strcmp(lu->id, "ipip") == 0 || strcmp(lu->id, "sit") == 0)
-		addattr8(n, 1024, IFLA_IPTUN_PROTO, proto);
-
 	if (strcmp(lu->id, "sit") == 0) {
 		addattr16(n, 1024, IFLA_IPTUN_FLAGS, iflags);
 		if (ip6rdprefixlen) {
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH iproute2 v2 0/3] iptnl/tunnel: Open JSON section, kill code duplication and print iptnl mode
From: Serhey Popovych @ 2018-01-02 21:27 UTC (permalink / raw)
  To: netdev

In this series I present following improvements and fixes:

  1) Add missing open_json_context() to link_iptnl.c
     for "encap" options.

  2) Get rid of code duplication when parsing tunnel mode
     parameters in link_iptnl.c.

  3) Add code to link_iptnl.c to print tunnel mode to be
     inline with ip6tnl.

See individual patch description message for details.

This is v2. Fixed proto value for ipip tunnel mode:
"proto ipip" vs "proto ip4ip4" previously.

Thanks,
Serhii

Serhey Popovych (3):
  link_iptnl: Kill code duplication
  link_iptnl: Print tunnel mode
  link_iptnl: Open "encap" JSON object

 ip/link_iptnl.c |   45 +++++++++++++++++++++++----------------------
 1 file changed, 23 insertions(+), 22 deletions(-)

-- 
1.7.10.4

^ permalink raw reply

* Re: [net-next PATCH] net: ptr_ring: otherwise safe empty checks can overrun array bounds
From: John Fastabend @ 2018-01-02 21:27 UTC (permalink / raw)
  To: Michael S. Tsirkin, David Miller
  Cc: jakub.kicinski, xiyou.wangcong, jiri, netdev
In-Reply-To: <20180102191329-mutt-send-email-mst@kernel.org>

On 01/02/2018 09:17 AM, Michael S. Tsirkin wrote:
> On Tue, Jan 02, 2018 at 07:01:33PM +0200, Michael S. Tsirkin wrote:
>> On Tue, Jan 02, 2018 at 11:52:19AM -0500, David Miller wrote:
>>> From: John Fastabend <john.fastabend@gmail.com>
>>> Date: Wed, 27 Dec 2017 19:50:25 -0800
>>>
>>>> When running consumer and/or producer operations and empty checks in
>>>> parallel its possible to have the empty check run past the end of the
>>>> array. The scenario occurs when an empty check is run while
>>>> __ptr_ring_discard_one() is in progress. Specifically after the
>>>> consumer_head is incremented but before (consumer_head >= ring_size)
>>>> check is made and the consumer head is zeroe'd.
>>>>
>>>> To resolve this, without having to rework how consumer/producer ops
>>>> work on the array, simply add an extra dummy slot to the end of the
>>>> array. Even if we did a rework to avoid the extra slot it looks
>>>> like the normal case checks would suffer some so best to just
>>>> allocate an extra pointer.
>>>>
>>>> Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>>>> Fixes: c5ad119fb6c09 ("net: sched: pfifo_fast use skb_array")
>>>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>>>
>>> Applied, thanks John.
>>
>> I think that patch is wrong. I'd rather it got reverted.
> 
> And replaced with something like the following - stil untested, but
> apparently there's some urgency to fix it so posting for review ASAP.
> 

So the above ptr_ring patch is meant for the dequeue() case not
the peek case. Dequeue case shown here,

static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
{
        struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
        struct sk_buff *skb = NULL;
        int band;

        for (band = 0; band < PFIFO_FAST_BANDS && !skb; band++) {
                struct skb_array *q = band2list(priv, band);

                if (__skb_array_empty(q))
                        continue;

                skb = skb_array_consume_bh(q);
        }
        if (likely(skb)) {
                qdisc_qstats_cpu_backlog_dec(qdisc, skb);
                qdisc_bstats_cpu_update(qdisc, skb);
                qdisc_qstats_cpu_qlen_dec(qdisc);
        }

        return skb;
}

In the dequeue case we use it purely as a hint and then do a proper
consume (with locks) if needed. A false negative in this case means
a consume is happening on another cpu due to a reset op or a
dequeue op. (an aside but I'm evaluating if we should only allow a
single dequeue'ing cpu at a time more below?) If its a reset op
that caused the false negative it is OK because we are flushing the
array anyways. If it is a dequeue op it is also OK because this core
will abort but the running core will continue to dequeue avoiding a
stall. So I believe false negatives are OK in the above function.

> John, others, could you pls confirm it's not too bad performance-wise?
> I'll then split it up properly and re-post.
> 

I haven't benchmarked it but in the dequeue case taking a lock for
every priority even when its empty seems unneeded.

> -->
> 
> net: don't misuse ptr_ring_peek
> 
> ptr_ring_peek only claims to be safe if the result is never
> dereferenced, which isn't the case for its use in sch_generic.
> Add locked API variants and use the bh one here.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> ---
> 

[...]

> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -659,7 +659,7 @@ static struct sk_buff *pfifo_fast_peek(struct Qdisc *qdisc)
>  	for (band = 0; band < PFIFO_FAST_BANDS && !skb; band++) {
>  		struct skb_array *q = band2list(priv, band);
>  
> -		skb = __skb_array_peek(q);
> +		skb = skb_array_peek_bh(q);

Ah I should have added a comment here. For now peek() is only used from
locking qdiscs. So peek and consume/produce operations will never happen
in parallel. In this case we should never hit the false negative case with
my patch or the out of bounds reference without my patch.

Doing a peek() op without qdisc lock is a bit problematic anyways. With
current code another cpu could consume the skb and free it. Either we
can ensure a single consumer runs at a time on an array (not the same as
qdisc maybe) or just avoid peek operations in this case. My current plan
was to just avoid peek() ops altogether, they seem unnecessary with the
types of qdiscs I want to be build.

Thanks,
John

>  	}
>  
>  	return skb;
> 

^ permalink raw reply

* Re: [PATCH iproute2 v2 2/3] link_iptnl: Print tunnel mode
From: Stephen Hemminger @ 2018-01-02 21:13 UTC (permalink / raw)
  To: Serhey Popovych; +Cc: netdev
In-Reply-To: <1514926965-22061-1-git-send-email-serhe.popovych@gmail.com>

On Tue,  2 Jan 2018 23:02:45 +0200
Serhey Popovych <serhe.popovych@gmail.com> wrote:

> Tunnel mode does not appear in parameters print for iptnl
> supported tunnels like ipip and sit, while printed for
> ip6tnl.
> 
> Print tunnel mode as "proto" field name for JSON and
> without any name when printing to cli to follow ip6tnl
> behaviour.
> 
> For non JSON output we have:
> 
>    $ ip -d link show dev sit1
> 
> Before:
> -------
> 17: sit1@NONE: <NOARP> mtu 1480 qdisc noop state DOWN ...
>     link/sit X.X.X.X brd 0.0.0.0 promiscuity 0
>     sit remote any local X.X.X.X ...
>         ~~~
> 
> After:
> ------
> 17: sit1@NONE: <NOARP> mtu 1480 qdisc noop state DOWN ...
>     link/sit X.X.X.X brd 0.0.0.0 promiscuity 0
>     sit any remote any local X.X.X.X ...
>         ^^^
> 
> Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
> ---
> v2: Addressed comments: "proto ipip" vs "proto ip4ip4" for
>     IPPROTO_IPIP tunnel type. Add example to message.
> 
>  ip/link_iptnl.c |   17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/ip/link_iptnl.c b/ip/link_iptnl.c
> index d4d935b..b6ef95d 100644
> --- a/ip/link_iptnl.c
> +++ b/ip/link_iptnl.c
> @@ -372,6 +372,23 @@ static void iptunnel_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[
>  	if (tb[IFLA_IPTUN_COLLECT_METADATA])
>  		print_bool(PRINT_ANY, "external", "external ", true);
>  
> +	if (tb[IFLA_IPTUN_PROTO]) {
> +		switch (rta_getattr_u8(tb[IFLA_IPTUN_PROTO])) {
> +		case IPPROTO_IPIP:
> +			print_string(PRINT_ANY, "proto", "%s ", "ipip");
> +			break;
> +		case IPPROTO_IPV6:
> +			print_string(PRINT_ANY, "proto", "%s ", "ip6ip");
> +			break;
> +		case IPPROTO_MPLS:
> +			print_string(PRINT_ANY, "proto", "%s ", "mplsip");
> +			break;
> +		case 0:
> +			print_string(PRINT_ANY, "proto", "%s ", "any");
> +			break;
> +		}
> +	}
> +
>  	if (tb[IFLA_IPTUN_REMOTE]) {
>  		unsigned int addr = rta_getattr_u32(tb[IFLA_IPTUN_REMOTE]);

Thanks for fixing the ip4 value.

When you fix one thing, you need to resend whole patch series.

^ permalink raw reply

* Re: [PATCH] net: ipv4: emulate READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()
From: Stefano Brivio @ 2018-01-02 21:12 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Mohamed Ghannam, Michal Kubecek, Miroslav Benes, netdev,
	linux-kernel
In-Reply-To: <20180102163020.32473-1-nstange@suse.de>

Hi,

On Tue,  2 Jan 2018 17:30:20 +0100
Nicolai Stange <nstange@suse.de> wrote:

> [...]
>
> diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
> index 5b9bd5c33d9d..e84290c28c0c 100644
> --- a/net/ipv4/raw.c
> +++ b/net/ipv4/raw.c
> @@ -513,16 +513,18 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
>  	int err;
>  	struct ip_options_data opt_copy;
>  	struct raw_frag_vec rfv;
> -	int hdrincl;
> +	int hdrincl, __hdrincl;
>  
>  	err = -EMSGSIZE;
>  	if (len > 0xFFFF)
>  		goto out;
>  
>  	/* hdrincl should be READ_ONCE(inet->hdrincl)
> -	 * but READ_ONCE() doesn't work with bit fields
> +	 * but READ_ONCE() doesn't work with bit fields.
> +	 * Emulate it by doing the READ_ONCE() from an intermediate int.
>  	 */
> -	hdrincl = inet->hdrincl;
> +	__hdrincl = inet->hdrincl;
> +	hdrincl = READ_ONCE(__hdrincl);

I guess you don't actually need to use a third variable. What about
doing READ_ONCE() on hdrincl itself after the first assignment?

Perhaps something like the patch below -- applies to net.git, yields
same binary output as your version with gcc 6, looks IMHO more
straightforward:

diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 125c1eab3eaa..8c2f783a95fc 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -519,10 +519,12 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	if (len > 0xFFFF)
 		goto out;
 
-	/* hdrincl should be READ_ONCE(inet->hdrincl)
-	 * but READ_ONCE() doesn't work with bit fields
+	/* hdrincl should be READ_ONCE(inet->hdrincl) but READ_ONCE() doesn't
+	 * work with bit fields. Emulate it by adding a further sequence point.
 	 */
 	hdrincl = inet->hdrincl;
+	hdrincl = READ_ONCE(hdrincl);
+
 	/*
 	 *	Check the flags.
 	 */


-- 
Stefano

^ permalink raw reply related

* [PATCH iproute2 v2 2/3] link_iptnl: Print tunnel mode
From: Serhey Popovych @ 2018-01-02 21:02 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20180102115413.37666ab5@xeon-e3>

Tunnel mode does not appear in parameters print for iptnl
supported tunnels like ipip and sit, while printed for
ip6tnl.

Print tunnel mode as "proto" field name for JSON and
without any name when printing to cli to follow ip6tnl
behaviour.

For non JSON output we have:

   $ ip -d link show dev sit1

Before:
-------
17: sit1@NONE: <NOARP> mtu 1480 qdisc noop state DOWN ...
    link/sit X.X.X.X brd 0.0.0.0 promiscuity 0
    sit remote any local X.X.X.X ...
        ~~~

After:
------
17: sit1@NONE: <NOARP> mtu 1480 qdisc noop state DOWN ...
    link/sit X.X.X.X brd 0.0.0.0 promiscuity 0
    sit any remote any local X.X.X.X ...
        ^^^

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
---
v2: Addressed comments: "proto ipip" vs "proto ip4ip4" for
    IPPROTO_IPIP tunnel type. Add example to message.

 ip/link_iptnl.c |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/ip/link_iptnl.c b/ip/link_iptnl.c
index d4d935b..b6ef95d 100644
--- a/ip/link_iptnl.c
+++ b/ip/link_iptnl.c
@@ -372,6 +372,23 @@ static void iptunnel_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[
 	if (tb[IFLA_IPTUN_COLLECT_METADATA])
 		print_bool(PRINT_ANY, "external", "external ", true);
 
+	if (tb[IFLA_IPTUN_PROTO]) {
+		switch (rta_getattr_u8(tb[IFLA_IPTUN_PROTO])) {
+		case IPPROTO_IPIP:
+			print_string(PRINT_ANY, "proto", "%s ", "ipip");
+			break;
+		case IPPROTO_IPV6:
+			print_string(PRINT_ANY, "proto", "%s ", "ip6ip");
+			break;
+		case IPPROTO_MPLS:
+			print_string(PRINT_ANY, "proto", "%s ", "mplsip");
+			break;
+		case 0:
+			print_string(PRINT_ANY, "proto", "%s ", "any");
+			break;
+		}
+	}
+
 	if (tb[IFLA_IPTUN_REMOTE]) {
 		unsigned int addr = rta_getattr_u32(tb[IFLA_IPTUN_REMOTE]);
 
-- 
1.7.10.4

^ permalink raw reply related

* Re: [PATCH net-next 2/2] l2tp: add peer_offset parameter
From: James Chapman @ 2018-01-02 20:59 UTC (permalink / raw)
  To: Guillaume Nault, Lorenzo Bianconi, David S. Miller; +Cc: netdev, Hangbin Liu
In-Reply-To: <f196609a-eeb4-5fd9-4c56-9b4c17547fe2@katalix.com>

On 02/01/18 20:08, James Chapman wrote:
> On 02/01/18 18:05, Guillaume Nault wrote:
>>>> Lorenzo, is this being added to fix interoperability with another 
>>>> L2TPv3
>>>> implementation? If so, can you share more details?
>>>>
>>> Hi James,
>>>
>>> I introduced peer_offset parameter to fix a specific setup where
>>> tunnel endpoints
>>> running L2TPv3 would use different values for tx offset (since in
>>> iproute2 there is no
>>> restriction on it), not to fix a given an interoperability issue.
>>>
>> Yes, but was it just to test iproute2's peer_offset option? Or is there
>> a plan to use it for real?
>>
>>> I introduced this feature since:
>>>   - offset has been added for long time to L2TPv3 implementation
>>>     (commit f7faffa3ff8ef6ae712ef16312b8a2aa7a1c95fe and
>>>     commit 309795f4bec2d69cd507a631f82065c2198a0825) and I wanted to
>>> preserve UABI
>>>   - have the same degree of freedom for offset parameter we have in
>>> L2TPv2 and fix the issue
>>>     described above
>>>
>> AFAIU, the current L2TPv2 implementation never sets the offset field
>> and nobody ever realised.
>>
>>> Now what we can do I guess is:
>>> - as suggested by Guillaume drop completely the offset support 
>>> without removing
>>>    netlink attribute in order to not break UABI
>>> - fix offset support initializing properly padding bits
>>>
>> I'd go for the first one. I just wonder if that looks acceptable to
>> David an James.
>
> I think the first one too. Also update iproute2 to remove or hide the 
> offset and peer_offset parameters.
>
>
I just realised the peer_offset attribute changes are already applied in 
net-next. (I missed these when they were submitted just before 
Christmas.) Should these commits be reverted? We probably don't want 
v4.15 to get an additional l2tp peer_offset attribute if we are going to 
remove it and the rest of the code supporting configurable offset 
attributes in the next release.

81487bf Merge branch 'l2tp-next'
f15bc54 l2tp: add peer_offset parameter
820da53 l2tp: fix missing print session offset info

^ permalink raw reply

* [PATCH] ethernet: cpsw-phy-sel: Delete an error message for a failed memory allocation in cpsw_phy_sel_probe()
From: SF Markus Elfring @ 2018-01-02 20:48 UTC (permalink / raw)
  To: netdev, linux-omap, Grygorii Strashko, Mugunthan V N
  Cc: LKML, kernel-janitors, Daniel Mack

From: Markus Elfring <elfring@users.sourceforge.net>
Date: Tue, 2 Jan 2018 21:41:25 +0100

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
---
 drivers/net/ethernet/ti/cpsw-phy-sel.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw-phy-sel.c b/drivers/net/ethernet/ti/cpsw-phy-sel.c
index 18013645e76c..d29bc349bfc4 100644
--- a/drivers/net/ethernet/ti/cpsw-phy-sel.c
+++ b/drivers/net/ethernet/ti/cpsw-phy-sel.c
@@ -213,10 +213,8 @@ static int cpsw_phy_sel_probe(struct platform_device *pdev)
 		return -EINVAL;
 
 	priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
-	if (!priv) {
-		dev_err(&pdev->dev, "unable to alloc memory for cpsw phy sel\n");
+	if (!priv)
 		return -ENOMEM;
-	}
 
 	priv->dev = &pdev->dev;
 	priv->cpsw_phy_sel = of_id->data;
-- 
2.15.1

^ permalink raw reply related

* Re: [PATCH net 3/3] eet: ena: invoke netif_carrier_off() only after netdev registered
From: Belgazal, Netanel @ 2018-01-02 20:38 UTC (permalink / raw)
  To: David Miller
  Cc: netdev@vger.kernel.org, Woodhouse, David, Machulsky, Zorik,
	Matushevsky, Alexander, Bshara, Saeed, Wilson, Matt,
	Liguori, Anthony, Bshara, Nafea, Schmeilin, Evgeny, Tzalik, Guy
In-Reply-To: <20180102.140804.982695913961893372.davem@davemloft.net>

Right.
I’ll remove this patch.

On 1/2/18, 9:08 PM, "David Miller" <davem@davemloft.net> wrote:

    From: <netanel@amazon.com>
    Date: Thu, 28 Dec 2017 21:30:20 +0000
    
    > From: Netanel Belgazal <netanel@amazon.com>
    > 
    > netif_carrier_off() should be called only after register netdev.
    > Move the function's call after the registration.
    > 
    > Signed-off-by: Netanel Belgazal <netanel@amazon.com>
    > ---
    >  drivers/net/ethernet/amazon/ena/ena_netdev.c | 4 ++--
    >  1 file changed, 2 insertions(+), 2 deletions(-)
    > 
    > diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
    > index fbe21a817bd8..ee50c56765a4 100644
    > --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
    > +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
    > @@ -3276,14 +3276,14 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
    >  
    >  	memcpy(adapter->netdev->perm_addr, adapter->mac_addr, netdev->addr_len);
    >  
    > -	netif_carrier_off(netdev);
    > -
    >  	rc = register_netdev(netdev);
    >  	if (rc) {
    >  		dev_err(&pdev->dev, "Cannot register net device\n");
    >  		goto err_rss;
    >  	}
    >  
    > +	netif_carrier_off(netdev);
    > +
    
    You cannot invoke this after register_netdev(), asynchronous activity can cause this
    call to lose information and lose a link up event.
    
    


^ permalink raw reply

* Re: [net-next] netfilter: add segment routing header 'srh' match
From: Ahmed Abdelsalam @ 2018-01-02 20:37 UTC (permalink / raw)
  To: David Lebrun
  Cc: pablo, kadlec, fw, kuznet, linux-kernel, netfilter-devel,
	coreteam, netdev, amsalam20
In-Reply-To: <1514545672-13827-1-git-send-email-amsalam20@gmail.com>

On Fri, 29 Dec 2017 12:07:52 +0100
Ahmed Abdelsalam <amsalam20@gmail.com> wrote:

> It allows matching packets based on Segment Routing Header
> (SRH) information.
> The implementation considers revision 7 of the SRH draft.
> https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-07
> 
> Currently supported match options include:
> (1) Next Header
> (2) Hdr Ext Len
> (3) Segments Left
> (4) Last Entry
> (5) Tag value of SRH
> 
> Signed-off-by: Ahmed Abdelsalam <amsalam20@gmail.com>
> ---
>  include/uapi/linux/netfilter_ipv6/ip6t_srh.h |  63 ++++++++++
>  net/ipv6/netfilter/Kconfig                   |   9 ++
>  net/ipv6/netfilter/Makefile                  |   1 +
>  net/ipv6/netfilter/ip6t_srh.c                | 165 +++++++++++++++++++++++++++
>  4 files changed, 238 insertions(+)
>  create mode 100644 include/uapi/linux/netfilter_ipv6/ip6t_srh.h
>  create mode 100644 net/ipv6/netfilter/ip6t_srh.c
> 
> diff --git a/include/uapi/linux/netfilter_ipv6/ip6t_srh.h b/include/uapi/linux/netfilter_ipv6/ip6t_srh.h
> new file mode 100644
> index 0000000..1b5dbd8
> --- /dev/null
> +++ b/include/uapi/linux/netfilter_ipv6/ip6t_srh.h
> @@ -0,0 +1,63 @@
> +/**
> + * Definitions for Segment Routing Header 'srh' match
> + *
> + * Author:
> + *       Ahmed Abdelsalam       <amsalam20@gmail.com>
> + */
> +
> +#ifndef _IP6T_SRH_H
> +#define _IP6T_SRH_H
> +
> +#include <linux/types.h>
> +#include <linux/netfilter.h>
> +
> +/* Values for "mt_flags" field in struct ip6t_srh */
> +#define IP6T_SRH_NEXTHDR        0x0001
> +#define IP6T_SRH_LEN_EQ         0x0002
> +#define IP6T_SRH_LEN_GT         0x0004
> +#define IP6T_SRH_LEN_LT         0x0008
> +#define IP6T_SRH_SEGS_EQ        0x0010
> +#define IP6T_SRH_SEGS_GT        0x0020
> +#define IP6T_SRH_SEGS_LT        0x0040
> +#define IP6T_SRH_LAST_EQ        0x0080
> +#define IP6T_SRH_LAST_GT        0x0100
> +#define IP6T_SRH_LAST_LT        0x0200
> +#define IP6T_SRH_TAG            0x0400
> +#define IP6T_SRH_MASK           0x07FF
> +
> +/* Values for "mt_invflags" field in struct ip6t_srh */
> +#define IP6T_SRH_INV_NEXTHDR    0x0001
> +#define IP6T_SRH_INV_LEN_EQ     0x0002
> +#define IP6T_SRH_INV_LEN_GT     0x0004
> +#define IP6T_SRH_INV_LEN_LT     0x0008
> +#define IP6T_SRH_INV_SEGS_EQ    0x0010
> +#define IP6T_SRH_INV_SEGS_GT    0x0020
> +#define IP6T_SRH_INV_SEGS_LT    0x0040
> +#define IP6T_SRH_INV_LAST_EQ    0x0080
> +#define IP6T_SRH_INV_LAST_GT    0x0100
> +#define IP6T_SRH_INV_LAST_LT    0x0200
> +#define IP6T_SRH_INV_TAG        0x0400
> +#define IP6T_SRH_INV_MASK       0x07FF
> +
> +/**
> + *      struct ip6t_srh - SRH match options
> + *      @ next_hdr: Next header field of SRH
> + *      @ hdr_len: Extension header length field of SRH
> + *      @ segs_left: Segments left field of SRH
> + *      @ last_entry: Last entry field of SRH
> + *      @ tag: Tag field of SRH
> + *      @ mt_flags: match options
> + *      @ mt_invflags: Invert the sense of match options
> + */
> +
> +struct ip6t_srh {
> +	__u8                    next_hdr;
> +	__u8                    hdr_len;
> +	__u8                    segs_left;
> +	__u8                    last_entry;
> +	__u16                   tag;
> +	__u16                   mt_flags;
> +	__u16                   mt_invflags;
> +};
> +
> +#endif /*_IP6T_SRH_H*/
> diff --git a/net/ipv6/netfilter/Kconfig b/net/ipv6/netfilter/Kconfig
> index 6acb2ee..e1818eb 100644
> --- a/net/ipv6/netfilter/Kconfig
> +++ b/net/ipv6/netfilter/Kconfig
> @@ -232,6 +232,15 @@ config IP6_NF_MATCH_RT
>  
>  	  To compile it as a module, choose M here.  If unsure, say N.
>  
> +config IP6_NF_MATCH_SRH
> +        tristate '"srh" Segment Routing header match support'
> +        depends on NETFILTER_ADVANCED
> +        help
> +          srh matching allows you to match packets based on the segment
> +	  routing header of the packet.
> +
> +          To compile it as a module, choose M here.  If unsure, say N.
> +
>  # The targets
>  config IP6_NF_TARGET_HL
>  	tristate '"HL" hoplimit target support'
> diff --git a/net/ipv6/netfilter/Makefile b/net/ipv6/netfilter/Makefile
> index c6ee0cd..e0d51a9 100644
> --- a/net/ipv6/netfilter/Makefile
> +++ b/net/ipv6/netfilter/Makefile
> @@ -54,6 +54,7 @@ obj-$(CONFIG_IP6_NF_MATCH_MH) += ip6t_mh.o
>  obj-$(CONFIG_IP6_NF_MATCH_OPTS) += ip6t_hbh.o
>  obj-$(CONFIG_IP6_NF_MATCH_RPFILTER) += ip6t_rpfilter.o
>  obj-$(CONFIG_IP6_NF_MATCH_RT) += ip6t_rt.o
> +obj-$(CONFIG_IP6_NF_MATCH_SRH) += ip6t_srh.o
>  
>  # targets
>  obj-$(CONFIG_IP6_NF_TARGET_MASQUERADE) += ip6t_MASQUERADE.o
> diff --git a/net/ipv6/netfilter/ip6t_srh.c b/net/ipv6/netfilter/ip6t_srh.c
> new file mode 100644
> index 0000000..75e41dc9
> --- /dev/null
> +++ b/net/ipv6/netfilter/ip6t_srh.c
> @@ -0,0 +1,165 @@
> +/*
> + * Kernel module to match Segment Routing Header (SRH) parameters.
> + *
> + * Author:
> + * Ahmed Abdelsalam <amsalam20@gmail.com>
> + *
> + *
> + *  This program is free software; you can redistribute it and/or
> + *	modify it under the terms of the GNU General Public License
> + *	as published by the Free Software Foundation; either version 2
> + *	of the License, or (at your option) any later version.
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +#include <linux/module.h>
> +#include <linux/skbuff.h>
> +#include <linux/ipv6.h>
> +#include <linux/types.h>
> +#include <net/ipv6.h>
> +#include <net/seg6.h>
> +
> +#include <linux/netfilter/x_tables.h>
> +#include <linux/netfilter_ipv6/ip6t_srh.h>
> +#include <linux/netfilter_ipv6/ip6_tables.h>
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Xtables: IPv6 Segment Routing Header match");
> +MODULE_AUTHOR("Ahmed Abdelsalam  <amsalam20@gmail.com>");
> +
> +/* Test a struct->mt_invflags and a boolean for inequality */
> +#define NF_SRH_INVF(ptr, flag, boolean)	\
> +	((boolean) ^ !!((ptr)->mt_invflags & (flag)))
> +
> +static bool srh_mt6(const struct sk_buff *skb, struct xt_action_param *par)
> +{
> +	const struct ip6t_srh *srhinfo = par->matchinfo;
> +	struct ipv6_sr_hdr *srh;
> +	struct ipv6_sr_hdr _srh;
> +	int hdrlen, srhoff = 0;
> +
> +	if (ipv6_find_hdr(skb, &srhoff, IPPROTO_ROUTING, NULL, NULL) < 0)
> +		return false;
> +
> +	srh = skb_header_pointer(skb, srhoff, sizeof(_srh), &_srh);
> +
> +	if (!srh)
> +		return false;
> +
> +	hdrlen = ipv6_optlen(srh);
> +	if (skb->len - srhoff < hdrlen)
> +		return false;
> +
> +	if (srh->type != IPV6_SRCRT_TYPE_4)
> +		return false;
> +
> +	if (srh->segments_left > srh->first_segment)
> +		return false;
> +
> +	/* Next Header matching */
> +	if (srhinfo->mt_flags & IP6T_SRH_NEXTHDR)
> +		if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_NEXTHDR,
> +				!(srh->nexthdr == srhinfo->next_hdr)))
> +			return false;
> +
> +	/* Header Extension Length matching */
> +	if (srhinfo->mt_flags & IP6T_SRH_LEN_EQ)
> +		if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_LEN_EQ,
> +				!(srh->hdrlen == srhinfo->hdr_len)))
> +			return false;
> +
> +	if (srhinfo->mt_flags & IP6T_SRH_LEN_GT)
> +		if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_LEN_GT,
> +				!(srh->hdrlen > srhinfo->hdr_len)))
> +			return false;
> +
> +	if (srhinfo->mt_flags & IP6T_SRH_LEN_LT)
> +		if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_LEN_LT,
> +				!(srh->hdrlen < srhinfo->hdr_len)))
> +			return false;
> +
> +	/* Segments Left matching */
> +	if (srhinfo->mt_flags & IP6T_SRH_SEGS_EQ)
> +		if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_SEGS_EQ,
> +				!(srh->segments_left == srhinfo->segs_left)))
> +			return false;
> +
> +	if (srhinfo->mt_flags & IP6T_SRH_SEGS_GT)
> +		if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_SEGS_GT,
> +				!(srh->segments_left > srhinfo->segs_left)))
> +			return false;
> +
> +	if (srhinfo->mt_flags & IP6T_SRH_SEGS_LT)
> +		if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_SEGS_LT,
> +				!(srh->segments_left < srhinfo->segs_left)))
> +			return false;
> +
> +	/**
> +	 * Last Entry matching
> +	 * Last_Entry field was introduced in revision 6 of the SRH draft.
> +	 * It was called First_Segment in the previous revision
> +	 */
> +	if (srhinfo->mt_flags & IP6T_SRH_LAST_EQ)
> +		if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_LAST_EQ,
> +				!(srh->first_segment == srhinfo->last_entry)))
> +			return false;
> +
> +	if (srhinfo->mt_flags & IP6T_SRH_LAST_GT)
> +		if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_LAST_GT,
> +				!(srh->first_segment > srhinfo->last_entry)))
> +			return false;
> +
> +	if (srhinfo->mt_flags & IP6T_SRH_LAST_LT)
> +		if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_LAST_LT,
> +				!(srh->first_segment < srhinfo->last_entry)))
> +			return false;
> +
> +	/**
> +	 * Tag matchig
> +	 * Tag field was introduced in revision 6 of the SRH draft.
> +	 */
> +	if (srhinfo->mt_flags & IP6T_SRH_TAG)
> +		if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_TAG,
> +				!(srh->tag == srhinfo->tag)))
> +			return false;
> +	return true;
> +}
> +
> +static int srh_mt6_check(const struct xt_mtchk_param *par)
> +{
> +	const struct ip6t_srh *srhinfo = par->matchinfo;
> +
> +	if (srhinfo->mt_flags & ~IP6T_SRH_MASK) {
> +		pr_debug("unknown srh match flags  %X\n", srhinfo->mt_flags);
> +		return -EINVAL;
> +	}
> +
> +	if (srhinfo->mt_invflags & ~IP6T_SRH_INV_MASK) {
> +		pr_debug("unknown srh invflags %X\n", srhinfo->mt_invflags);
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static struct xt_match srh_mt6_reg __read_mostly = {
> +	.name		= "srh",
> +	.family		= NFPROTO_IPV6,
> +	.match		= srh_mt6,
> +	.matchsize	= sizeof(struct ip6t_srh),
> +	.checkentry	= srh_mt6_check,
> +	.me		= THIS_MODULE,
> +};
> +
> +static int __init srh_mt6_init(void)
> +{
> +	return xt_register_match(&srh_mt6_reg);
> +}
> +
> +static void __exit srh_mt6_exit(void)
> +{
> +	xt_unregister_match(&srh_mt6_reg);
> +}
> +
> +module_init(srh_mt6_init);
> +module_exit(srh_mt6_exit);
> -- 
> 2.1.4
> 
David, 
Any comment on this patch ?
-- 
Ahmed 

^ permalink raw reply

* Re: Bluetooth: Prevent stack info leak from the EFS element.
From: Guenter Roeck @ 2018-01-02 20:26 UTC (permalink / raw)
  To: gregkh@linuxfoundation.org
  Cc: Marcel Holtmann, Gustavo Padovan, Johan Hedberg, Ben Seri, netdev,
	linux-kernel
In-Reply-To: <20171208141447.GA5404@kroah.com>

On Fri, Dec 08, 2017 at 03:14:47PM +0100, gregkh@linuxfoundation.org wrote:
> From: Ben Seri <ben@armis.com>
> 
> In the function l2cap_parse_conf_rsp and in the function
> l2cap_parse_conf_req the following variable is declared without
> initialization:
> 
> struct l2cap_conf_efs efs;
> 
> In addition, when parsing input configuration parameters in both of
> these functions, the switch case for handling EFS elements may skip the
> memcpy call that will write to the efs variable:
> 
> ...
> case L2CAP_CONF_EFS:
> if (olen == sizeof(efs))
> memcpy(&efs, (void *)val, olen);
> ...
> 
> The olen in the above if is attacker controlled, and regardless of that
> if, in both of these functions the efs variable would eventually be
> added to the outgoing configuration request that is being built:
> 
> l2cap_add_conf_opt(&ptr, L2CAP_CONF_EFS, sizeof(efs), (unsigned long) &efs);
> 
> So by sending a configuration request, or response, that contains an
> L2CAP_CONF_EFS element, but with an element length that is not
> sizeof(efs) - the memcpy to the uninitialized efs variable can be
> avoided, and the uninitialized variable would be returned to the
> attacker (16 bytes).
> 
> This issue has been assigned CVE-2017-1000410
> 
> Cc: Marcel Holtmann <marcel@holtmann.org>
> Cc: Gustavo Padovan <gustavo@padovan.org>
> Cc: Johan Hedberg <johan.hedberg@gmail.com>
> Cc: stable <stable@vger.kernel.org>
> Signed-off-by: Ben Seri <ben@armis.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  net/bluetooth/l2cap_core.c |   20 +++++++++++---------
>  1 file changed, 11 insertions(+), 9 deletions(-)
> 
> Marcel, for some reason this patch never got applied, despite lots of
> advance notice.  Please, someone queue it up as it resolves the above
> very-well-reported issue.
> 

This patch is still not upstream or in -next. Given that we (ChromeOS)
are heavy Bluetooth users, I'll go ahead and apply it without waiting
any longer. For my understanding and for tracking purposes, it would
be useful to know why it is not being applied. Does anyone know ?

Thanks,
Guenter

> diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> index 43ba91c440bc..fc6615d59165 100644
> --- a/net/bluetooth/l2cap_core.c
> +++ b/net/bluetooth/l2cap_core.c
> @@ -3363,9 +3363,10 @@ static int l2cap_parse_conf_req(struct l2cap_chan *chan, void *data, size_t data
>  			break;
>  
>  		case L2CAP_CONF_EFS:
> -			remote_efs = 1;
> -			if (olen == sizeof(efs))
> +			if (olen == sizeof(efs)) {
> +				remote_efs = 1;
>  				memcpy(&efs, (void *) val, olen);
> +			}
>  			break;
>  
>  		case L2CAP_CONF_EWS:
> @@ -3584,16 +3585,17 @@ static int l2cap_parse_conf_rsp(struct l2cap_chan *chan, void *rsp, int len,
>  			break;
>  
>  		case L2CAP_CONF_EFS:
> -			if (olen == sizeof(efs))
> +			if (olen == sizeof(efs)) {
>  				memcpy(&efs, (void *)val, olen);
>  
> -			if (chan->local_stype != L2CAP_SERV_NOTRAFIC &&
> -			    efs.stype != L2CAP_SERV_NOTRAFIC &&
> -			    efs.stype != chan->local_stype)
> -				return -ECONNREFUSED;
> +				if (chan->local_stype != L2CAP_SERV_NOTRAFIC &&
> +				    efs.stype != L2CAP_SERV_NOTRAFIC &&
> +				    efs.stype != chan->local_stype)
> +					return -ECONNREFUSED;
>  
> -			l2cap_add_conf_opt(&ptr, L2CAP_CONF_EFS, sizeof(efs),
> -					   (unsigned long) &efs, endptr - ptr);
> +				l2cap_add_conf_opt(&ptr, L2CAP_CONF_EFS, sizeof(efs),
> +						   (unsigned long) &efs, endptr - ptr);
> +			}
>  			break;
>  
>  		case L2CAP_CONF_FCS:

^ permalink raw reply

* Re: [net-next] ipv6: sr: export some functions of seg6local
From: Ahmed Abdelsalam @ 2018-01-02 20:25 UTC (permalink / raw)
  To: David Lebrun; +Cc: davem, kuznet, yoshfuji, david.lebrun, netdev
In-Reply-To: <744656e9-b277-2915-e575-aea8b2c083be@gmail.com>

Ok! 
Thanks David
On Tue, 2 Jan 2018 20:11:48 +0000
David Lebrun <dav.lebrun@gmail.com> wrote:

> On 12/29/2017 09:09 PM, Ahmed Abdelsalam wrote:
> > Some functions of seg6local are very useful to process SRv6
> > encapsulated packets.
> > 
> > This patch exports some functions of seg6local that are useful and
> > can be re-used at different parts of the kernel.
> > 
> > The set of exported functions are:
> > (1) get_srh()
> > (2) advance_nextseg()
> > (3) lookup_nexthop
> 
> If you want to export those functions, it's better to prefix them with
> e.g. seg6_:
> 
> seg6_get_srh()
> seg6_advance_nextseg()
> seg6_lookup_nexthop()
> 
> David


-- 
Ahmed Abdelsalam <amsalam20@gmail.com>

^ permalink raw reply

* [net 1/2] e1000: fix disabling already-disabled warning
From: Jeff Kirsher @ 2018-01-02 20:20 UTC (permalink / raw)
  To: davem; +Cc: Tushar Dave, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20180102202045.72780-1-jeffrey.t.kirsher@intel.com>

From: Tushar Dave <tushar.n.dave@oracle.com>

This patch adds check so that driver does not disable already
disabled device.

[   44.637743] advantechwdt: Unexpected close, not stopping watchdog!
[   44.997548] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input6
[   45.013419] e1000 0000:00:03.0: disabling already-disabled device
[   45.013447] ------------[ cut here ]------------
[   45.014868] WARNING: CPU: 1 PID: 71 at drivers/pci/pci.c:1641 pci_disable_device+0xa1/0x105:
						pci_disable_device at drivers/pci/pci.c:1640
[   45.016171] CPU: 1 PID: 71 Comm: rcu_perf_shutdo Not tainted 4.14.0-01330-g3c07399 #1
[   45.017197] task: ffff88011bee9e40 task.stack: ffffc90000860000
[   45.017987] RIP: 0010:pci_disable_device+0xa1/0x105:
						pci_disable_device at drivers/pci/pci.c:1640
[   45.018603] RSP: 0000:ffffc90000863e30 EFLAGS: 00010286
[   45.019282] RAX: 0000000000000035 RBX: ffff88013a230008 RCX: 0000000000000000
[   45.020182] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000203
[   45.021084] RBP: ffff88013a3f31e8 R08: 0000000000000001 R09: 0000000000000000
[   45.021986] R10: ffffffff827ec29c R11: 0000000000000002 R12: 0000000000000001
[   45.022946] R13: ffff88013a230008 R14: ffff880117802b20 R15: ffffc90000863e8f
[   45.023842] FS:  0000000000000000(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
[   45.024863] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   45.025583] CR2: ffffc900006d4000 CR3: 000000000220f000 CR4: 00000000000006a0
[   45.026478] Call Trace:
[   45.026811]  __e1000_shutdown+0x1d4/0x1e2:
						__e1000_shutdown at drivers/net/ethernet/intel/e1000/e1000_main.c:5162
[   45.027344]  ? rcu_perf_cleanup+0x2a1/0x2a1:
						rcu_perf_shutdown at kernel/rcu/rcuperf.c:627
[   45.027883]  e1000_shutdown+0x14/0x3a:
						e1000_shutdown at drivers/net/ethernet/intel/e1000/e1000_main.c:5235
[   45.028351]  device_shutdown+0x110/0x1aa:
						device_shutdown at drivers/base/core.c:2807
[   45.028858]  kernel_power_off+0x31/0x64:
						kernel_power_off at kernel/reboot.c:260
[   45.029343]  rcu_perf_shutdown+0x9b/0xa7:
						rcu_perf_shutdown at kernel/rcu/rcuperf.c:637
[   45.029852]  ? __wake_up_common_lock+0xa2/0xa2:
						autoremove_wake_function at kernel/sched/wait.c:376
[   45.030414]  kthread+0x126/0x12e:
						kthread at kernel/kthread.c:233
[   45.030834]  ? __kthread_bind_mask+0x8e/0x8e:
						kthread at kernel/kthread.c:190
[   45.031399]  ? ret_from_fork+0x1f/0x30:
						ret_from_fork at arch/x86/entry/entry_64.S:443
[   45.031883]  ? kernel_init+0xa/0xf5:
						kernel_init at init/main.c:997
[   45.032325]  ret_from_fork+0x1f/0x30:
						ret_from_fork at arch/x86/entry/entry_64.S:443
[   45.032777] Code: 00 48 85 ed 75 07 48 8b ab a8 00 00 00 48 8d bb 98 00 00 00 e8 aa d1 11 00 48 89 ea 48 89 c6 48 c7 c7 d8 e4 0b 82 e8 55 7d da ff <0f> ff b9 01 00 00 00 31 d2 be 01 00 00 00 48 c7 c7 f0 b1 61 82
[   45.035222] ---[ end trace c257137b1b1976ef ]---
[   45.037838] ACPI: Preparing to enter system sleep state S5

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000/e1000.h      |  3 ++-
 drivers/net/ethernet/intel/e1000/e1000_main.c | 27 ++++++++++++++++++++++-----
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000.h b/drivers/net/ethernet/intel/e1000/e1000.h
index d7bdea79e9fa..8fd2458060a0 100644
--- a/drivers/net/ethernet/intel/e1000/e1000.h
+++ b/drivers/net/ethernet/intel/e1000/e1000.h
@@ -331,7 +331,8 @@ struct e1000_adapter {
 enum e1000_state_t {
 	__E1000_TESTING,
 	__E1000_RESETTING,
-	__E1000_DOWN
+	__E1000_DOWN,
+	__E1000_DISABLED
 };
 
 #undef pr_fmt
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 1982f7917a8d..3dd4aeb2706d 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -945,7 +945,7 @@ static int e1000_init_hw_struct(struct e1000_adapter *adapter,
 static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	struct net_device *netdev;
-	struct e1000_adapter *adapter;
+	struct e1000_adapter *adapter = NULL;
 	struct e1000_hw *hw;
 
 	static int cards_found;
@@ -955,6 +955,7 @@ static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	u16 tmp = 0;
 	u16 eeprom_apme_mask = E1000_EEPROM_APME;
 	int bars, need_ioport;
+	bool disable_dev = false;
 
 	/* do not allocate ioport bars when not needed */
 	need_ioport = e1000_is_need_ioport(pdev);
@@ -1259,11 +1260,13 @@ static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	iounmap(hw->ce4100_gbe_mdio_base_virt);
 	iounmap(hw->hw_addr);
 err_ioremap:
+	disable_dev = !test_and_set_bit(__E1000_DISABLED, &adapter->flags);
 	free_netdev(netdev);
 err_alloc_etherdev:
 	pci_release_selected_regions(pdev, bars);
 err_pci_reg:
-	pci_disable_device(pdev);
+	if (!adapter || disable_dev)
+		pci_disable_device(pdev);
 	return err;
 }
 
@@ -1281,6 +1284,7 @@ static void e1000_remove(struct pci_dev *pdev)
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct e1000_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
+	bool disable_dev;
 
 	e1000_down_and_stop(adapter);
 	e1000_release_manageability(adapter);
@@ -1299,9 +1303,11 @@ static void e1000_remove(struct pci_dev *pdev)
 		iounmap(hw->flash_address);
 	pci_release_selected_regions(pdev, adapter->bars);
 
+	disable_dev = !test_and_set_bit(__E1000_DISABLED, &adapter->flags);
 	free_netdev(netdev);
 
-	pci_disable_device(pdev);
+	if (disable_dev)
+		pci_disable_device(pdev);
 }
 
 /**
@@ -5156,7 +5162,8 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool *enable_wake)
 	if (netif_running(netdev))
 		e1000_free_irq(adapter);
 
-	pci_disable_device(pdev);
+	if (!test_and_set_bit(__E1000_DISABLED, &adapter->flags))
+		pci_disable_device(pdev);
 
 	return 0;
 }
@@ -5200,6 +5207,10 @@ static int e1000_resume(struct pci_dev *pdev)
 		pr_err("Cannot enable PCI device from suspend\n");
 		return err;
 	}
+
+	/* flush memory to make sure state is correct */
+	smp_mb__before_atomic();
+	clear_bit(__E1000_DISABLED, &adapter->flags);
 	pci_set_master(pdev);
 
 	pci_enable_wake(pdev, PCI_D3hot, 0);
@@ -5274,7 +5285,9 @@ static pci_ers_result_t e1000_io_error_detected(struct pci_dev *pdev,
 
 	if (netif_running(netdev))
 		e1000_down(adapter);
-	pci_disable_device(pdev);
+
+	if (!test_and_set_bit(__E1000_DISABLED, &adapter->flags))
+		pci_disable_device(pdev);
 
 	/* Request a slot slot reset. */
 	return PCI_ERS_RESULT_NEED_RESET;
@@ -5302,6 +5315,10 @@ static pci_ers_result_t e1000_io_slot_reset(struct pci_dev *pdev)
 		pr_err("Cannot re-enable PCI device after reset.\n");
 		return PCI_ERS_RESULT_DISCONNECT;
 	}
+
+	/* flush memory to make sure state is correct */
+	smp_mb__before_atomic();
+	clear_bit(__E1000_DISABLED, &adapter->flags);
 	pci_set_master(pdev);
 
 	pci_enable_wake(pdev, PCI_D3hot, 0);
-- 
2.15.1

^ permalink raw reply related

* [net 2/2] e1000e: Fix e1000_check_for_copper_link_ich8lan return value.
From: Jeff Kirsher @ 2018-01-02 20:20 UTC (permalink / raw)
  To: davem; +Cc: Benjamin Poirier, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <20180102202045.72780-1-jeffrey.t.kirsher@intel.com>

From: Benjamin Poirier <bpoirier@suse.com>

e1000e_check_for_copper_link() and e1000_check_for_copper_link_ich8lan()
are the two functions that may be assigned to mac.ops.check_for_link when
phy.media_type == e1000_media_type_copper. Commit 19110cfbb34d ("e1000e:
Separate signaling for link check/link up") changed the meaning of the
return value of check_for_link for copper media but only adjusted the first
function. This patch adjusts the second function likewise.

Reported-by: Christian Hesse <list@eworm.de>
Reported-by: Gabriel C <nix.or.die@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=198047
Fixes: 19110cfbb34d ("e1000e: Separate signaling for link check/link up")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Tested-by: Christian Hesse <list@eworm.de>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/ich8lan.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
index d6d4ed7acf03..31277d3bb7dc 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
@@ -1367,6 +1367,9 @@ static s32 e1000_disable_ulp_lpt_lp(struct e1000_hw *hw, bool force)
  *  Checks to see of the link status of the hardware has changed.  If a
  *  change in link status has been detected, then we read the PHY registers
  *  to get the current speed/duplex if link exists.
+ *
+ *  Returns a negative error code (-E1000_ERR_*) or 0 (link down) or 1 (link
+ *  up).
  **/
 static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
 {
@@ -1382,7 +1385,7 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
 	 * Change or Rx Sequence Error interrupt.
 	 */
 	if (!mac->get_link_status)
-		return 0;
+		return 1;
 
 	/* First we want to see if the MII Status Register reports
 	 * link.  If so, then we want to get the current speed/duplex
@@ -1613,10 +1616,12 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
 	 * different link partner.
 	 */
 	ret_val = e1000e_config_fc_after_link_up(hw);
-	if (ret_val)
+	if (ret_val) {
 		e_dbg("Error configuring flow control\n");
+		return ret_val;
+	}
 
-	return ret_val;
+	return 1;
 }
 
 static s32 e1000_get_variants_ich8lan(struct e1000_adapter *adapter)
-- 
2.15.1

^ permalink raw reply related

* [net 0/2][pull request] Intel Wired LAN Driver Updates 2018-01-02
From: Jeff Kirsher @ 2018-01-02 20:20 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene

This series contains fixes for e1000 and e1000e.

Tushar Dave adds a check to the driver so that it won't attempt to disable a
device that is already disabled for e1000.

Benjamin Poirier provides a fix to e1000e, where a previous commit that
Benjamin submitted changed the meaning of the return value for
"check_for_link" for copper media and not all the instances were properly
updated.  Benjamin fixes the remaining instances that needed the change.

The following are changes since commit 2758b3e3e630ba304fc4aca434d591e70e528298:
  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue 1GbE

Benjamin Poirier (1):
  e1000e: Fix e1000_check_for_copper_link_ich8lan return value.

Tushar Dave (1):
  e1000: fix disabling already-disabled warning

 drivers/net/ethernet/intel/e1000/e1000.h      |  3 ++-
 drivers/net/ethernet/intel/e1000/e1000_main.c | 27 ++++++++++++++++++++++-----
 drivers/net/ethernet/intel/e1000e/ich8lan.c   | 11 ++++++++---
 3 files changed, 32 insertions(+), 9 deletions(-)

-- 
2.15.1

^ permalink raw reply

* Re: [PATCH net-next 2/2] l2tp: add peer_offset parameter
From: James Chapman @ 2018-01-02 20:18 UTC (permalink / raw)
  To: Lorenzo Bianconi, Guillaume Nault; +Cc: David S. Miller, netdev, Hangbin Liu
In-Reply-To: <CAJ0CqmWhhzvi8HCKjf-xUjkszuNKgnuoD_MXmDNtUtWyEExdNQ@mail.gmail.com>

On 02/01/18 19:28, Lorenzo Bianconi wrote:
>>>> Lorenzo, is this being added to fix interoperability with another L2TPv3
>>>> implementation? If so, can you share more details?
>>>>
>>> Hi James,
>>>
>>> I introduced peer_offset parameter to fix a specific setup where
>>> tunnel endpoints
>>> running L2TPv3 would use different values for tx offset (since in
>>> iproute2 there is no
>>> restriction on it), not to fix a given an interoperability issue.
>>>
>> Yes, but was it just to test iproute2's peer_offset option? Or is there
>> a plan to use it for real?
>>
>>> I introduced this feature since:
>>>   - offset has been added for long time to L2TPv3 implementation
>>>     (commit f7faffa3ff8ef6ae712ef16312b8a2aa7a1c95fe and
>>>     commit 309795f4bec2d69cd507a631f82065c2198a0825) and I wanted to
>>> preserve UABI
>>>   - have the same degree of freedom for offset parameter we have in
>>> L2TPv2 and fix the issue
>>>     described above
>>>
>> AFAIU, the current L2TPv2 implementation never sets the offset field
>> and nobody ever realised.
>>
> Perhaps I am little bit polarized on UABI issue, but I was rethinking
> about it and maybe removing offset parameter would lead to an
> interoperability issue for device running L2TPv3 since offset
> parameter is there and it is not a nope.
> Please consider this setup:
> - 2 endpoint running L2TPv3, the first running net-next and the second
> running 4.14
> - both endpoint are configured using iproute2 in this way:
>
>    - ip l2tp add tunnel local <ip0> remote <ip1> tunnel_id <id0>
> peer_tunnel_id <id1> udp_sport <p0> udp_dport <p1>
>    - ip l2tp add tunnel local <ip1> remote <ip0> tunnel_id <id1>
> peer_tunnel_id <id0> udp_sport <p1> udp_dport <p0>
>    - ip l2tp add session name l2tp0 tunnel_id <id0> session_id <s0>
> peer_session_id <s1> offset 8
>    - ip l2tp add session name l2tp0 tunnel_id <id1> session_id <s1>
> peer_session_id <s0> offset 8
>
> Can we assume offset is never used for L2TPv3?
If offset is ever set non-zero, packets transmitted on the wire are not 
compliant with the L2TPv3 RFC. So if someone is using a non-zero offset 
in their config, it might be a good thing that it stops working with a 
kernel update.

> Regards,
> Lorenzo
>
>>> Now what we can do I guess is:
>>> - as suggested by Guillaume drop completely the offset support without removing
>>>    netlink attribute in order to not break UABI
>>> - fix offset support initializing properly padding bits
>>>
>> I'd go for the first one. I just wonder if that looks acceptable to
>> David an James.

^ permalink raw reply

* Re: [PATCH net-next 2/2] l2tp: add peer_offset parameter
From: James Chapman @ 2018-01-02 20:08 UTC (permalink / raw)
  To: Guillaume Nault, Lorenzo Bianconi; +Cc: David S. Miller, netdev, Hangbin Liu
In-Reply-To: <20180102180557.GB1402@alphalink.fr>

On 02/01/18 18:05, Guillaume Nault wrote:
>>> Lorenzo, is this being added to fix interoperability with another L2TPv3
>>> implementation? If so, can you share more details?
>>>
>> Hi James,
>>
>> I introduced peer_offset parameter to fix a specific setup where
>> tunnel endpoints
>> running L2TPv3 would use different values for tx offset (since in
>> iproute2 there is no
>> restriction on it), not to fix a given an interoperability issue.
>>
> Yes, but was it just to test iproute2's peer_offset option? Or is there
> a plan to use it for real?
>
>> I introduced this feature since:
>>   - offset has been added for long time to L2TPv3 implementation
>>     (commit f7faffa3ff8ef6ae712ef16312b8a2aa7a1c95fe and
>>     commit 309795f4bec2d69cd507a631f82065c2198a0825) and I wanted to
>> preserve UABI
>>   - have the same degree of freedom for offset parameter we have in
>> L2TPv2 and fix the issue
>>     described above
>>
> AFAIU, the current L2TPv2 implementation never sets the offset field
> and nobody ever realised.
>
>> Now what we can do I guess is:
>> - as suggested by Guillaume drop completely the offset support without removing
>>    netlink attribute in order to not break UABI
>> - fix offset support initializing properly padding bits
>>
> I'd go for the first one. I just wonder if that looks acceptable to
> David an James.

I think the first one too. Also update iproute2 to remove or hide the 
offset and peer_offset parameters.

^ permalink raw reply

* Re: [PATCH net-next 2/2] l2tp: add peer_offset parameter
From: James Chapman @ 2018-01-02 20:08 UTC (permalink / raw)
  To: Guillaume Nault; +Cc: Lorenzo Bianconi, davem, netdev, liuhangbin
In-Reply-To: <20180102175048.GA1402@alphalink.fr>

On 02/01/18 17:50, Guillaume Nault wrote:
> On Fri, Dec 29, 2017 at 06:53:56PM +0000, James Chapman wrote:
>> On 28/12/17 19:45, Guillaume Nault wrote:
>>> Here we have an option that:
>>>     * creates invalid packets (AFAIK),
>>>     * is buggy and leaks memory on the network,
>>>     * doesn't seem to have any use case (even the manpage
>>>       says "This is hardly ever used").
>>>
>>> So I'm sorry, but I don't see the point in expanding this option to
>>> allow even stranger setups. If there's a use case, then fine.
>>> Otherwise, let's just acknowledge that the "peer_offset" option of
>>> iproute2 is a noop (and maybe remove it from the manpage).
>>>
>>> And the kernel "offset" option needs to be fixed. Actually, I wouldn't
>>> mind if it was converted to be a noop, or even rejected. L2TP already
>>> has its share of unused features that complicate the code and hamper
>>> evolution and bug fixing. As I said earlier, if it's buggy, unused and
>>> can't even produce valid packets, then why bothering with it?
>>>
>>> But that's just my point of view. James, do you have an opinion on
>>> this?
>> I agree, Guillaume.
>>
>> The L2TPv3 protocol RFC dropped the configurable offset of L2TPv2 - instead,
>> the Layer-2-Specific-Sublayer is supposed to handle any transport-specific
>> data alignment requirements.
>>
> Yes, and AFAIK, no RFC has ever defined an L2TPv3 sublayer using offsets.
>
>> I think a configurable offset has found its way
>> into iproute2 l2tp commands by mistake, perhaps because the netlink API
>> defines an attribute for it, but which was only intended for use with
>> L2TPv2.
>>
> Makes sense, however L2TP_ATTR_OFFSET seems to be a noop for L2TPv2 in
> the current implementation.
>
>> For L2TPv2, we only configure the offset for transmitted packets. In
>> received packets, the offset (if present) is obtained from the L2TPv2 header
>> in each received packet. There is no need to add a peer-offset netlink
>> attribute to set the offset expected in received packets.
>>
> Agreed for Rx side. I also agree on the theory for Tx, but in the current
> implementation, l2tp_build_l2tpv2_header() doesn't take the session's
> "offset" field into account. So, unless I've missed something,
> L2TP_ATTR_OFFSET is already a noop for L2TPv2.

You're right. My bad.

> Not sure if it's worth handling this feature of L2TPv2. The Linux
> implementation has been there for so long, and nobody ever complained
> that there was no way to define an offset on Tx.

I agree.

^ permalink raw reply

* Re: [net-next] ipv6: sr: export some functions of seg6local
From: David Lebrun @ 2018-01-02 20:11 UTC (permalink / raw)
  To: Ahmed Abdelsalam, davem, kuznet, yoshfuji; +Cc: david.lebrun, netdev
In-Reply-To: <1514581784-1527-1-git-send-email-amsalam20@gmail.com>

On 12/29/2017 09:09 PM, Ahmed Abdelsalam wrote:
> Some functions of seg6local are very useful to process SRv6
> encapsulated packets.
> 
> This patch exports some functions of seg6local that are useful and
> can be re-used at different parts of the kernel.
> 
> The set of exported functions are:
> (1) get_srh()
> (2) advance_nextseg()
> (3) lookup_nexthop

If you want to export those functions, it's better to prefix them with
e.g. seg6_:

seg6_get_srh()
seg6_advance_nextseg()
seg6_lookup_nexthop()

David

^ permalink raw reply

* Re: [patch iproute2 v4 3/3] man: Add -bs option to tc manpage
From: Marcelo Ricardo Leitner @ 2018-01-02 20:07 UTC (permalink / raw)
  To: Chris Mi; +Cc: netdev, gerlitz.or, stephen, dsahern
In-Reply-To: <20180102142804.27145-4-chrism@mellanox.com>

On Tue, Jan 02, 2018 at 11:28:04PM +0900, Chris Mi wrote:
> Signed-off-by: Chris Mi <chrism@mellanox.com>
> ---
>  man/man8/tc.8 | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/man/man8/tc.8 b/man/man8/tc.8
> index ff071b33..de137e16 100644
> --- a/man/man8/tc.8
> +++ b/man/man8/tc.8
> @@ -601,6 +601,11 @@ must exist already.
>  read commands from provided file or standard input and invoke them.
>  First failure will cause termination of tc.
>  
> +.TP
> +.BR "\-bs", " \-bs size", " \-batchsize", " \-batchsize size"
> +How many commands are accumulated before sending to kernel.
> +By default, it is 1. It only takes effect in batch mode.
> +

You should also describe the limitations it has. Like, it only works
for action and filter and that it shouldn't be mixed with other
commands.
And maybe even do such check in the code: refuse to do other commands
if batch_size > 1.

>  .TP
>  .BR "\-force"
>  don't terminate tc on errors in batch mode.
> -- 
> 2.14.3
> 

^ permalink raw reply

* Re: [patch iproute2 v4 2/3] tc: Add -bs option to batch mode
From: Marcelo Ricardo Leitner @ 2018-01-02 20:04 UTC (permalink / raw)
  To: Chris Mi; +Cc: netdev, gerlitz.or, stephen, dsahern
In-Reply-To: <20180102142804.27145-3-chrism@mellanox.com>

On Tue, Jan 02, 2018 at 11:28:03PM +0900, Chris Mi wrote:
> @@ -240,23 +244,49 @@ static int batch(const char *name)
>  	}
>  
>  	cmdlineno = 0;
> -	while (getcmdline(&line, &len, stdin) != -1) {
> +	if (getcmdline(&line, &len, stdin) == -1)
> +		goto Exit;
> +	do {
>  		char *largv[100];
>  		int largc;
>  
> +		if (getcmdline(&line2, &len, stdin) == -1)
> +			lastline = true;
> +
>  		largc = makeargs(line, largv, 100);
>  		if (largc == 0)
>  			continue;	/* blank line */

If it reads a new line, it won't process anything else after it
because line won't get updated.

  Marcelo

>  
> -		if (do_cmd(largc, largv)) {
> -			fprintf(stderr, "Command failed %s:%d\n", name, cmdlineno);
> +		line = line2;
> +		line2 = NULL;
> +		len = 0;
> +
> +		/*
> +		 * In batch mode, if we haven't accumulated enough commands
> +		 * and this is not the last command, don't send the message
> +		 * immediately.
> +		 */
> +		if (batch_size > 1 && msg_iov_index + 1 != batch_size
> +		    && !lastline)
> +			send = false;
> +		else
> +			send = true;
> +
> +		ret = do_cmd(largc, largv, batch_size, msg_iov_index++, send);
> +		if (ret < 0) {
> +			fprintf(stderr, "Command failed %s:%d\n", name,
> +				cmdlineno);
>  			ret = 1;
>  			if (!force)
>  				break;
>  		}
> -	}
> -	if (line)
> -		free(line);
> +		msg_iov_index %= batch_size;
> +	} while (!lastline);
> +
> +	free_filter_reqs();
> +	free_action_reqs();
> +Exit:
> +	free(line);
>  
>  	rtnl_close(&rth);
>  	return ret;

^ permalink raw reply

* Re: [PATCH net-next 0/5] marvell10g updates
From: David Miller @ 2018-01-02 20:01 UTC (permalink / raw)
  To: linux; +Cc: andrew, f.fainelli, netdev
In-Reply-To: <20171229124412.GA10595@n2100.armlinux.org.uk>

From: Russell King - ARM Linux <linux@armlinux.org.uk>
Date: Fri, 29 Dec 2017 12:44:13 +0000

> This series:
> - adds MDI/MDIX reporting
> - adds support for 10/100Mbps half-duplex link modes
> - adds a comment describing the setup on VF610 ZII boards (where
>   the phy interface mode doesn't change.)
> - cleans up the phy interace mode switching

Series applied, thank you.

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH] i40e: Delete an error message for a failed memory allocation in i40e_init_interrupt_scheme()
From: Jesse Brandeburg @ 2018-01-02 19:56 UTC (permalink / raw)
  To: SF Markus Elfring
  Cc: netdev, intel-wired-lan, Jeff Kirsher, kernel-janitors, LKML,
	jesse.brandeburg
In-Reply-To: <c9c151e5-bd26-5b72-d26b-c153249811ab@users.sourceforge.net>

On Mon, 1 Jan 2018 20:43:35 +0100
SF Markus Elfring <elfring@users.sourceforge.net> wrote:

> From: Markus Elfring <elfring@users.sourceforge.net>
> Date: Mon, 1 Jan 2018 20:38:14 +0100
> 
> Omit an extra message for a memory allocation failure in this function.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>

Thanks for the patch.

Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

^ permalink raw reply

* Re: [PATCH iproute2 2/3] link_iptnl: Print tunnel mode
From: Stephen Hemminger @ 2018-01-02 19:54 UTC (permalink / raw)
  To: Serhey Popovych; +Cc: netdev
In-Reply-To: <1514906959-9719-3-git-send-email-serhe.popovych@gmail.com>

On Tue,  2 Jan 2018 17:29:18 +0200
Serhey Popovych <serhe.popovych@gmail.com> wrote:

> Tunnel mode does not appear in parameters print for iptnl
> supported tunnels like ipip and sit, while printed for
> ip6tnl.
> 
> Print tunnel mode with "proto" field for JSON and without
> any name when printing to cli to follow ip6tnl behaviour.
> 
> Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
> ---
>  ip/link_iptnl.c |   17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/ip/link_iptnl.c b/ip/link_iptnl.c
> index d4d935b..afd1696 100644
> --- a/ip/link_iptnl.c
> +++ b/ip/link_iptnl.c
> @@ -372,6 +372,23 @@ static void iptunnel_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[
>  	if (tb[IFLA_IPTUN_COLLECT_METADATA])
>  		print_bool(PRINT_ANY, "external", "external ", true);
>  
> +	if (tb[IFLA_IPTUN_PROTO]) {
> +		switch (rta_getattr_u8(tb[IFLA_IPTUN_PROTO])) {
> +		case IPPROTO_IPIP:
> +			print_string(PRINT_ANY, "proto", "%s ", "ip4ip4");
> +			break;
> +		case IPPROTO_IPV6:
> +			print_string(PRINT_ANY, "proto", "%s ", "ip6ip");
> +			break;
> +		case IPPROTO_MPLS:
> +			print_string(PRINT_ANY, "proto", "%s ", "mplsip");
> +			break;
> +		case 0:
> +			print_string(PRINT_ANY, "proto", "%s ", "any");
> +			break;
> +		}
> +	}
> +
>  	if (tb[IFLA_IPTUN_REMOTE]) {
>  		unsigned int addr = rta_getattr_u32(tb[IFLA_IPTUN_REMOTE]);
>  

In iproute2 utilities the output format must match the input format.
I don't see "proto ip4ip4" as a valid option.

^ permalink raw reply

* Re: [PATCH v3 net-next 2/5] net: tracepoint: replace tcp_set_state tracepoint with inet_sock_set_state tracepoint
From: David Miller @ 2018-01-02 19:52 UTC (permalink / raw)
  To: brendan.d.gregg
  Cc: laoar.shao, songliubraving, marcelo.leitner, rostedt, bgregg,
	netdev, linux-kernel
In-Reply-To: <CAE40pddorDmoaGn2og-7u7vxpDY8uiWN8qEbQ05YYnK8f0Q_XQ@mail.gmail.com>

From: Brendan Gregg <brendan.d.gregg@gmail.com>
Date: Tue, 2 Jan 2018 11:46:26 -0800

> If I'm to use sock:inet_sock_set_state for TCP tracing, I'd like
> sk->sk_protocol exposed as a tracepoint argument so I can match on
> IPPROTO_TCP.

Agreed.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox