Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [RFC PATCH v2 4/5] mlx4: add support for fast rx drop bpf program
From: Jesper Dangaard Brouer @ 2016-04-08 11:41 UTC (permalink / raw)
  To: Brenden Blanco
  Cc: davem, netdev, tom, alexei.starovoitov, ogerlitz, daniel,
	eric.dumazet, ecree, john.fastabend, tgraf, johannes,
	eranlinuxmellanox, lorenzo, brouer
In-Reply-To: <1460090930-11219-4-git-send-email-bblanco@plumgrid.com>


On Thu,  7 Apr 2016 21:48:49 -0700 Brenden Blanco <bblanco@plumgrid.com> wrote:

> +int mlx4_call_bpf(struct bpf_prog *prog, void *data, unsigned int length)
> +{
> +	struct sk_buff *skb = this_cpu_ptr(&percpu_bpf_phys_dev_md);
> +	int ret;
> +
> +	build_bpf_phys_dev_md(skb, data, length);
> +
> +	rcu_read_lock();
> +	ret = BPF_PROG_RUN(prog, (void *)skb);
> +	rcu_read_unlock();
> +
> +	return ret;
> +}
[...]

> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> index 86bcfe5..287da02 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> @@ -840,6 +843,23 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
>  		l2_tunnel = (dev->hw_enc_features & NETIF_F_RXCSUM) &&
>  			(cqe->vlan_my_qpn & cpu_to_be32(MLX4_CQE_L2_TUNNEL));
>  
> +		/* A bpf program gets first chance to drop the packet. It may
> +		 * read bytes but not past the end of the frag.
> +		 */
> +		if (prog) {
> +			struct ethhdr *ethh;
> +			dma_addr_t dma;
> +
> +			dma = be64_to_cpu(rx_desc->data[0].addr);
> +			dma_sync_single_for_cpu(priv->ddev, dma, sizeof(*ethh),
> +						DMA_FROM_DEVICE);
> +			ethh = page_address(frags[0].page) +
> +							frags[0].page_offset;
> +			if (mlx4_call_bpf(prog, ethh, frags[0].page_size) ==
> +							BPF_PHYS_DEV_DROP)
> +				goto next;
> +		}
> +

Do the program need to know if the packet-page we handover is
considered 'read-only' at the call site?  Or do we rely on my idea
requiring the registration to "know" this...

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* [PATCH RESEND] net: ethernet: renesas: ravb_main: test clock rate to avoid division by 0
From: Wolfram Sang @ 2016-04-08 11:28 UTC (permalink / raw)
  To: netdev; +Cc: Wolfram Sang, linux-renesas-soc, David Miller, Sergei Shtylyov

From: Wolfram Sang <wsa+renesas@sang-engineering.com>

The clk API may return 0 on clk_get_rate, so we should check the result before
using it as a divisor.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
---

The original patch was marked as "NOT APPLICABLE" in patchwork. I reviewed
again and can't see why. So, in case this remains true, please explain.

 drivers/net/ethernet/renesas/ravb_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 4e1a7dba7c4abb..791930b63991dc 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -1691,6 +1691,9 @@ static int ravb_set_gti(struct net_device *ndev)
 	rate = clk_get_rate(clk);
 	clk_put(clk);
 
+	if (!rate)
+		return -EINVAL;
+
 	inc = 1000000000ULL << 20;
 	do_div(inc, rate);
 
-- 
2.7.0

^ permalink raw reply related

* Re: [PATCH RFC] net: decrease the length of backlog queue immediately after it's detached from sk
From: Yang Yingliang @ 2016-04-08 11:18 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, davem, Ding Tianhong
In-Reply-To: <1460040665.6473.398.camel@edumazet-glaptop3.roam.corp.google.com>



On 2016/4/7 22:51, Eric Dumazet wrote:
> On Thu, 2016-04-07 at 03:21 -0700, Eric Dumazet wrote:
>
>> Please do not send patches before really understanding the issue you
>> have.
>>
>> Having a backlog of 12506206 bytes is ridiculous. Dropping packets is
>> absolutely fine if this ever happens.
>>
>> Something is really wrong on your host, or the sender simply does not
>> comply with TCP protocol (not caring of receiver window at all)
>>
>> Since you added a trace of truesize, please also trace skb->len
>>

[2016-04-08 18:33:39][ 9748.726948] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:12607514 rmem_alloc:0, truesize:31992, len:17540
[2016-04-08 18:33:39][ 9748.726964] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:12607514 rmem_alloc:29326, truesize:18662, 
len:10240
[2016-04-08 18:33:39][ 9748.726986] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:12607514 rmem_alloc:0, truesize:39990, len:21920
[2016-04-08 18:33:39][ 9748.727028] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:12607514 rmem_alloc:0, truesize:58652, len:32140
[2016-04-08 18:33:39][ 9748.727068] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:12607514 rmem_alloc:0, truesize:58652, len:32140
[2016-04-08 18:33:39][ 9748.727082] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:12607514 rmem_alloc:21328, truesize:5332, len:2940
[2016-04-08 18:33:39][ 9748.727310] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:12607514 rmem_alloc:0, truesize:53320, len:29220
[2016-04-08 18:33:39][ 9748.727326] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:12607514 rmem_alloc:26660, truesize:7998, len:4400
[2016-04-08 18:33:39][ 9748.727352] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:12607514 rmem_alloc:47988, truesize:58652, 
len:32140
[2016-04-08 18:33:39][ 9748.727389] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:12607514 rmem_alloc:0, truesize:39990, len:21920
[2016-04-08 18:33:39][ 9748.727409] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:12607514 rmem_alloc:58652, truesize:18662, 
len:10240

If I expand buffer 5 times((sndbuf+rcvbuf)*5). There are only 5M data in 
backlog at most.

[2016-04-08 18:33:39][ 9748.777743] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:5435954 rmem_alloc:0, truesize:55986, len:30680
[2016-04-08 18:33:39][ 9748.777762] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:5457282 rmem_alloc:58652, truesize:21328, 
len:11700
[2016-04-08 18:33:39][ 9748.777804] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:5515934 rmem_alloc:55986, truesize:58652, 
len:32140
[2016-04-08 18:33:39][ 9748.777818] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:5537262 rmem_alloc:0, truesize:21328, len:11700
[2016-04-08 18:33:39][ 9748.777839] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:5574586 rmem_alloc:0, truesize:37324, len:20460
[2016-04-08 18:33:39][ 9748.777854] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:5601246 rmem_alloc:58652, truesize:26660, 
len:14620
[2016-04-08 18:33:39][ 9748.777881] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:5659898 rmem_alloc:21328, truesize:58652, 
len:32140
[2016-04-08 18:33:39][ 9748.777894] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:5675894 rmem_alloc:37324, truesize:15996, len:8780
[2016-04-08 18:33:39][ 9748.778047] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:58652 rmem_alloc:0, truesize:58652, len:32140
[2016-04-08 18:33:39][ 9748.778075] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:117304 rmem_alloc:0, truesize:58652, len:32140
[2016-04-08 18:33:39][ 9748.778084] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:122636 rmem_alloc:0, truesize:5332, len:2940
[2016-04-08 18:33:39][ 9748.778109] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:175956 rmem_alloc:0, truesize:53320, len:29220
[2016-04-08 18:33:39][ 9748.778156] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:234608 rmem_alloc:0, truesize:58652, len:32140
[2016-04-08 18:33:39][ 9748.778178] TCP: rcvbuf:10485760 sndbuf:2097152 
limit:12582912 backloglen:282596 rmem_alloc:58652, truesize:47988, len:26300
>
> BTW, have you played with /proc/sys/net/ipv4/tcp_adv_win_scale ?
>

I expand  tcp_adv_win_scale and tcp_rmem. It has no effect.
>
>
>
>

^ permalink raw reply

* Re: [RFC PATCH v2 1/5] bpf: add PHYS_DEV prog type for early driver filter
From: Daniel Borkmann @ 2016-04-08 11:09 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Brenden Blanco
  Cc: davem, netdev, tom, alexei.starovoitov, ogerlitz, eric.dumazet,
	ecree, john.fastabend, tgraf, johannes, eranlinuxmellanox,
	lorenzo
In-Reply-To: <20160408123614.2a15a346@redhat.com>

On 04/08/2016 12:36 PM, Jesper Dangaard Brouer wrote:
> On Thu,  7 Apr 2016 21:48:46 -0700
> Brenden Blanco <bblanco@plumgrid.com> wrote:
>
>> Add a new bpf prog type that is intended to run in early stages of the
>> packet rx path. Only minimal packet metadata will be available, hence a
>> new context type, struct bpf_phys_dev_md, is exposed to userspace. So
>> far only expose the readable packet length, and only in read mode.
>>
>> The PHYS_DEV name is chosen to represent that the program is meant only
>> for physical adapters, rather than all netdevs.
>>
>> While the user visible struct is new, the underlying context must be
>> implemented as a minimal skb in order for the packet load_* instructions
>> to work. The skb filled in by the driver must have skb->len, skb->head,
>> and skb->data set, and skb->data_len == 0.
>>
>> Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
>> ---
>>   include/uapi/linux/bpf.h | 14 ++++++++++
>>   kernel/bpf/verifier.c    |  1 +
>>   net/core/filter.c        | 68 ++++++++++++++++++++++++++++++++++++++++++++++++
>>   3 files changed, 83 insertions(+)
>>
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 70eda5a..3018d83 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -93,6 +93,7 @@ enum bpf_prog_type {
>>   	BPF_PROG_TYPE_SCHED_CLS,
>>   	BPF_PROG_TYPE_SCHED_ACT,
>>   	BPF_PROG_TYPE_TRACEPOINT,
>> +	BPF_PROG_TYPE_PHYS_DEV,
>>   };
>>
>>   #define BPF_PSEUDO_MAP_FD	1
>> @@ -368,6 +369,19 @@ struct __sk_buff {
>>   	__u32 tc_classid;
>>   };
>>
>> +/* user return codes for PHYS_DEV prog type */
>> +enum bpf_phys_dev_action {
>> +	BPF_PHYS_DEV_DROP,
>> +	BPF_PHYS_DEV_OK,
>> +};
>
> I can imagine these extra return codes:
>
>   BPF_PHYS_DEV_MODIFIED,   /* Packet page/payload modified */
>   BPF_PHYS_DEV_STOLEN,     /* E.g. forward use-case */
>   BPF_PHYS_DEV_SHARED,     /* Queue for async processing, e.g. tcpdump use-case */
>
> The "STOLEN" and "SHARED" use-cases require some refcnt manipulations,
> which we can look at when we get that far...

I'd probably let the tcpdump case be handled by the rest of the stack.
Forwarding could be to some txqueue or perhaps directly to a virtual net
device e.g. the veth end sitting in a container (where fake skb would
need to be promoted to a real one). Or, perhaps instead of veth end,
directly demuxed into a target socket queue in that container ...
Alternatively for tcpdump use-case, you could also do sampling on the
bpf_phy_dev with eBPF maps.

> For the "MODIFIED" case, people caring about checksumming, might want
> to voice their concern if we want additional info or return codes,
> indicating if software need to recalc CSUM (or e.g. if only IP-pseudo
> hdr was modified).
>
>> +/* user accessible metadata for PHYS_DEV packet hook
>> + * new fields must be added to the end of this structure
>> + */
>> +struct bpf_phys_dev_md {
>> +	__u32 len;
>> +};
>
> This is userspace visible.  I can easily imagine this structure will get
> extended.  How does a userspace eBPF program know that the struct got
> extended? (bet you got some scheme, normally I would add a "version" as
> first elem).
>
> BTW, how and where is this "struct bpf_phys_dev_md" allocated?

It isn't, see bpf_phys_dev_convert_ctx_access(). At load time the verifier
will convert/rewrite this based on offsetof() to a real access of the per
cpu sk_buff, that's the only purpose.

Cheers,
Daniel

^ permalink raw reply

* Re: [RFC PATCH v2 1/5] bpf: add PHYS_DEV prog type for early driver filter
From: Jesper Dangaard Brouer @ 2016-04-08 10:36 UTC (permalink / raw)
  To: Brenden Blanco
  Cc: davem, netdev, tom, alexei.starovoitov, ogerlitz, daniel,
	eric.dumazet, ecree, john.fastabend, tgraf, johannes,
	eranlinuxmellanox, lorenzo, brouer
In-Reply-To: <1460090930-11219-1-git-send-email-bblanco@plumgrid.com>

On Thu,  7 Apr 2016 21:48:46 -0700
Brenden Blanco <bblanco@plumgrid.com> wrote:

> Add a new bpf prog type that is intended to run in early stages of the
> packet rx path. Only minimal packet metadata will be available, hence a
> new context type, struct bpf_phys_dev_md, is exposed to userspace. So
> far only expose the readable packet length, and only in read mode.
> 
> The PHYS_DEV name is chosen to represent that the program is meant only
> for physical adapters, rather than all netdevs.
> 
> While the user visible struct is new, the underlying context must be
> implemented as a minimal skb in order for the packet load_* instructions
> to work. The skb filled in by the driver must have skb->len, skb->head,
> and skb->data set, and skb->data_len == 0.
> 
> Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
> ---
>  include/uapi/linux/bpf.h | 14 ++++++++++
>  kernel/bpf/verifier.c    |  1 +
>  net/core/filter.c        | 68 ++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 83 insertions(+)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 70eda5a..3018d83 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -93,6 +93,7 @@ enum bpf_prog_type {
>  	BPF_PROG_TYPE_SCHED_CLS,
>  	BPF_PROG_TYPE_SCHED_ACT,
>  	BPF_PROG_TYPE_TRACEPOINT,
> +	BPF_PROG_TYPE_PHYS_DEV,
>  };
>  
>  #define BPF_PSEUDO_MAP_FD	1
> @@ -368,6 +369,19 @@ struct __sk_buff {
>  	__u32 tc_classid;
>  };
>  
> +/* user return codes for PHYS_DEV prog type */
> +enum bpf_phys_dev_action {
> +	BPF_PHYS_DEV_DROP,
> +	BPF_PHYS_DEV_OK,
> +};

I can imagine these extra return codes:

 BPF_PHYS_DEV_MODIFIED,   /* Packet page/payload modified */
 BPF_PHYS_DEV_STOLEN,     /* E.g. forward use-case */
 BPF_PHYS_DEV_SHARED,     /* Queue for async processing, e.g. tcpdump use-case */

The "STOLEN" and "SHARED" use-cases require some refcnt manipulations,
which we can look at when we get that far...

For the "MODIFIED" case, people caring about checksumming, might want
to voice their concern if we want additional info or return codes,
indicating if software need to recalc CSUM (or e.g. if only IP-pseudo
hdr was modified).


> +/* user accessible metadata for PHYS_DEV packet hook
> + * new fields must be added to the end of this structure
> + */
> +struct bpf_phys_dev_md {
> +	__u32 len;
> +};

This is userspace visible.  I can easily imagine this structure will get
extended.  How does a userspace eBPF program know that the struct got
extended? (bet you got some scheme, normally I would add a "version" as
first elem).

BTW, how and where is this "struct bpf_phys_dev_md" allocated?


>  struct bpf_tunnel_key {
>  	__u32 tunnel_id;
>  	union {
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 58792fe..877542d 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -1344,6 +1344,7 @@ static bool may_access_skb(enum bpf_prog_type type)
>  	case BPF_PROG_TYPE_SOCKET_FILTER:
>  	case BPF_PROG_TYPE_SCHED_CLS:
>  	case BPF_PROG_TYPE_SCHED_ACT:
> +	case BPF_PROG_TYPE_PHYS_DEV:
>  		return true;
>  	default:
>  		return false;
> diff --git a/net/core/filter.c b/net/core/filter.c
> index e8486ba..4f73fb9 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2021,6 +2021,12 @@ tc_cls_act_func_proto(enum bpf_func_id func_id)
>  	}
>  }
>  
> +static const struct bpf_func_proto *
> +phys_dev_func_proto(enum bpf_func_id func_id)
> +{
> +	return sk_filter_func_proto(func_id);
> +}
> +
>  static bool __is_valid_access(int off, int size, enum bpf_access_type type)
>  {
>  	/* check bounds */
> @@ -2076,6 +2082,36 @@ static bool tc_cls_act_is_valid_access(int off, int size,
>  	return __is_valid_access(off, size, type);
>  }
>  
> +static bool __is_valid_phys_dev_access(int off, int size,
> +				       enum bpf_access_type type)
> +{
> +	if (off < 0 || off >= sizeof(struct bpf_phys_dev_md))
> +		return false;
> +
> +	if (off % size != 0)
> +		return false;
> +
> +	if (size != 4)
> +		return false;
> +
> +	return true;
> +}
> +
> +static bool phys_dev_is_valid_access(int off, int size,
> +				     enum bpf_access_type type)
> +{
> +	if (type == BPF_WRITE)
> +		return false;
> +
> +	switch (off) {
> +	case offsetof(struct bpf_phys_dev_md, len):
> +		break;
> +	default:
> +		return false;
> +	}
> +	return __is_valid_phys_dev_access(off, size, type);
> +}
> +
>  static u32 bpf_net_convert_ctx_access(enum bpf_access_type type, int dst_reg,
>  				      int src_reg, int ctx_off,
>  				      struct bpf_insn *insn_buf,
> @@ -2213,6 +2249,26 @@ static u32 bpf_net_convert_ctx_access(enum bpf_access_type type, int dst_reg,
>  	return insn - insn_buf;
>  }
>  
> +static u32 bpf_phys_dev_convert_ctx_access(enum bpf_access_type type,
> +					   int dst_reg, int src_reg,
> +					   int ctx_off,
> +					   struct bpf_insn *insn_buf,
> +					   struct bpf_prog *prog)
> +{
> +	struct bpf_insn *insn = insn_buf;
> +
> +	switch (ctx_off) {
> +	case offsetof(struct bpf_phys_dev_md, len):
> +		BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, len) != 4);
> +
> +		*insn++ = BPF_LDX_MEM(BPF_W, dst_reg, src_reg,
> +				      offsetof(struct sk_buff, len));
> +		break;
> +	}
> +
> +	return insn - insn_buf;
> +}
> +
>  static const struct bpf_verifier_ops sk_filter_ops = {
>  	.get_func_proto = sk_filter_func_proto,
>  	.is_valid_access = sk_filter_is_valid_access,
> @@ -2225,6 +2281,12 @@ static const struct bpf_verifier_ops tc_cls_act_ops = {
>  	.convert_ctx_access = bpf_net_convert_ctx_access,
>  };
>  
> +static const struct bpf_verifier_ops phys_dev_ops = {
> +	.get_func_proto = phys_dev_func_proto,
> +	.is_valid_access = phys_dev_is_valid_access,
> +	.convert_ctx_access = bpf_phys_dev_convert_ctx_access,
> +};
> +
>  static struct bpf_prog_type_list sk_filter_type __read_mostly = {
>  	.ops = &sk_filter_ops,
>  	.type = BPF_PROG_TYPE_SOCKET_FILTER,
> @@ -2240,11 +2302,17 @@ static struct bpf_prog_type_list sched_act_type __read_mostly = {
>  	.type = BPF_PROG_TYPE_SCHED_ACT,
>  };
>  
> +static struct bpf_prog_type_list phys_dev_type __read_mostly = {
> +	.ops = &phys_dev_ops,
> +	.type = BPF_PROG_TYPE_PHYS_DEV,
> +};
> +
>  static int __init register_sk_filter_ops(void)
>  {
>  	bpf_register_prog_type(&sk_filter_type);
>  	bpf_register_prog_type(&sched_cls_type);
>  	bpf_register_prog_type(&sched_act_type);
> +	bpf_register_prog_type(&phys_dev_type);
>  
>  	return 0;
>  }



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [RFC PATCH v2 2/5] net: add ndo to set bpf prog in adapter rx
From: Jesper Dangaard Brouer @ 2016-04-08  9:38 UTC (permalink / raw)
  To: Brenden Blanco
  Cc: davem, netdev, tom, alexei.starovoitov, ogerlitz, daniel,
	eric.dumazet, ecree, john.fastabend, tgraf, johannes,
	eranlinuxmellanox, lorenzo, brouer
In-Reply-To: <1460090930-11219-2-git-send-email-bblanco@plumgrid.com>

On Thu,  7 Apr 2016 21:48:47 -0700 Brenden Blanco <bblanco@plumgrid.com> wrote:

> Add two new set/get netdev ops for drivers implementing the
> BPF_PROG_TYPE_PHYS_DEV filter.
> 
> Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
[...]
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index cb4e508..3acf732 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
[...]
> @@ -1102,6 +1103,14 @@ struct tc_to_netdev {
>   *	appropriate rx headroom value allows avoiding skb head copy on
>   *	forward. Setting a negative value resets the rx headroom to the
>   *	default value.
> + * int  (*ndo_bpf_set)(struct net_device *dev, struct bpf_prog *prog);
> + *	This function is used to set or clear a bpf program used in the
> + *	earliest stages of packet rx. The prog will have been loaded as
> + *	BPF_PROG_TYPE_PHYS_DEV. The callee is responsible for calling
> + *	bpf_prog_put on any old progs that are stored, but not on the passed
> + *	in prog.
> + * bool (*ndo_bpf_get)(struct net_device *dev);
> + *	This function is used to check if a bpf program is set on the device.
>   *

This interface for the entire device, right.  I can imagine that users
want to attach a eBPF program per RX queue.  Can we adapt the interface
to support this? (could default to adding it all queues)

I'm also wondering if we should add a "flags" parameter.  Or maybe we
can extend 'struct bpf_prog' with I have in mind.

When the eBPF program is attached to a RX queue, I want to know if the
program want to modify packet-data, up-front.

The problem is that most drivers use dma_sync, which means that data is
considered 'read-only' (the "considered" part depend on DMA engine, and
we might find a DMA loop-hole for some configs).
  If the program want to write, the driver have the option of
reconfiguring the driver routine to use dma_unmap, before handing over
the page.  Or driver can also choose to realloc the specific RX ring
queue pages as single pages (using dma_map/unmap consistently).
 This also allow us to give a return code indicating given driver does
not support writable packet-pages (rejecting program).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* [PATCH] ipv6: rework the lock in addrconf_permanent_addr
From: roy.qing.li @ 2016-04-08  9:22 UTC (permalink / raw)
  To: netdev

From: Li RongQing <roy.qing.li@gmail.com>

1. nothing of idev is changed, so read lock is enough
2. ifp is changed, so used ifp->lock or cmpxchg to protect it

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
---
 net/ipv6/addrconf.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 27aed1a..f6e7605b 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -3184,14 +3184,21 @@ static void addrconf_gre_config(struct net_device *dev)
 static void l3mdev_check_host_rt(struct inet6_dev *idev,
 				  struct inet6_ifaddr *ifp)
 {
+	struct rt6_info *rt = NULL;
+
+	spin_lock(&ifp->lock);
 	if (ifp->rt) {
 		u32 tb_id = l3mdev_fib_table(idev->dev) ? : RT6_TABLE_LOCAL;
 
 		if (tb_id != ifp->rt->rt6i_table->tb6_id) {
-			ip6_del_rt(ifp->rt);
+			rt = ifp->rt;
 			ifp->rt = NULL;
 		}
 	}
+	spin_unlock(&ifp->lock);
+
+	if (rt)
+		ip6_del_rt(rt);
 }
 #else
 static void l3mdev_check_host_rt(struct inet6_dev *idev,
@@ -3203,6 +3210,8 @@ static void l3mdev_check_host_rt(struct inet6_dev *idev,
 static int fixup_permanent_addr(struct inet6_dev *idev,
 				struct inet6_ifaddr *ifp)
 {
+	struct rt6_info *prev;
+
 	l3mdev_check_host_rt(idev, ifp);
 
 	if (!ifp->rt) {
@@ -3212,7 +3221,12 @@ static int fixup_permanent_addr(struct inet6_dev *idev,
 		if (unlikely(IS_ERR(rt)))
 			return PTR_ERR(rt);
 
-		ifp->rt = rt;
+		prev = cmpxchg(&ifp->rt, NULL, rt);
+
+		/*if cmpxchg failed*/
+		if (prev) {
+			ip6_rt_put(rt);
+		}
 	}
 
 	if (!(ifp->flags & IFA_F_NOPREFIXROUTE)) {
@@ -3234,21 +3248,21 @@ static void addrconf_permanent_addr(struct net_device *dev)
 	if (!idev)
 		return;
 
-	write_lock_bh(&idev->lock);
+	read_lock_bh(&idev->lock);
 
 	list_for_each_entry_safe(ifp, tmp, &idev->addr_list, if_list) {
 		if ((ifp->flags & IFA_F_PERMANENT) &&
 		    fixup_permanent_addr(idev, ifp) < 0) {
-			write_unlock_bh(&idev->lock);
+			read_unlock_bh(&idev->lock);
 			ipv6_del_addr(ifp);
-			write_lock_bh(&idev->lock);
+			read_lock_bh(&idev->lock);
 
 			net_info_ratelimited("%s: Failed to add prefix route for address %pI6c; dropping\n",
 					     idev->dev->name, &ifp->addr);
 		}
 	}
 
-	write_unlock_bh(&idev->lock);
+	read_unlock_bh(&idev->lock);
 }
 
 static int addrconf_notify(struct notifier_block *this, unsigned long event,
-- 
2.1.4

^ permalink raw reply related

* [PATCH net-next] net: ethernet: stmmac: GMAC4.xx: Fix TX descriptor preparation
From: Alexandre TORGUE @ 2016-04-08  9:18 UTC (permalink / raw)
  To: netdev, peppe.cavallaro; +Cc: dan.carpenter, kernel-janitors

On GMAC4.xx each descriptor contains 2 buffers of 16KB (each).
Initially, those 2 buffers was filled in dwmac4_rd_prepare_tx_desc but
it is actually not needed. Indeed, stmmac driver supports frame up to
9000 bytes (jumbo). So only one buffer is needed.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
index d4952c7..4ec7397 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
@@ -254,14 +254,7 @@ static void dwmac4_rd_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
 {
 	unsigned int tdes3 = p->des3;
 
-	if (unlikely(len > BUF_SIZE_16KiB)) {
-		p->des2 |= (((len - BUF_SIZE_16KiB) <<
-			     TDES2_BUFFER2_SIZE_MASK_SHIFT)
-			    & TDES2_BUFFER2_SIZE_MASK)
-			    | (BUF_SIZE_16KiB & TDES2_BUFFER1_SIZE_MASK);
-	} else {
-		p->des2 |= (len & TDES2_BUFFER1_SIZE_MASK);
-	}
+	p->des2 |= (len & TDES2_BUFFER1_SIZE_MASK);
 
 	if (is_fs)
 		tdes3 |= TDES3_FIRST_DESCRIPTOR;
-- 
1.9.1

^ permalink raw reply related

* Re: [PATCH net] tuntap: restore default qdisc
From: Phil Sutter @ 2016-04-08  9:02 UTC (permalink / raw)
  To: Jason Wang; +Cc: davem, netdev, linux-kernel, mst
In-Reply-To: <1460093208-4364-1-git-send-email-jasowang@redhat.com>

On Fri, Apr 08, 2016 at 01:26:48PM +0800, Jason Wang wrote:
> After commit f84bb1eac027 ("net: fix IFF_NO_QUEUE for drivers using
> alloc_netdev"), default qdisc was changed to noqueue because
> tuntap does not set tx_queue_len during .setup(). This patch restores
> default qdisc by setting tx_queue_len in tun_setup().
> 
> Fixes: f84bb1eac027 ("net: fix IFF_NO_QUEUE for drivers using alloc_netdev")
> Cc: Phil Sutter <phil@nwl.cc>
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Acked-by: Phil Sutter <phil@nwl.cc>

^ permalink raw reply

* Re: [PATCH ethtool] ethtool.c: fix memory leaks
From: Ivan Vecera @ 2016-04-08  8:06 UTC (permalink / raw)
  To: netdev; +Cc: bwh
In-Reply-To: <1458303855-4976-1-git-send-email-ivecera@redhat.com>

On 18.3.2016 13:24, Ivan Vecera wrote:
> Memory allocated at several places is not appropriately freed.
>
> Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Ben, ping.

I.
> ---
>   ethtool.c | 60 +++++++++++++++++++++++++++++++++++++++++++++---------------
>   1 file changed, 45 insertions(+), 15 deletions(-)
>
> diff --git a/ethtool.c b/ethtool.c
> index 0cd0d4f..ca0bf28 100644
> --- a/ethtool.c
> +++ b/ethtool.c
> @@ -2065,10 +2065,14 @@ static int do_gfeatures(struct cmd_context *ctx)
>   	features = get_features(ctx, defs);
>   	if (!features) {
>   		fprintf(stdout, "no feature info available\n");
> +		free(defs);
>   		return 1;
>   	}
>
>   	dump_features(defs, features, NULL);
> +
> +	free(features);
> +	free(defs);
>   	return 0;
>   }
>
> @@ -2078,11 +2082,11 @@ static int do_sfeatures(struct cmd_context *ctx)
>   	int any_changed = 0, any_mismatch = 0;
>   	u32 off_flags_wanted = 0;
>   	u32 off_flags_mask = 0;
> -	struct ethtool_sfeatures *efeatures;
> +	struct ethtool_sfeatures *efeatures = NULL;
>   	struct cmdline_info *cmdline_features;
> -	struct feature_state *old_state, *new_state;
> +	struct feature_state *old_state = NULL, *new_state = NULL;
>   	struct ethtool_value eval;
> -	int err;
> +	int err, retval = 1;
>   	int i, j;
>
>   	defs = get_feature_defs(ctx);
> @@ -2096,7 +2100,7 @@ static int do_sfeatures(struct cmd_context *ctx)
>   				   sizeof(efeatures->features[0]));
>   		if (!efeatures) {
>   			perror("Cannot parse arguments");
> -			return 1;
> +			goto finish;
>   		}
>   		efeatures->cmd = ETHTOOL_SFEATURES;
>   		efeatures->size = FEATURE_BITS_TO_BLOCKS(defs->n_features);
> @@ -2114,7 +2118,7 @@ static int do_sfeatures(struct cmd_context *ctx)
>   				  sizeof(cmdline_features[0]));
>   	if (!cmdline_features) {
>   		perror("Cannot parse arguments");
> -		return 1;
> +		goto finish;
>   	}
>   	for (i = 0; i < ARRAY_SIZE(off_flag_def); i++)
>   		flag_to_cmdline_info(off_flag_def[i].short_name,
> @@ -2133,12 +2137,13 @@ static int do_sfeatures(struct cmd_context *ctx)
>
>   	if (!any_changed) {
>   		fprintf(stdout, "no features changed\n");
> -		return 0;
> +		retval = 0;
> +		goto finish;
>   	}
>
>   	old_state = get_features(ctx, defs);
>   	if (!old_state)
> -		return 1;
> +		goto finish;
>
>   	if (efeatures) {
>   		/* For each offload that the user specified, update any
> @@ -2182,7 +2187,7 @@ static int do_sfeatures(struct cmd_context *ctx)
>   		err = send_ioctl(ctx, efeatures);
>   		if (err < 0) {
>   			perror("Cannot set device feature settings");
> -			return 1;
> +			goto finish;
>   		}
>   	} else {
>   		for (i = 0; i < ARRAY_SIZE(off_flag_def); i++) {
> @@ -2197,7 +2202,7 @@ static int do_sfeatures(struct cmd_context *ctx)
>   					fprintf(stderr,
>   						"Cannot set device %s settings: %m\n",
>   						off_flag_def[i].long_name);
> -					return 1;
> +					goto finish;
>   				}
>   			}
>   		}
> @@ -2211,7 +2216,8 @@ static int do_sfeatures(struct cmd_context *ctx)
>   			err = send_ioctl(ctx, &eval);
>   			if (err) {
>   				perror("Cannot set device flag settings");
> -				return 92;
> +				retval = 92;
> +				goto finish;
>   			}
>   		}
>   	}
> @@ -2219,7 +2225,7 @@ static int do_sfeatures(struct cmd_context *ctx)
>   	/* Compare new state with requested state */
>   	new_state = get_features(ctx, defs);
>   	if (!new_state)
> -		return 1;
> +		goto finish;
>   	any_changed = new_state->off_flags != old_state->off_flags;
>   	any_mismatch = (new_state->off_flags !=
>   			((old_state->off_flags & ~off_flags_mask) |
> @@ -2238,13 +2244,19 @@ static int do_sfeatures(struct cmd_context *ctx)
>   		if (!any_changed) {
>   			fprintf(stderr,
>   				"Could not change any device features\n");
> -			return 1;
> +			goto finish;
>   		}
>   		printf("Actual changes:\n");
>   		dump_features(defs, new_state, old_state);
>   	}
>
> -	return 0;
> +	retval = 0;
> +finish:
> +	free(new_state);
> +	free(old_state);
> +	free(efeatures);
> +	free(defs);
> +	return retval;
>   }
>
>   static int do_gset(struct cmd_context *ctx)
> @@ -2705,8 +2717,18 @@ static int do_gregs(struct cmd_context *ctx)
>   			return 75;
>   		}
>
> -		regs = realloc(regs, sizeof(*regs) + st.st_size);
> -		regs->len = st.st_size;
> +		if (regs->len != st.st_size) {
> +			struct ethtool_regs *new_regs;
> +			new_regs = realloc(regs, sizeof(*regs) + st.st_size);
> +			if (!new_regs) {
> +				perror("Cannot allocate memory for register "
> +				       "dump");
> +				free(regs);
> +				return 73;
> +			}
> +			regs = new_regs;
> +			regs->len = st.st_size;
> +		}
>   		nread = fread(regs->data, regs->len, 1, f);
>   		fclose(f);
>   		if (nread != 1) {
> @@ -3775,6 +3797,7 @@ static int do_gprivflags(struct cmd_context *ctx)
>   	}
>   	if (strings->len == 0) {
>   		fprintf(stderr, "No private flags defined\n");
> +		free(strings);
>   		return 1;
>   	}
>   	if (strings->len > 32) {
> @@ -3786,6 +3809,7 @@ static int do_gprivflags(struct cmd_context *ctx)
>   	flags.cmd = ETHTOOL_GPFLAGS;
>   	if (send_ioctl(ctx, &flags)) {
>   		perror("Cannot get private flags");
> +		free(strings);
>   		return 1;
>   	}
>
> @@ -3804,6 +3828,7 @@ static int do_gprivflags(struct cmd_context *ctx)
>   		       (const char *)strings->data + i * ETH_GSTRING_LEN,
>   		       (flags.data & (1U << i)) ? "on" : "off");
>
> +	free(strings);
>   	return 0;
>   }
>
> @@ -3825,6 +3850,7 @@ static int do_sprivflags(struct cmd_context *ctx)
>   	}
>   	if (strings->len == 0) {
>   		fprintf(stderr, "No private flags defined\n");
> +		free(strings);
>   		return 1;
>   	}
>   	if (strings->len > 32) {
> @@ -3836,6 +3862,7 @@ static int do_sprivflags(struct cmd_context *ctx)
>   	cmdline = calloc(strings->len, sizeof(*cmdline));
>   	if (!cmdline) {
>   		perror("Cannot parse arguments");
> +		free(strings);
>   		return 1;
>   	}
>   	for (i = 0; i < strings->len; i++) {
> @@ -3852,6 +3879,7 @@ static int do_sprivflags(struct cmd_context *ctx)
>   	flags.cmd = ETHTOOL_GPFLAGS;
>   	if (send_ioctl(ctx, &flags)) {
>   		perror("Cannot get private flags");
> +		free(strings);
>   		return 1;
>   	}
>
> @@ -3859,9 +3887,11 @@ static int do_sprivflags(struct cmd_context *ctx)
>   	flags.data = (flags.data & ~seen_flags) | wanted_flags;
>   	if (send_ioctl(ctx, &flags)) {
>   		perror("Cannot set private flags");
> +		free(strings);
>   		return 1;
>   	}
>
> +	free(strings);
>   	return 0;
>   }
>
>

^ permalink raw reply

* [PATCH V3 3/8] net: mediatek: remove superfluous reset call
From: John Crispin @ 2016-04-07 22:54 UTC (permalink / raw)
  To: David S. Miller
  Cc: Felix Fietkau, Matthias Brugger,
	Sean Wang (王志亘), netdev, linux-mediatek,
	linux-kernel, John Crispin
In-Reply-To: <1460069651-1234-1-git-send-email-blogic@openwrt.org>

HW reset is triggered in the mtk_hw_init() function. There is no need to
also reset the core during probe.

Signed-off-by: John Crispin <blogic@openwrt.org>
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c |    4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 94cceb8..a4982e4 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1679,10 +1679,6 @@ static int mtk_probe(struct platform_device *pdev)
 	struct mtk_eth *eth;
 	int err;
 
-	err = device_reset(&pdev->dev);
-	if (err)
-		return err;
-
 	match = of_match_device(of_mtk_match, &pdev->dev);
 	soc = (struct mtk_soc_data *)match->data;
 
-- 
1.7.10.4

^ permalink raw reply related

* Re: [net-next,2/2] rxrpc: do not pull udp headers on receive
From: Thierry Reding @ 2016-04-08  7:48 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: netdev, davem, fcooper, samanthakumar, Willem de Bruijn
In-Reply-To: <1460043899-56894-3-git-send-email-willemdebruijn.kernel@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 703 bytes --]

On Thu, Apr 07, 2016 at 11:44:59AM -0400, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> Commit e6afc8ace6dd modified the udp receive path by pulling the udp
> header before queuing an skbuff onto the receive queue.
> 
> Rxrpc also calls skb_recv_datagram to dequeue an skb from a udp
> socket. Modify this receive path to also no longer expect udp
> headers.
> 
> Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
> 
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> ---
>  net/rxrpc/ar-input.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

To be explicit:

Tested-by: Thierry Reding <treding@nvidia.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply

* Re: [PATCH net] tuntap: restore default qdisc
From: Michael S. Tsirkin @ 2016-04-08  7:47 UTC (permalink / raw)
  To: Jason Wang; +Cc: davem, netdev, linux-kernel, Phil Sutter
In-Reply-To: <1460093208-4364-1-git-send-email-jasowang@redhat.com>

On Fri, Apr 08, 2016 at 01:26:48PM +0800, Jason Wang wrote:
> After commit f84bb1eac027 ("net: fix IFF_NO_QUEUE for drivers using
> alloc_netdev"), default qdisc was changed to noqueue because
> tuntap does not set tx_queue_len during .setup(). This patch restores
> default qdisc by setting tx_queue_len in tun_setup().
> 
> Fixes: f84bb1eac027 ("net: fix IFF_NO_QUEUE for drivers using alloc_netdev")
> Cc: Phil Sutter <phil@nwl.cc>
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  drivers/net/tun.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 510e90a..2c9e45f5 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1015,7 +1015,6 @@ static void tun_net_init(struct net_device *dev)
>  		/* Zero header length */
>  		dev->type = ARPHRD_NONE;
>  		dev->flags = IFF_POINTOPOINT | IFF_NOARP | IFF_MULTICAST;
> -		dev->tx_queue_len = TUN_READQ_SIZE;  /* We prefer our own queue length */
>  		break;
>  
>  	case IFF_TAP:
> @@ -1027,7 +1026,6 @@ static void tun_net_init(struct net_device *dev)
>  
>  		eth_hw_addr_random(dev);
>  
> -		dev->tx_queue_len = TUN_READQ_SIZE;  /* We prefer our own queue length */
>  		break;
>  	}
>  }
> @@ -1481,6 +1479,8 @@ static void tun_setup(struct net_device *dev)
>  
>  	dev->ethtool_ops = &tun_ethtool_ops;
>  	dev->destructor = tun_free_netdev;
> +	/* We prefer our own queue length */
> +	dev->tx_queue_len = TUN_READQ_SIZE;
>  }
>  
>  /* Trivial set of netlink ops to allow deleting tun or tap
> -- 
> 2.5.0

^ permalink raw reply

* Re: [net-next,1/2] sunrpc: do not pull udp headers on receive
From: Thierry Reding @ 2016-04-08  7:47 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: netdev, davem, fcooper, samanthakumar, Willem de Bruijn
In-Reply-To: <1460043899-56894-2-git-send-email-willemdebruijn.kernel@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 994 bytes --]

On Thu, Apr 07, 2016 at 11:44:58AM -0400, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> Commit e6afc8ace6dd modified the udp receive path by pulling the udp
> header before queuing an skbuff onto the receive queue.
> 
> Sunrpc also calls skb_recv_datagram to dequeue an skb from a udp
> socket. Modify this receive path to also no longer expect udp
> headers.
> 
> Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
> 
> Reported-by: Franklin S Cooper Jr. <fcooper@ti.com>
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> ---
>  net/sunrpc/socklib.c  | 2 +-
>  net/sunrpc/svcsock.c  | 5 ++---
>  net/sunrpc/xprtsock.c | 5 ++---
>  3 files changed, 5 insertions(+), 7 deletions(-)

Applying this and patch 2/2 (along with a couple of unrelated regulator
fixes) on top of next-20160408 fixes booting from NFS root for me on
Jetson TK1. Thanks for fixing this.

Tested-by: Thierry Reding <treding@nvidia.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply

* [PATCH net] tuntap: restore default qdisc
From: Jason Wang @ 2016-04-08  5:26 UTC (permalink / raw)
  To: davem, netdev, linux-kernel; +Cc: mst, Jason Wang, Phil Sutter

After commit f84bb1eac027 ("net: fix IFF_NO_QUEUE for drivers using
alloc_netdev"), default qdisc was changed to noqueue because
tuntap does not set tx_queue_len during .setup(). This patch restores
default qdisc by setting tx_queue_len in tun_setup().

Fixes: f84bb1eac027 ("net: fix IFF_NO_QUEUE for drivers using alloc_netdev")
Cc: Phil Sutter <phil@nwl.cc>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/tun.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 510e90a..2c9e45f5 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1015,7 +1015,6 @@ static void tun_net_init(struct net_device *dev)
 		/* Zero header length */
 		dev->type = ARPHRD_NONE;
 		dev->flags = IFF_POINTOPOINT | IFF_NOARP | IFF_MULTICAST;
-		dev->tx_queue_len = TUN_READQ_SIZE;  /* We prefer our own queue length */
 		break;
 
 	case IFF_TAP:
@@ -1027,7 +1026,6 @@ static void tun_net_init(struct net_device *dev)
 
 		eth_hw_addr_random(dev);
 
-		dev->tx_queue_len = TUN_READQ_SIZE;  /* We prefer our own queue length */
 		break;
 	}
 }
@@ -1481,6 +1479,8 @@ static void tun_setup(struct net_device *dev)
 
 	dev->ethtool_ops = &tun_ethtool_ops;
 	dev->destructor = tun_free_netdev;
+	/* We prefer our own queue length */
+	dev->tx_queue_len = TUN_READQ_SIZE;
 }
 
 /* Trivial set of netlink ops to allow deleting tun or tap
-- 
2.5.0

^ permalink raw reply related

* [RFC PATCH v2 5/5] Add sample for adding simple drop program to link
From: Brenden Blanco @ 2016-04-08  4:48 UTC (permalink / raw)
  To: davem
  Cc: Brenden Blanco, netdev, tom, alexei.starovoitov, ogerlitz, daniel,
	brouer, eric.dumazet, ecree, john.fastabend, tgraf, johannes,
	eranlinuxmellanox, lorenzo
In-Reply-To: <1460090930-11219-1-git-send-email-bblanco@plumgrid.com>

Add a sample program that only drops packets at the
BPF_PROG_TYPE_PHYS_DEV hook of a link. With the drop-only program,
observed single core rate is ~19.5Mpps.

Other tests were run, for instance without the dropcnt increment or
without reading from the packet header, the packet rate was mostly
unchanged.

$ perf record -a samples/bpf/netdrvx1 $(</sys/class/net/eth0/ifindex)
proto 17:   19596362 drops/s

./pktgen_sample03_burst_single_flow.sh -i $DEV -d $IP -m $MAC -t 4
Running... ctrl^C to stop
Device: eth4@0
Result: OK: 7873817(c7872245+d1572) usec, 38801823 (60byte,0frags)
  4927955pps 2365Mb/sec (2365418400bps) errors: 0
Device: eth4@1
Result: OK: 7873817(c7872123+d1693) usec, 38587342 (60byte,0frags)
  4900715pps 2352Mb/sec (2352343200bps) errors: 0
Device: eth4@2
Result: OK: 7873817(c7870929+d2888) usec, 38718848 (60byte,0frags)
  4917417pps 2360Mb/sec (2360360160bps) errors: 0
Device: eth4@3
Result: OK: 7873818(c7872193+d1625) usec, 38796346 (60byte,0frags)
  4927259pps 2365Mb/sec (2365084320bps) errors: 0

perf report --no-children:
 29.48%  ksoftirqd/6  [mlx4_en]         [k] mlx4_en_process_rx_cq
 18.17%  ksoftirqd/6  [mlx4_en]         [k] mlx4_en_alloc_frags
  8.19%  ksoftirqd/6  [mlx4_en]         [k] mlx4_en_free_frag
  5.35%  ksoftirqd/6  [kernel.vmlinux]  [k] get_page_from_freelist
  2.92%  ksoftirqd/6  [kernel.vmlinux]  [k] free_pages_prepare
  2.90%  ksoftirqd/6  [mlx4_en]         [k] mlx4_call_bpf
  2.72%  ksoftirqd/6  [fjes]            [k] 0x000000000000af66
  2.37%  ksoftirqd/6  [kernel.vmlinux]  [k] swiotlb_sync_single_for_cpu
  1.92%  ksoftirqd/6  [kernel.vmlinux]  [k] percpu_array_map_lookup_elem
  1.83%  ksoftirqd/6  [kernel.vmlinux]  [k] free_one_page
  1.70%  ksoftirqd/6  [kernel.vmlinux]  [k] swiotlb_sync_single
  1.69%  ksoftirqd/6  [kernel.vmlinux]  [k] bpf_map_lookup_elem
  1.33%  swapper      [kernel.vmlinux]  [k] intel_idle
  1.32%  ksoftirqd/6  [fjes]            [k] 0x000000000000af90
  1.21%  ksoftirqd/6  [kernel.vmlinux]  [k] sk_load_byte_positive_offset
  1.07%  ksoftirqd/6  [kernel.vmlinux]  [k] __alloc_pages_nodemask
  0.89%  ksoftirqd/6  [kernel.vmlinux]  [k] __rmqueue
  0.84%  ksoftirqd/6  [mlx4_en]         [k] mlx4_alloc_pages.isra.23
  0.79%  ksoftirqd/6  [kernel.vmlinux]  [k] net_rx_action

machine specs:
 receiver - Intel E5-1630 v3 @ 3.70GHz
 sender - Intel E5645 @ 2.40GHz
 Mellanox ConnectX-3 @40G

Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
---
 samples/bpf/Makefile        |   4 ++
 samples/bpf/bpf_load.c      |   8 +++
 samples/bpf/netdrvx1_kern.c |  26 ++++++++
 samples/bpf/netdrvx1_user.c | 155 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 193 insertions(+)
 create mode 100644 samples/bpf/netdrvx1_kern.c
 create mode 100644 samples/bpf/netdrvx1_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 9959771..19bb926 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -20,6 +20,7 @@ hostprogs-y += offwaketime
 hostprogs-y += spintest
 hostprogs-y += map_perf_test
 hostprogs-y += test_overhead
+hostprogs-y += netdrvx1
 
 test_verifier-objs := test_verifier.o libbpf.o
 test_maps-objs := test_maps.o libbpf.o
@@ -40,6 +41,7 @@ offwaketime-objs := bpf_load.o libbpf.o offwaketime_user.o
 spintest-objs := bpf_load.o libbpf.o spintest_user.o
 map_perf_test-objs := bpf_load.o libbpf.o map_perf_test_user.o
 test_overhead-objs := bpf_load.o libbpf.o test_overhead_user.o
+netdrvx1-objs := bpf_load.o libbpf.o netdrvx1_user.o
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -60,6 +62,7 @@ always += spintest_kern.o
 always += map_perf_test_kern.o
 always += test_overhead_tp_kern.o
 always += test_overhead_kprobe_kern.o
+always += netdrvx1_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 
@@ -80,6 +83,7 @@ HOSTLOADLIBES_offwaketime += -lelf
 HOSTLOADLIBES_spintest += -lelf
 HOSTLOADLIBES_map_perf_test += -lelf -lrt
 HOSTLOADLIBES_test_overhead += -lelf -lrt
+HOSTLOADLIBES_netdrvx1 += -lelf
 
 # point this to your LLVM backend with bpf support
 LLC=$(srctree)/tools/bpf/llvm/bld/Debug+Asserts/bin/llc
diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c
index 022af71..c7b2245 100644
--- a/samples/bpf/bpf_load.c
+++ b/samples/bpf/bpf_load.c
@@ -50,6 +50,7 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size)
 	bool is_kprobe = strncmp(event, "kprobe/", 7) == 0;
 	bool is_kretprobe = strncmp(event, "kretprobe/", 10) == 0;
 	bool is_tracepoint = strncmp(event, "tracepoint/", 11) == 0;
+	bool is_phys_dev = strncmp(event, "phys_dev", 8) == 0;
 	enum bpf_prog_type prog_type;
 	char buf[256];
 	int fd, efd, err, id;
@@ -66,6 +67,8 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size)
 		prog_type = BPF_PROG_TYPE_KPROBE;
 	} else if (is_tracepoint) {
 		prog_type = BPF_PROG_TYPE_TRACEPOINT;
+	} else if (is_phys_dev) {
+		prog_type = BPF_PROG_TYPE_PHYS_DEV;
 	} else {
 		printf("Unknown event '%s'\n", event);
 		return -1;
@@ -79,6 +82,9 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size)
 
 	prog_fd[prog_cnt++] = fd;
 
+	if (is_phys_dev)
+		return 0;
+
 	if (is_socket) {
 		event += 6;
 		if (*event != '/')
@@ -319,6 +325,7 @@ int load_bpf_file(char *path)
 			if (memcmp(shname_prog, "kprobe/", 7) == 0 ||
 			    memcmp(shname_prog, "kretprobe/", 10) == 0 ||
 			    memcmp(shname_prog, "tracepoint/", 11) == 0 ||
+			    memcmp(shname_prog, "phys_dev", 8) == 0 ||
 			    memcmp(shname_prog, "socket", 6) == 0)
 				load_and_attach(shname_prog, insns, data_prog->d_size);
 		}
@@ -336,6 +343,7 @@ int load_bpf_file(char *path)
 		if (memcmp(shname, "kprobe/", 7) == 0 ||
 		    memcmp(shname, "kretprobe/", 10) == 0 ||
 		    memcmp(shname, "tracepoint/", 11) == 0 ||
+		    memcmp(shname, "phys_dev", 8) == 0 ||
 		    memcmp(shname, "socket", 6) == 0)
 			load_and_attach(shname, data->d_buf, data->d_size);
 	}
diff --git a/samples/bpf/netdrvx1_kern.c b/samples/bpf/netdrvx1_kern.c
new file mode 100644
index 0000000..849802d
--- /dev/null
+++ b/samples/bpf/netdrvx1_kern.c
@@ -0,0 +1,26 @@
+#include <uapi/linux/bpf.h>
+#include <uapi/linux/if_ether.h>
+#include <uapi/linux/if_packet.h>
+#include <uapi/linux/ip.h>
+#include "bpf_helpers.h"
+
+struct bpf_map_def SEC("maps") dropcnt = {
+	.type = BPF_MAP_TYPE_PERCPU_ARRAY,
+	.key_size = sizeof(u32),
+	.value_size = sizeof(long),
+	.max_entries = 256,
+};
+
+SEC("phys_dev1")
+int bpf_prog1(struct bpf_phys_dev_md *ctx)
+{
+	int index = load_byte(ctx, ETH_HLEN + offsetof(struct iphdr, protocol));
+	long *value;
+
+	value = bpf_map_lookup_elem(&dropcnt, &index);
+	if (value)
+		*value += 1;
+
+	return BPF_PHYS_DEV_DROP;
+}
+char _license[] SEC("license") = "GPL";
diff --git a/samples/bpf/netdrvx1_user.c b/samples/bpf/netdrvx1_user.c
new file mode 100644
index 0000000..9e6ec9a
--- /dev/null
+++ b/samples/bpf/netdrvx1_user.c
@@ -0,0 +1,155 @@
+#include <linux/bpf.h>
+#include <linux/netlink.h>
+#include <linux/rtnetlink.h>
+#include <assert.h>
+#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/socket.h>
+#include <unistd.h>
+#include "bpf_load.h"
+#include "libbpf.h"
+
+static int set_link_bpf_fd(int ifindex, int fd)
+{
+	struct sockaddr_nl sa;
+	int sock, seq = 0, len, ret = -1;
+	char buf[4096];
+	struct rtattr *rta;
+	struct {
+		struct nlmsghdr  nh;
+		struct ifinfomsg ifinfo;
+		char             attrbuf[64];
+	} req;
+	struct nlmsghdr *nh;
+	struct nlmsgerr *err;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.nl_family = AF_NETLINK;
+
+	sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
+	if (sock < 0) {
+		printf("open netlink socket: %s\n", strerror(errno));
+		return -1;
+	}
+
+	if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
+		printf("bind to netlink: %s\n", strerror(errno));
+		goto cleanup;
+	}
+
+	memset(&req, 0, sizeof(req));
+	req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
+	req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
+	req.nh.nlmsg_type = RTM_SETLINK;
+	req.nh.nlmsg_pid = 0;
+	req.nh.nlmsg_seq = ++seq;
+	req.ifinfo.ifi_family = AF_UNSPEC;
+	req.ifinfo.ifi_index = ifindex;
+	rta = (struct rtattr *)(((char *) &req)
+				+ NLMSG_ALIGN(req.nh.nlmsg_len));
+	rta->rta_type = 42/*IFLA_BPF_FD*/;
+	rta->rta_len = RTA_LENGTH(sizeof(unsigned int));
+	req.nh.nlmsg_len = NLMSG_ALIGN(req.nh.nlmsg_len)
+		+ RTA_LENGTH(sizeof(fd));
+	memcpy(RTA_DATA(rta), &fd, sizeof(fd));
+	if (send(sock, &req, req.nh.nlmsg_len, 0) < 0) {
+		printf("send to netlink: %s\n", strerror(errno));
+		goto cleanup;
+	}
+
+	len = recv(sock, buf, sizeof(buf), 0);
+	if (len < 0) {
+		printf("recv from netlink: %s\n", strerror(errno));
+		goto cleanup;
+	}
+
+	for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
+	     nh = NLMSG_NEXT(nh, len)) {
+		if (nh->nlmsg_pid != getpid()) {
+			printf("Wrong pid %d, expected %d\n",
+			       nh->nlmsg_pid, getpid());
+			goto cleanup;
+		}
+		if (nh->nlmsg_seq != seq) {
+			printf("Wrong seq %d, expected %d\n",
+			       nh->nlmsg_seq, seq);
+			goto cleanup;
+		}
+		switch (nh->nlmsg_type) {
+		case NLMSG_ERROR:
+			err = (struct nlmsgerr *)NLMSG_DATA(nh);
+			if (!err->error)
+				continue;
+			printf("nlmsg error %s\n", strerror(-err->error));
+			goto cleanup;
+		case NLMSG_DONE:
+			break;
+		}
+	}
+
+	ret = 0;
+
+cleanup:
+	close(sock);
+	return ret;
+}
+
+/* simple per-protocol drop counter
+ */
+static void poll_stats(int secs)
+{
+	unsigned int nr_cpus = sysconf(_SC_NPROCESSORS_CONF);
+	__u64 values[nr_cpus];
+	__u32 key;
+	int i;
+
+	sleep(secs);
+
+	for (key = 0; key < 256; key++) {
+		__u64 sum = 0;
+
+		assert(bpf_lookup_elem(map_fd[0], &key, values) == 0);
+		for (i = 0; i < nr_cpus; i++)
+			sum += values[i];
+		if (sum)
+			printf("proto %u: %10llu drops/s\n", key, sum/secs);
+	}
+}
+
+int main(int ac, char **argv)
+{
+	char filename[256];
+	int ifindex;
+
+	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+
+	if (ac != 2) {
+		printf("usage: %s IFINDEX\n", argv[0]);
+		return 1;
+	}
+
+	ifindex = strtoul(argv[1], NULL, 0);
+
+	if (load_bpf_file(filename)) {
+		printf("%s", bpf_log_buf);
+		return 1;
+	}
+
+	if (!prog_fd[0]) {
+		printf("load_bpf_file: %s\n", strerror(errno));
+		return 1;
+	}
+
+	if (set_link_bpf_fd(ifindex, prog_fd[0]) < 0) {
+		printf("link set bpf fd failed\n");
+		return 1;
+	}
+
+	poll_stats(5);
+
+	set_link_bpf_fd(ifindex, -1);
+
+	return 0;
+}
-- 
2.8.0

^ permalink raw reply related

* [RFC PATCH v2 4/5] mlx4: add support for fast rx drop bpf program
From: Brenden Blanco @ 2016-04-08  4:48 UTC (permalink / raw)
  To: davem
  Cc: Brenden Blanco, netdev, tom, alexei.starovoitov, ogerlitz, daniel,
	brouer, eric.dumazet, ecree, john.fastabend, tgraf, johannes,
	eranlinuxmellanox, lorenzo
In-Reply-To: <1460090930-11219-1-git-send-email-bblanco@plumgrid.com>

Add support for the BPF_PROG_TYPE_PHYS_DEV hook in mlx4 driver. Since
bpf programs require a skb context to navigate the packet, build a
percpu fake skb with the minimal fields. This avoids the costly
allocation for packets that end up being dropped.

Since mlx4 is so far the only user of bpf_phys_dev_md, the build
function is defined locally.

Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 65 ++++++++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx4/en_rx.c     | 25 ++++++++--
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |  6 +++
 3 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index b4b258c..b228651 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -31,6 +31,7 @@
  *
  */
 
+#include <linux/bpf.h>
 #include <linux/etherdevice.h>
 #include <linux/tcp.h>
 #include <linux/if_vlan.h>
@@ -1966,6 +1967,9 @@ void mlx4_en_free_resources(struct mlx4_en_priv *priv)
 			mlx4_en_destroy_cq(priv, &priv->rx_cq[i]);
 	}
 
+	if (priv->prog)
+		bpf_prog_put(priv->prog);
+
 }
 
 int mlx4_en_alloc_resources(struct mlx4_en_priv *priv)
@@ -2078,6 +2082,11 @@ static int mlx4_en_change_mtu(struct net_device *dev, int new_mtu)
 		en_err(priv, "Bad MTU size:%d.\n", new_mtu);
 		return -EPERM;
 	}
+	if (priv->prog && MLX4_EN_EFF_MTU(new_mtu) > FRAG_SZ0) {
+		en_err(priv, "MTU size:%d requires frags but bpf prog running",
+		       new_mtu);
+		return -EOPNOTSUPP;
+	}
 	dev->mtu = new_mtu;
 
 	if (netif_running(dev)) {
@@ -2456,6 +2465,58 @@ static int mlx4_en_set_tx_maxrate(struct net_device *dev, int queue_index, u32 m
 	return err;
 }
 
+static DEFINE_PER_CPU(struct sk_buff, percpu_bpf_phys_dev_md);
+
+static void build_bpf_phys_dev_md(struct sk_buff *skb, void *data,
+				  unsigned int length)
+{
+	/* data_len is intentionally not set here so that skb_is_nonlinear()
+	 * returns false
+	 */
+
+	skb->len = length;
+	skb->head = data;
+	skb->data = data;
+}
+
+int mlx4_call_bpf(struct bpf_prog *prog, void *data, unsigned int length)
+{
+	struct sk_buff *skb = this_cpu_ptr(&percpu_bpf_phys_dev_md);
+	int ret;
+
+	build_bpf_phys_dev_md(skb, data, length);
+
+	rcu_read_lock();
+	ret = BPF_PROG_RUN(prog, (void *)skb);
+	rcu_read_unlock();
+
+	return ret;
+}
+
+static int mlx4_bpf_set(struct net_device *dev, struct bpf_prog *prog)
+{
+	struct mlx4_en_priv *priv = netdev_priv(dev);
+	struct bpf_prog *old_prog;
+
+	if (priv->num_frags > 1)
+		return -EOPNOTSUPP;
+
+	old_prog = xchg(&priv->prog, prog);
+	if (old_prog) {
+		synchronize_net();
+		bpf_prog_put(old_prog);
+	}
+
+	return 0;
+}
+
+static bool mlx4_bpf_get(struct net_device *dev)
+{
+	struct mlx4_en_priv *priv = netdev_priv(dev);
+
+	return !!priv->prog;
+}
+
 static const struct net_device_ops mlx4_netdev_ops = {
 	.ndo_open		= mlx4_en_open,
 	.ndo_stop		= mlx4_en_close,
@@ -2486,6 +2547,8 @@ static const struct net_device_ops mlx4_netdev_ops = {
 	.ndo_features_check	= mlx4_en_features_check,
 #endif
 	.ndo_set_tx_maxrate	= mlx4_en_set_tx_maxrate,
+	.ndo_bpf_set		= mlx4_bpf_set,
+	.ndo_bpf_get		= mlx4_bpf_get,
 };
 
 static const struct net_device_ops mlx4_netdev_ops_master = {
@@ -2524,6 +2587,8 @@ static const struct net_device_ops mlx4_netdev_ops_master = {
 	.ndo_features_check	= mlx4_en_features_check,
 #endif
 	.ndo_set_tx_maxrate	= mlx4_en_set_tx_maxrate,
+	.ndo_bpf_set		= mlx4_bpf_set,
+	.ndo_bpf_get		= mlx4_bpf_get,
 };
 
 struct mlx4_en_bond {
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 86bcfe5..287da02 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -748,6 +748,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 	struct mlx4_en_rx_ring *ring = priv->rx_ring[cq->ring];
 	struct mlx4_en_rx_alloc *frags;
 	struct mlx4_en_rx_desc *rx_desc;
+	struct bpf_prog *prog;
 	struct sk_buff *skb;
 	int index;
 	int nr;
@@ -764,6 +765,8 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 	if (budget <= 0)
 		return polled;
 
+	prog = READ_ONCE(priv->prog);
+
 	/* We assume a 1:1 mapping between CQEs and Rx descriptors, so Rx
 	 * descriptor offset can be deduced from the CQE index instead of
 	 * reading 'cqe->index' */
@@ -840,6 +843,23 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 		l2_tunnel = (dev->hw_enc_features & NETIF_F_RXCSUM) &&
 			(cqe->vlan_my_qpn & cpu_to_be32(MLX4_CQE_L2_TUNNEL));
 
+		/* A bpf program gets first chance to drop the packet. It may
+		 * read bytes but not past the end of the frag.
+		 */
+		if (prog) {
+			struct ethhdr *ethh;
+			dma_addr_t dma;
+
+			dma = be64_to_cpu(rx_desc->data[0].addr);
+			dma_sync_single_for_cpu(priv->ddev, dma, sizeof(*ethh),
+						DMA_FROM_DEVICE);
+			ethh = page_address(frags[0].page) +
+							frags[0].page_offset;
+			if (mlx4_call_bpf(prog, ethh, frags[0].page_size) ==
+							BPF_PHYS_DEV_DROP)
+				goto next;
+		}
+
 		if (likely(dev->features & NETIF_F_RXCSUM)) {
 			if (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP |
 						      MLX4_CQE_STATUS_UDP)) {
@@ -1067,10 +1087,7 @@ static const int frag_sizes[] = {
 void mlx4_en_calc_rx_buf(struct net_device *dev)
 {
 	struct mlx4_en_priv *priv = netdev_priv(dev);
-	/* VLAN_HLEN is added twice,to support skb vlan tagged with multiple
-	 * headers. (For example: ETH_P_8021Q and ETH_P_8021AD).
-	 */
-	int eff_mtu = dev->mtu + ETH_HLEN + (2 * VLAN_HLEN);
+	int eff_mtu = MLX4_EN_EFF_MTU(dev->mtu);
 	int buf_size = 0;
 	int i = 0;
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index d12ab6a..40eb32d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -164,6 +164,10 @@ enum {
 #define MLX4_LOOPBACK_TEST_PAYLOAD (HEADER_COPY_SIZE - ETH_HLEN)
 
 #define MLX4_EN_MIN_MTU		46
+/* VLAN_HLEN is added twice,to support skb vlan tagged with multiple
+ * headers. (For example: ETH_P_8021Q and ETH_P_8021AD).
+ */
+#define MLX4_EN_EFF_MTU(mtu)	((mtu) + ETH_HLEN + (2 * VLAN_HLEN))
 #define ETH_BCAST		0xffffffffffffULL
 
 #define MLX4_EN_LOOPBACK_RETRIES	5
@@ -568,6 +572,7 @@ struct mlx4_en_priv {
 	struct hlist_head mac_hash[MLX4_EN_MAC_HASH_SIZE];
 	struct hwtstamp_config hwtstamp_config;
 	u32 counter_index;
+	struct bpf_prog *prog;
 
 #ifdef CONFIG_MLX4_EN_DCB
 	struct ieee_ets ets;
@@ -682,6 +687,7 @@ int mlx4_en_create_drop_qp(struct mlx4_en_priv *priv);
 void mlx4_en_destroy_drop_qp(struct mlx4_en_priv *priv);
 int mlx4_en_free_tx_buf(struct net_device *dev, struct mlx4_en_tx_ring *ring);
 void mlx4_en_rx_irq(struct mlx4_cq *mcq);
+int mlx4_call_bpf(struct bpf_prog *prog, void *data, unsigned int length);
 
 int mlx4_SET_MCAST_FLTR(struct mlx4_dev *dev, u8 port, u64 mac, u64 clear, u8 mode);
 int mlx4_SET_VLAN_FLTR(struct mlx4_dev *dev, struct mlx4_en_priv *priv);
-- 
2.8.0

^ permalink raw reply related

* [RFC PATCH v2 3/5] rtnl: add option for setting link bpf prog
From: Brenden Blanco @ 2016-04-08  4:48 UTC (permalink / raw)
  To: davem
  Cc: Brenden Blanco, netdev, tom, alexei.starovoitov, ogerlitz, daniel,
	brouer, eric.dumazet, ecree, john.fastabend, tgraf, johannes,
	eranlinuxmellanox, lorenzo
In-Reply-To: <1460090930-11219-1-git-send-email-bblanco@plumgrid.com>

Sets the bpf program represented by fd as an early filter in the rx path
of the netdev. The fd must have been created as BPF_PROG_TYPE_PHYS_DEV.
Providing a negative value as fd clears the program. Getting the fd back
via rtnl is not possible, therefore reading of this value merely
provides a bool whether the program is valid on the link or not.

Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
---
 include/uapi/linux/if_link.h |  1 +
 net/core/rtnetlink.c         | 12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 9427f17..ccad234 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -155,6 +155,7 @@ enum {
 	IFLA_PROTO_DOWN,
 	IFLA_GSO_MAX_SEGS,
 	IFLA_GSO_MAX_SIZE,
+	IFLA_BPF_FD,
 	__IFLA_MAX
 };
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a75f7e9..96579ce 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -910,6 +910,7 @@ static noinline size_t if_nlmsg_size(const struct net_device *dev,
 	       + nla_total_size(MAX_PHYS_ITEM_ID_LEN) /* IFLA_PHYS_PORT_ID */
 	       + nla_total_size(MAX_PHYS_ITEM_ID_LEN) /* IFLA_PHYS_SWITCH_ID */
 	       + nla_total_size(IFNAMSIZ) /* IFLA_PHYS_PORT_NAME */
+	       + nla_total_size(4) /* IFLA_BPF_FD */
 	       + nla_total_size(1); /* IFLA_PROTO_DOWN */
 
 }
@@ -1242,6 +1243,9 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
 	     nla_put_string(skb, IFLA_IFALIAS, dev->ifalias)) ||
 	    nla_put_u32(skb, IFLA_CARRIER_CHANGES,
 			atomic_read(&dev->carrier_changes)) ||
+	    (dev->netdev_ops->ndo_bpf_get &&
+	     nla_put_s32(skb, IFLA_BPF_FD,
+			 dev->netdev_ops->ndo_bpf_get(dev))) ||
 	    nla_put_u8(skb, IFLA_PROTO_DOWN, dev->proto_down))
 		goto nla_put_failure;
 
@@ -1375,6 +1379,7 @@ static const struct nla_policy ifla_policy[IFLA_MAX+1] = {
 	[IFLA_PHYS_SWITCH_ID]	= { .type = NLA_BINARY, .len = MAX_PHYS_ITEM_ID_LEN },
 	[IFLA_LINK_NETNSID]	= { .type = NLA_S32 },
 	[IFLA_PROTO_DOWN]	= { .type = NLA_U8 },
+	[IFLA_BPF_FD]		= { .type = NLA_S32 },
 };
 
 static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = {
@@ -2029,6 +2034,13 @@ static int do_setlink(const struct sk_buff *skb,
 		status |= DO_SETLINK_NOTIFY;
 	}
 
+	if (tb[IFLA_BPF_FD]) {
+		err = dev_change_bpf_fd(dev, nla_get_s32(tb[IFLA_BPF_FD]));
+		if (err)
+			goto errout;
+		status |= DO_SETLINK_NOTIFY;
+	}
+
 errout:
 	if (status & DO_SETLINK_MODIFIED) {
 		if (status & DO_SETLINK_NOTIFY)
-- 
2.8.0

^ permalink raw reply related

* [RFC PATCH v2 2/5] net: add ndo to set bpf prog in adapter rx
From: Brenden Blanco @ 2016-04-08  4:48 UTC (permalink / raw)
  To: davem
  Cc: Brenden Blanco, netdev, tom, alexei.starovoitov, ogerlitz, daniel,
	brouer, eric.dumazet, ecree, john.fastabend, tgraf, johannes,
	eranlinuxmellanox, lorenzo
In-Reply-To: <1460090930-11219-1-git-send-email-bblanco@plumgrid.com>

Add two new set/get netdev ops for drivers implementing the
BPF_PROG_TYPE_PHYS_DEV filter.

Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
---
 include/linux/netdevice.h | 13 +++++++++++++
 net/core/dev.c            | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index cb4e508..3acf732 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -61,6 +61,7 @@ struct wireless_dev;
 /* 802.15.4 specific */
 struct wpan_dev;
 struct mpls_dev;
+struct bpf_prog;
 
 void netdev_set_default_ethtool_ops(struct net_device *dev,
 				    const struct ethtool_ops *ops);
@@ -1102,6 +1103,14 @@ struct tc_to_netdev {
  *	appropriate rx headroom value allows avoiding skb head copy on
  *	forward. Setting a negative value resets the rx headroom to the
  *	default value.
+ * int  (*ndo_bpf_set)(struct net_device *dev, struct bpf_prog *prog);
+ *	This function is used to set or clear a bpf program used in the
+ *	earliest stages of packet rx. The prog will have been loaded as
+ *	BPF_PROG_TYPE_PHYS_DEV. The callee is responsible for calling
+ *	bpf_prog_put on any old progs that are stored, but not on the passed
+ *	in prog.
+ * bool (*ndo_bpf_get)(struct net_device *dev);
+ *	This function is used to check if a bpf program is set on the device.
  *
  */
 struct net_device_ops {
@@ -1292,6 +1301,9 @@ struct net_device_ops {
 						       struct sk_buff *skb);
 	void			(*ndo_set_rx_headroom)(struct net_device *dev,
 						       int needed_headroom);
+	int			(*ndo_bpf_set)(struct net_device *dev,
+					       struct bpf_prog *prog);
+	bool			(*ndo_bpf_get)(struct net_device *dev);
 };
 
 /**
@@ -3251,6 +3263,7 @@ int dev_get_phys_port_id(struct net_device *dev,
 int dev_get_phys_port_name(struct net_device *dev,
 			   char *name, size_t len);
 int dev_change_proto_down(struct net_device *dev, bool proto_down);
+int dev_change_bpf_fd(struct net_device *dev, int fd);
 struct sk_buff *validate_xmit_skb_list(struct sk_buff *skb, struct net_device *dev);
 struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 				    struct netdev_queue *txq, int *ret);
diff --git a/net/core/dev.c b/net/core/dev.c
index 273f10d..7cf749c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -94,6 +94,7 @@
 #include <linux/ethtool.h>
 #include <linux/notifier.h>
 #include <linux/skbuff.h>
+#include <linux/bpf.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
 #include <net/busy_poll.h>
@@ -6483,6 +6484,43 @@ int dev_change_proto_down(struct net_device *dev, bool proto_down)
 EXPORT_SYMBOL(dev_change_proto_down);
 
 /**
+ *	dev_change_bpf_fd - set or clear a bpf program for a device
+ *	@dev: device
+ *	@fd: new program fd or negative value to clear
+ *
+ *	Set or clear a bpf program for a device
+ */
+int dev_change_bpf_fd(struct net_device *dev, int fd)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	struct bpf_prog *prog = NULL;
+	int err;
+
+	if (!ops->ndo_bpf_set)
+		return -EOPNOTSUPP;
+	if (!netif_device_present(dev))
+		return -ENODEV;
+
+	if (fd >= 0) {
+		prog = bpf_prog_get(fd);
+		if (IS_ERR(prog))
+			return PTR_ERR(prog);
+
+		if (prog->type != BPF_PROG_TYPE_PHYS_DEV) {
+			bpf_prog_put(prog);
+			return -EINVAL;
+		}
+	}
+
+	err = ops->ndo_bpf_set(dev, prog);
+	if (err < 0 && prog)
+		bpf_prog_put(prog);
+
+	return err;
+}
+EXPORT_SYMBOL(dev_change_bpf_fd);
+
+/**
  *	dev_new_index	-	allocate an ifindex
  *	@net: the applicable net namespace
  *
-- 
2.8.0

^ permalink raw reply related

* [RFC PATCH v2 1/5] bpf: add PHYS_DEV prog type for early driver filter
From: Brenden Blanco @ 2016-04-08  4:48 UTC (permalink / raw)
  To: davem
  Cc: Brenden Blanco, netdev, tom, alexei.starovoitov, ogerlitz, daniel,
	brouer, eric.dumazet, ecree, john.fastabend, tgraf, johannes,
	eranlinuxmellanox, lorenzo

Add a new bpf prog type that is intended to run in early stages of the
packet rx path. Only minimal packet metadata will be available, hence a
new context type, struct bpf_phys_dev_md, is exposed to userspace. So
far only expose the readable packet length, and only in read mode.

The PHYS_DEV name is chosen to represent that the program is meant only
for physical adapters, rather than all netdevs.

While the user visible struct is new, the underlying context must be
implemented as a minimal skb in order for the packet load_* instructions
to work. The skb filled in by the driver must have skb->len, skb->head,
and skb->data set, and skb->data_len == 0.

Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
---
 include/uapi/linux/bpf.h | 14 ++++++++++
 kernel/bpf/verifier.c    |  1 +
 net/core/filter.c        | 68 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 83 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 70eda5a..3018d83 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -93,6 +93,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_SCHED_CLS,
 	BPF_PROG_TYPE_SCHED_ACT,
 	BPF_PROG_TYPE_TRACEPOINT,
+	BPF_PROG_TYPE_PHYS_DEV,
 };
 
 #define BPF_PSEUDO_MAP_FD	1
@@ -368,6 +369,19 @@ struct __sk_buff {
 	__u32 tc_classid;
 };
 
+/* user return codes for PHYS_DEV prog type */
+enum bpf_phys_dev_action {
+	BPF_PHYS_DEV_DROP,
+	BPF_PHYS_DEV_OK,
+};
+
+/* user accessible metadata for PHYS_DEV packet hook
+ * new fields must be added to the end of this structure
+ */
+struct bpf_phys_dev_md {
+	__u32 len;
+};
+
 struct bpf_tunnel_key {
 	__u32 tunnel_id;
 	union {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 58792fe..877542d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1344,6 +1344,7 @@ static bool may_access_skb(enum bpf_prog_type type)
 	case BPF_PROG_TYPE_SOCKET_FILTER:
 	case BPF_PROG_TYPE_SCHED_CLS:
 	case BPF_PROG_TYPE_SCHED_ACT:
+	case BPF_PROG_TYPE_PHYS_DEV:
 		return true;
 	default:
 		return false;
diff --git a/net/core/filter.c b/net/core/filter.c
index e8486ba..4f73fb9 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2021,6 +2021,12 @@ tc_cls_act_func_proto(enum bpf_func_id func_id)
 	}
 }
 
+static const struct bpf_func_proto *
+phys_dev_func_proto(enum bpf_func_id func_id)
+{
+	return sk_filter_func_proto(func_id);
+}
+
 static bool __is_valid_access(int off, int size, enum bpf_access_type type)
 {
 	/* check bounds */
@@ -2076,6 +2082,36 @@ static bool tc_cls_act_is_valid_access(int off, int size,
 	return __is_valid_access(off, size, type);
 }
 
+static bool __is_valid_phys_dev_access(int off, int size,
+				       enum bpf_access_type type)
+{
+	if (off < 0 || off >= sizeof(struct bpf_phys_dev_md))
+		return false;
+
+	if (off % size != 0)
+		return false;
+
+	if (size != 4)
+		return false;
+
+	return true;
+}
+
+static bool phys_dev_is_valid_access(int off, int size,
+				     enum bpf_access_type type)
+{
+	if (type == BPF_WRITE)
+		return false;
+
+	switch (off) {
+	case offsetof(struct bpf_phys_dev_md, len):
+		break;
+	default:
+		return false;
+	}
+	return __is_valid_phys_dev_access(off, size, type);
+}
+
 static u32 bpf_net_convert_ctx_access(enum bpf_access_type type, int dst_reg,
 				      int src_reg, int ctx_off,
 				      struct bpf_insn *insn_buf,
@@ -2213,6 +2249,26 @@ static u32 bpf_net_convert_ctx_access(enum bpf_access_type type, int dst_reg,
 	return insn - insn_buf;
 }
 
+static u32 bpf_phys_dev_convert_ctx_access(enum bpf_access_type type,
+					   int dst_reg, int src_reg,
+					   int ctx_off,
+					   struct bpf_insn *insn_buf,
+					   struct bpf_prog *prog)
+{
+	struct bpf_insn *insn = insn_buf;
+
+	switch (ctx_off) {
+	case offsetof(struct bpf_phys_dev_md, len):
+		BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, len) != 4);
+
+		*insn++ = BPF_LDX_MEM(BPF_W, dst_reg, src_reg,
+				      offsetof(struct sk_buff, len));
+		break;
+	}
+
+	return insn - insn_buf;
+}
+
 static const struct bpf_verifier_ops sk_filter_ops = {
 	.get_func_proto = sk_filter_func_proto,
 	.is_valid_access = sk_filter_is_valid_access,
@@ -2225,6 +2281,12 @@ static const struct bpf_verifier_ops tc_cls_act_ops = {
 	.convert_ctx_access = bpf_net_convert_ctx_access,
 };
 
+static const struct bpf_verifier_ops phys_dev_ops = {
+	.get_func_proto = phys_dev_func_proto,
+	.is_valid_access = phys_dev_is_valid_access,
+	.convert_ctx_access = bpf_phys_dev_convert_ctx_access,
+};
+
 static struct bpf_prog_type_list sk_filter_type __read_mostly = {
 	.ops = &sk_filter_ops,
 	.type = BPF_PROG_TYPE_SOCKET_FILTER,
@@ -2240,11 +2302,17 @@ static struct bpf_prog_type_list sched_act_type __read_mostly = {
 	.type = BPF_PROG_TYPE_SCHED_ACT,
 };
 
+static struct bpf_prog_type_list phys_dev_type __read_mostly = {
+	.ops = &phys_dev_ops,
+	.type = BPF_PROG_TYPE_PHYS_DEV,
+};
+
 static int __init register_sk_filter_ops(void)
 {
 	bpf_register_prog_type(&sk_filter_type);
 	bpf_register_prog_type(&sched_cls_type);
 	bpf_register_prog_type(&sched_act_type);
+	bpf_register_prog_type(&phys_dev_type);
 
 	return 0;
 }
-- 
2.8.0

^ permalink raw reply related

* [RFC PATCH v2 0/5] Add driver bpf hook for early packet drop
From: Brenden Blanco @ 2016-04-08  4:47 UTC (permalink / raw)
  To: davem
  Cc: Brenden Blanco, netdev, tom, alexei.starovoitov, ogerlitz, daniel,
	brouer, eric.dumazet, ecree, john.fastabend, tgraf, johannes,
	eranlinuxmellanox, lorenzo

This patch set introduces new infrastructure for programmatically
processing packets in the earliest stages of rx, as part of an effort
others are calling Express Data Path (XDP) [1]. Start this effort by
introducing a new bpf program type for early packet filtering, before even
an skb has been allocated.

With this, hope to enable line rate filtering, with this initial
implementation providing drop/allow action only.

Patch 1 introduces the new prog type and helpers for validating the bpf
program. A new userspace struct is defined containing only len as a field,
with others to follow in the future.
In patch 2, create a new ndo to pass the fd to support drivers. 
In patch 3, expose a new rtnl option to userspace.
In patch 4, enable support in mlx4 driver. No skb allocation is required,
instead a static percpu skb is kept in the driver and minimally initialized
for each driver frag.
In patch 5, create a sample drop and count program. With single core,
achieved ~20 Mpps drop rate on a 40G mlx4. This includes packet data
access, bpf array lookup, and increment.

Interestingly, accessing packet data from the program did not have a
noticeable impact on performance. Even so, future enhancements to
prefetching / batching / page-allocs should hopefully improve the
performance in this path.

[1] https://github.com/iovisor/bpf-docs/blob/master/Express_Data_Path.pdf

v2:
  1/5: Drop xdp from types, instead consistently use bpf_phys_dev_.
    Introduce enum for return values from phys_dev hook.
  2/5: Move prog->type check to just before invoking ndo.
    Change ndo to take a bpf_prog * instead of fd.
    Add ndo_bpf_get rather than keeping a bool in the netdev struct.
  3/5: Use ndo_bpf_get to fetch bool.
  4/5: Enforce that only 1 frag is ever given to bpf prog by disallowing
    mtu to increase beyond FRAG_SZ0 when bpf prog is running, or conversely
    to set a bpf prog when priv->num_frags > 1.
    Rename pseudo_skb to bpf_phys_dev_md.
    Implement ndo_bpf_get.
    Add dma sync just before invoking prog.
    Check for explicit bpf return code rather than nonzero.
    Remove increment of rx_dropped.
  5/5: Use explicit bpf return code in example.
    Update commit log with higher pps numbers.

Brenden Blanco (5):
  bpf: add PHYS_DEV prog type for early driver filter
  net: add ndo to set bpf prog in adapter rx
  rtnl: add option for setting link bpf prog
  mlx4: add support for fast rx drop bpf program
  Add sample for adding simple drop program to link

 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |  65 +++++++++++
 drivers/net/ethernet/mellanox/mlx4/en_rx.c     |  25 +++-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |   6 +
 include/linux/netdevice.h                      |  13 +++
 include/uapi/linux/bpf.h                       |  14 +++
 include/uapi/linux/if_link.h                   |   1 +
 kernel/bpf/verifier.c                          |   1 +
 net/core/dev.c                                 |  38 ++++++
 net/core/filter.c                              |  68 +++++++++++
 net/core/rtnetlink.c                           |  12 ++
 samples/bpf/Makefile                           |   4 +
 samples/bpf/bpf_load.c                         |   8 ++
 samples/bpf/netdrvx1_kern.c                    |  26 +++++
 samples/bpf/netdrvx1_user.c                    | 155 +++++++++++++++++++++++++
 14 files changed, 432 insertions(+), 4 deletions(-)
 create mode 100644 samples/bpf/netdrvx1_kern.c
 create mode 100644 samples/bpf/netdrvx1_user.c

-- 
2.8.0

^ permalink raw reply

* [PATCH net] mpls: find_outdev: check for err ptr in addition to NULL check
From: Roopa Prabhu @ 2016-04-08  4:28 UTC (permalink / raw)
  To: davem; +Cc: netdev

From: Roopa Prabhu <roopa@cumulusnetworks.com>

find_outdev calls inet{,6}_fib_lookup_dev() or dev_get_by_index() to
find the output device. In case of an error, inet{,6}_fib_lookup_dev()
returns error pointer and dev_get_by_index() returns NULL. But the function
only checks for NULL and thus can end up calling dev_put on an ERR_PTR.
This patch adds an additional check for err ptr after the NULL check.

Before: Trying to add an mpls route with no oif from user, no available
path to 10.1.1.8 and no default route:
$ip -f mpls route add 100 as 200 via inet 10.1.1.8
[  822.337195] BUG: unable to handle kernel NULL pointer dereference at
00000000000003a3
[  822.340033] IP: [<ffffffff8148781e>] mpls_nh_assign_dev+0x10b/0x182
[  822.340033] PGD 1db38067 PUD 1de9e067 PMD 0
[  822.340033] Oops: 0000 [#1] SMP
[  822.340033] Modules linked in:
[  822.340033] CPU: 0 PID: 11148 Comm: ip Not tainted 4.5.0-rc7+ #54
[  822.340033] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org
04/01/2014
[  822.340033] task: ffff88001db82580 ti: ffff88001dad4000 task.ti:
ffff88001dad4000
[  822.340033] RIP: 0010:[<ffffffff8148781e>]  [<ffffffff8148781e>]
mpls_nh_assign_dev+0x10b/0x182
[  822.340033] RSP: 0018:ffff88001dad7a88  EFLAGS: 00010282
[  822.340033] RAX: ffffffffffffff9b RBX: ffffffffffffff9b RCX:
0000000000000002
[  822.340033] RDX: 00000000ffffff9b RSI: 0000000000000008 RDI:
0000000000000000
[  822.340033] RBP: ffff88001ddc9ea0 R08: ffff88001e9f1768 R09:
0000000000000000
[  822.340033] R10: ffff88001d9c1100 R11: ffff88001e3c89f0 R12:
ffffffff8187e0c0
[  822.340033] R13: ffffffff8187e0c0 R14: ffff88001ddc9e80 R15:
0000000000000004
[  822.340033] FS:  00007ff9ed798700(0000) GS:ffff88001fc00000(0000)
knlGS:0000000000000000
[  822.340033] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  822.340033] CR2: 00000000000003a3 CR3: 000000001de89000 CR4:
00000000000006f0
[  822.340033] Stack:
[  822.340033]  0000000000000000 0000000100000000 0000000000000000
0000000000000000
[  822.340033]  0000000000000000 0801010a00000000 0000000000000000
0000000000000000
[  822.340033]  0000000000000004 ffffffff8148749b ffffffff8187e0c0
000000000000001c
[  822.340033] Call Trace:
[  822.340033]  [<ffffffff8148749b>] ? mpls_rt_alloc+0x2b/0x3e
[  822.340033]  [<ffffffff81488e66>] ? mpls_rtm_newroute+0x358/0x3e2
[  822.340033]  [<ffffffff810e7bbc>] ? get_page+0x5/0xa
[  822.340033]  [<ffffffff813b7d94>] ? rtnetlink_rcv_msg+0x17e/0x191
[  822.340033]  [<ffffffff8111794e>] ? __kmalloc_track_caller+0x8c/0x9e
[  822.340033]  [<ffffffff813c9393>] ?
rht_key_hashfn.isra.20.constprop.57+0x14/0x1f
[  822.340033]  [<ffffffff813b7c16>] ? __rtnl_unlock+0xc/0xc
[  822.340033]  [<ffffffff813cb794>] ? netlink_rcv_skb+0x36/0x82
[  822.340033]  [<ffffffff813b4507>] ? rtnetlink_rcv+0x1f/0x28
[  822.340033]  [<ffffffff813cb2b1>] ? netlink_unicast+0x106/0x189
[  822.340033]  [<ffffffff813cb5b3>] ? netlink_sendmsg+0x27f/0x2c8
[  822.340033]  [<ffffffff81392ede>] ? sock_sendmsg_nosec+0x10/0x1b
[  822.340033]  [<ffffffff81393df1>] ? ___sys_sendmsg+0x182/0x1e3
[  822.340033]  [<ffffffff810e4f35>] ?
__alloc_pages_nodemask+0x11c/0x1e4
[  822.340033]  [<ffffffff8110619c>] ? PageAnon+0x5/0xd
[  822.340033]  [<ffffffff811062fe>] ? __page_set_anon_rmap+0x45/0x52
[  822.340033]  [<ffffffff810e7bbc>] ? get_page+0x5/0xa
[  822.340033]  [<ffffffff810e85ab>] ? __lru_cache_add+0x1a/0x3a
[  822.340033]  [<ffffffff81087ea9>] ? current_kernel_time64+0x9/0x30
[  822.340033]  [<ffffffff813940c4>] ? __sys_sendmsg+0x3c/0x5a
[  822.340033]  [<ffffffff8148f597>] ?
entry_SYSCALL_64_fastpath+0x12/0x6a
[  822.340033] Code: 83 08 04 00 00 65 ff 00 48 8b 3c 24 e8 40 7c f2 ff
eb 13 48 c7 c3 9f ff ff ff eb 0f 89 ce e8 f1 ae f1 ff 48 89 c3 48 85 db
74 15 <48> 8b 83 08 04 00 00 65 ff 08 48 81 fb 00 f0 ff ff 76 0d eb 07
[  822.340033] RIP  [<ffffffff8148781e>] mpls_nh_assign_dev+0x10b/0x182
[  822.340033]  RSP <ffff88001dad7a88>
[  822.340033] CR2: 00000000000003a3
[  822.435363] ---[ end trace 98cc65e6f6b8bf11 ]---

After patch:
$ip -f mpls route add 100 as 200 via inet 10.1.1.8
RTNETLINK answers: Network is unreachable

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reported-by: David Miller <davem@davemloft.net>
---
I completely missed this during negative testing of my original
submission because of a default route always present. It was always
picking the default device in the failure case and no crash. This time
tested with no default route.

Thanks for catching it.

 net/mpls/af_mpls.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index b18c5ed..0b80a71 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -543,6 +543,9 @@ static struct net_device *find_outdev(struct net *net,
 	if (!dev)
 		return ERR_PTR(-ENODEV);
 
+	if (IS_ERR(dev))
+		return dev;
+
 	/* The caller is holding rtnl anyways, so release the dev reference */
 	dev_put(dev);
 
-- 
1.9.1

^ permalink raw reply related

* [net-next 18/18] ixgbe: Bump version number
From: Jeff Kirsher @ 2016-04-08  3:21 UTC (permalink / raw)
  To: davem; +Cc: Mark Rustad, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1460085673-87056-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mark Rustad <mark.d.rustad@intel.com>

Update ixgbe version number.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 1a7bfcf..2976df7 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -70,7 +70,7 @@ char ixgbe_default_device_descr[] =
 static char ixgbe_default_device_descr[] =
 			      "Intel(R) 10 Gigabit Network Connection";
 #endif
-#define DRV_VERSION "4.2.1-k"
+#define DRV_VERSION "4.4.0-k"
 const char ixgbe_driver_version[] = DRV_VERSION;
 static const char ixgbe_copyright[] =
 				"Copyright (c) 1999-2016 Intel Corporation.";
-- 
2.5.5

^ permalink raw reply related

* [net-next 17/18] ixgbe: Add KR backplane support for x550em_a
From: Jeff Kirsher @ 2016-04-08  3:21 UTC (permalink / raw)
  To: davem; +Cc: Mark Rustad, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1460085673-87056-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mark Rustad <mark.d.rustad@intel.com>

Add support for x550em_a-based KR backplane devices.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  2 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h |  2 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c | 18 ++++++++++++++----
 3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index c96af3f..1a7bfcf 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -131,6 +131,8 @@ static const struct pci_device_id ixgbe_pci_tbl[] = {
 	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_KR), board_X550EM_x},
 	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_10G_T), board_X550EM_x},
 	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_SFP), board_X550EM_x},
+	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_A_KR), board_x550em_a },
+	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_A_KR_L), board_x550em_a },
 	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_A_SFP_N), board_x550em_a },
 	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_A_SGMII), board_x550em_a },
 	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_A_SGMII_L), board_x550em_a },
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
index 50e8bc0..ba3b837 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
@@ -81,6 +81,8 @@
 #define IXGBE_DEV_ID_X550EM_X_SFP	0x15AC
 #define IXGBE_DEV_ID_X550EM_X_10G_T	0x15AD
 #define IXGBE_DEV_ID_X550EM_X_1G_T	0x15AE
+#define IXGBE_DEV_ID_X550EM_A_KR	0x15C2
+#define IXGBE_DEV_ID_X550EM_A_KR_L	0x15C3
 #define IXGBE_DEV_ID_X550EM_A_SFP_N	0x15C4
 #define IXGBE_DEV_ID_X550EM_A_SGMII	0x15C6
 #define IXGBE_DEV_ID_X550EM_A_SGMII_L	0x15C7
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
index 81e5d54..c71e93e 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
@@ -291,6 +291,8 @@ static s32 ixgbe_identify_phy_x550em(struct ixgbe_hw *hw)
 		hw->phy.type = ixgbe_phy_x550em_kx4;
 		break;
 	case IXGBE_DEV_ID_X550EM_X_KR:
+	case IXGBE_DEV_ID_X550EM_A_KR:
+	case IXGBE_DEV_ID_X550EM_A_KR_L:
 		hw->phy.type = ixgbe_phy_x550em_kr;
 		break;
 	case IXGBE_DEV_ID_X550EM_X_1G_T:
@@ -1984,13 +1986,17 @@ static s32 ixgbe_setup_kx4_x550em(struct ixgbe_hw *hw)
 	return status;
 }
 
-/**  ixgbe_setup_kr_x550em - Configure the KR PHY.
- *   @hw: pointer to hardware structure
+/**
+ * ixgbe_setup_kr_x550em - Configure the KR PHY
+ * @hw: pointer to hardware structure
  *
- *   Configures the integrated KR PHY.
+ * Configures the integrated KR PHY for X550EM_x.
  **/
 static s32 ixgbe_setup_kr_x550em(struct ixgbe_hw *hw)
 {
+	if (hw->mac.type != ixgbe_mac_X550EM_x)
+		return 0;
+
 	return ixgbe_setup_kr_speed_x550em(hw, hw->phy.autoneg_advertised);
 }
 
@@ -2196,7 +2202,9 @@ static s32 ixgbe_setup_fc_x550em(struct ixgbe_hw *hw)
 		return IXGBE_ERR_CONFIG;
 	}
 
-	if (hw->device_id != IXGBE_DEV_ID_X550EM_X_KR)
+	if (hw->device_id != IXGBE_DEV_ID_X550EM_X_KR &&
+	    hw->device_id != IXGBE_DEV_ID_X550EM_A_KR &&
+	    hw->device_id != IXGBE_DEV_ID_X550EM_A_KR_L)
 		return 0;
 
 	rc = hw->mac.ops.read_iosf_sb_reg(hw,
@@ -2437,6 +2445,8 @@ static enum ixgbe_media_type ixgbe_get_media_type_X550em(struct ixgbe_hw *hw)
 		/* Fallthrough */
 	case IXGBE_DEV_ID_X550EM_X_KR:
 	case IXGBE_DEV_ID_X550EM_X_KX4:
+	case IXGBE_DEV_ID_X550EM_A_KR:
+	case IXGBE_DEV_ID_X550EM_A_KR_L:
 		media_type = ixgbe_media_type_backplane;
 		break;
 	case IXGBE_DEV_ID_X550EM_X_SFP:
-- 
2.5.5

^ permalink raw reply related

* [net-next 16/18] ixgbe: Add support for SGMII backplane interface
From: Jeff Kirsher @ 2016-04-08  3:21 UTC (permalink / raw)
  To: davem; +Cc: Mark Rustad, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1460085673-87056-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mark Rustad <mark.d.rustad@intel.com>

Add support for an SGMII backplane interface.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  2 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h |  9 +++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c | 58 +++++++++++++++++++++++++++
 3 files changed, 69 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 93db4bf..c96af3f 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -132,6 +132,8 @@ static const struct pci_device_id ixgbe_pci_tbl[] = {
 	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_10G_T), board_X550EM_x},
 	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_SFP), board_X550EM_x},
 	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_A_SFP_N), board_x550em_a },
+	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_A_SGMII), board_x550em_a },
+	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_A_SGMII_L), board_x550em_a },
 	{PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_A_SFP), board_x550em_a },
 	/* required last entry */
 	{0, }
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
index fbbc132..50e8bc0 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
@@ -82,6 +82,8 @@
 #define IXGBE_DEV_ID_X550EM_X_10G_T	0x15AD
 #define IXGBE_DEV_ID_X550EM_X_1G_T	0x15AE
 #define IXGBE_DEV_ID_X550EM_A_SFP_N	0x15C4
+#define IXGBE_DEV_ID_X550EM_A_SGMII	0x15C6
+#define IXGBE_DEV_ID_X550EM_A_SGMII_L	0x15C7
 #define IXGBE_DEV_ID_X550EM_A_SFP	0x15CE
 
 /* VF Device IDs */
@@ -3065,6 +3067,7 @@ enum ixgbe_phy_type {
 	ixgbe_phy_qsfp_intel,
 	ixgbe_phy_qsfp_unknown,
 	ixgbe_phy_sfp_unsupported,
+	ixgbe_phy_sgmii,
 	ixgbe_phy_generic
 };
 
@@ -3582,6 +3585,7 @@ struct ixgbe_info {
 #define IXGBE_KRM_LINK_CTRL_1(P)	((P) ? 0x820C : 0x420C)
 #define IXGBE_KRM_AN_CNTL_1(P)		((P) ? 0x822C : 0x422C)
 #define IXGBE_KRM_AN_CNTL_8(P)		((P) ? 0x8248 : 0x4248)
+#define IXGBE_KRM_SGMII_CTRL(P)		((P) ? 0x82A0 : 0x42A0)
 #define IXGBE_KRM_DSP_TXFFE_STATE_4(P)	((P) ? 0x8634 : 0x4634)
 #define IXGBE_KRM_DSP_TXFFE_STATE_5(P)	((P) ? 0x8638 : 0x4638)
 #define IXGBE_KRM_RX_TRN_LINKUP_CTRL(P)	((P) ? 0x8B00 : 0x4B00)
@@ -3595,6 +3599,8 @@ struct ixgbe_info {
 #define IXGBE_KRM_LINK_CTRL_1_TETH_FORCE_SPEED_MASK	(0x7 << 8)
 #define IXGBE_KRM_LINK_CTRL_1_TETH_FORCE_SPEED_1G	(2 << 8)
 #define IXGBE_KRM_LINK_CTRL_1_TETH_FORCE_SPEED_10G	(4 << 8)
+#define IXGBE_KRM_LINK_CTRL_1_TETH_AN_SGMII_EN		BIT(12)
+#define IXGBE_KRM_LINK_CTRL_1_TETH_AN_CLAUSE_37_EN	BIT(13)
 #define IXGBE_KRM_LINK_CTRL_1_TETH_AN_FEC_REQ		(1 << 14)
 #define IXGBE_KRM_LINK_CTRL_1_TETH_AN_CAP_FEC		(1 << 15)
 #define IXGBE_KRM_LINK_CTRL_1_TETH_AN_CAP_KX		(1 << 16)
@@ -3610,6 +3616,9 @@ struct ixgbe_info {
 #define IXGBE_KRM_AN_CNTL_8_LINEAR			BIT(0)
 #define IXGBE_KRM_AN_CNTL_8_LIMITING			BIT(1)
 
+#define IXGBE_KRM_SGMII_CTRL_MAC_TAR_FORCE_100_D	BIT(12)
+#define IXGBE_KRM_SGMII_CTRL_MAC_TAR_FORCE_10_D		BIT(19)
+
 #define IXGBE_KRM_DSP_TXFFE_STATE_C0_EN			(1 << 6)
 #define IXGBE_KRM_DSP_TXFFE_STATE_CP1_CN1_EN		(1 << 15)
 #define IXGBE_KRM_DSP_TXFFE_STATE_CO_ADAPT_EN		(1 << 16)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
index a9d86b3..81e5d54 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
@@ -1558,6 +1558,57 @@ static s32 ixgbe_check_link_t_X550em(struct ixgbe_hw *hw,
 	return 0;
 }
 
+/**
+ * ixgbe_setup_sgmii - Set up link for sgmii
+ * @hw: pointer to hardware structure
+ */
+static s32
+ixgbe_setup_sgmii(struct ixgbe_hw *hw, __always_unused ixgbe_link_speed speed,
+		  __always_unused bool autoneg_wait_to_complete)
+{
+	struct ixgbe_mac_info *mac = &hw->mac;
+	u32 lval, sval;
+	s32 rc;
+
+	rc = mac->ops.read_iosf_sb_reg(hw,
+				       IXGBE_KRM_LINK_CTRL_1(hw->bus.lan_id),
+				       IXGBE_SB_IOSF_TARGET_KR_PHY, &lval);
+	if (rc)
+		return rc;
+
+	lval &= ~IXGBE_KRM_LINK_CTRL_1_TETH_AN_ENABLE;
+	lval &= ~IXGBE_KRM_LINK_CTRL_1_TETH_FORCE_SPEED_MASK;
+	lval |= IXGBE_KRM_LINK_CTRL_1_TETH_AN_SGMII_EN;
+	lval |= IXGBE_KRM_LINK_CTRL_1_TETH_AN_CLAUSE_37_EN;
+	lval |= IXGBE_KRM_LINK_CTRL_1_TETH_FORCE_SPEED_1G;
+	rc = mac->ops.write_iosf_sb_reg(hw,
+					IXGBE_KRM_LINK_CTRL_1(hw->bus.lan_id),
+					IXGBE_SB_IOSF_TARGET_KR_PHY, lval);
+	if (rc)
+		return rc;
+
+	rc = mac->ops.read_iosf_sb_reg(hw,
+				       IXGBE_KRM_SGMII_CTRL(hw->bus.lan_id),
+				       IXGBE_SB_IOSF_TARGET_KR_PHY, &sval);
+	if (rc)
+		return rc;
+
+	sval |= IXGBE_KRM_SGMII_CTRL_MAC_TAR_FORCE_10_D;
+	sval |= IXGBE_KRM_SGMII_CTRL_MAC_TAR_FORCE_100_D;
+	rc = mac->ops.write_iosf_sb_reg(hw,
+					IXGBE_KRM_SGMII_CTRL(hw->bus.lan_id),
+					IXGBE_SB_IOSF_TARGET_KR_PHY, sval);
+	if (rc)
+		return rc;
+
+	lval |= IXGBE_KRM_LINK_CTRL_1_TETH_AN_RESTART;
+	rc = mac->ops.write_iosf_sb_reg(hw,
+					IXGBE_KRM_LINK_CTRL_1(hw->bus.lan_id),
+					IXGBE_SB_IOSF_TARGET_KR_PHY, lval);
+
+	return rc;
+}
+
 /** ixgbe_init_mac_link_ops_X550em - init mac link function pointers
  *  @hw: pointer to hardware structure
  **/
@@ -1597,6 +1648,9 @@ static void ixgbe_init_mac_link_ops_X550em(struct ixgbe_hw *hw)
 		mac->ops.check_link = ixgbe_check_link_t_X550em;
 		return;
 	case ixgbe_media_type_backplane:
+		if (hw->device_id == IXGBE_DEV_ID_X550EM_A_SGMII ||
+		    hw->device_id == IXGBE_DEV_ID_X550EM_A_SGMII_L)
+			mac->ops.setup_link = ixgbe_setup_sgmii;
 		break;
 	default:
 		mac->ops.setup_fc = ixgbe_setup_fc_x550em;
@@ -2377,6 +2431,10 @@ static enum ixgbe_media_type ixgbe_get_media_type_X550em(struct ixgbe_hw *hw)
 
 	/* Detect if there is a copper PHY attached. */
 	switch (hw->device_id) {
+	case IXGBE_DEV_ID_X550EM_A_SGMII:
+	case IXGBE_DEV_ID_X550EM_A_SGMII_L:
+		hw->phy.type = ixgbe_phy_sgmii;
+		/* Fallthrough */
 	case IXGBE_DEV_ID_X550EM_X_KR:
 	case IXGBE_DEV_ID_X550EM_X_KX4:
 		media_type = ixgbe_media_type_backplane;
-- 
2.5.5

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox