Netdev List
 help / color / mirror / Atom feed
* Re: [RFC PATCH 1/2] net: use cmpxchg instead of spinlock in ptr rings
From: Jesper Dangaard Brouer @ 2016-11-14 11:09 UTC (permalink / raw)
  To: John Fastabend
  Cc: jasowang, netdev, linux-kernel, Michael S. Tsirkin, Jason Wang,
	Mathieu Desnoyers
In-Reply-To: <20161111044408.1547.92737.stgit@john-Precision-Tower-5810>


On Thu, 10 Nov 2016 20:44:08 -0800 John Fastabend <john.fastabend@gmail.com> wrote:

> ---
>  include/linux/ptr_ring_ll.h |  136 +++++++++++++++++++++++++++++++++++++++++++
>  include/linux/skb_array.h   |   25 ++++++++
>  2 files changed, 161 insertions(+)
>  create mode 100644 include/linux/ptr_ring_ll.h
> 
> diff --git a/include/linux/ptr_ring_ll.h b/include/linux/ptr_ring_ll.h
> new file mode 100644
> index 0000000..bcb11f3
> --- /dev/null
> +++ b/include/linux/ptr_ring_ll.h
> @@ -0,0 +1,136 @@
> +/*
> + *	Definitions for the 'struct ptr_ring_ll' datastructure.
> + *
> + *	Author:
> + *		John Fastabend <john.r.fastabend@intel.com>
[...]
> + *
> + *	This is a limited-size FIFO maintaining pointers in FIFO order, with
> + *	one CPU producing entries and another consuming entries from a FIFO.
> + *	extended from ptr_ring_ll to use cmpxchg over spin lock.

It sounds like this is Single Producer Single Consumer (SPSC)
implementation, but your implementation actually is Multi Producer
Multi Consumer (MPMC) capable.

The implementation looks a lot like my alf_queue[1] implementation:
 [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/include/linux/alf_queue.h

If the primary use-case is one CPU producing and another consuming,
then the normal ptr_ring (skb_array) will actually be faster!

The reason is ptr_ring avoids bouncing a cache-line between the CPUs on
every ring access.  This is achieved by having the checks for full
(__ptr_ring_full) and empty (__ptr_ring_empty) use the contents of the
array (NULL value).

I actually implemented two micro-benchmarks to measure the difference
between skb_array[2] and alf_queue[3]:
 [2] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/skb_array_parallel01.c
 [3] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/alf_queue_parallel01.c


> + */
> +
> +#ifndef _LINUX_PTR_RING_LL_H
> +#define _LINUX_PTR_RING_LL_H 1
> +
[...]
> +
> +struct ptr_ring_ll {
> +	u32 prod_size;
> +	u32 prod_mask;
> +	u32 prod_head;
> +	u32 prod_tail;
> +	u32 cons_size;
> +	u32 cons_mask;
> +	u32 cons_head;
> +	u32 cons_tail;
> +
> +	void **queue;
> +};

Your implementation doesn't even split the consumer and producer into
different cachelines (which in practice doesn't help much due to how
the empty/full checks are performed).

> +
> +/* Note: callers invoking this in a loop must use a compiler barrier,
> + * for example cpu_relax(). Callers must hold producer_lock.
> + */
> +static inline int __ptr_ring_ll_produce(struct ptr_ring_ll *r, void *ptr)
> +{
> +	u32 ret, head, tail, next, slots, mask;
> +
> +	do {
> +		head = READ_ONCE(r->prod_head);
> +		mask = READ_ONCE(r->prod_mask);
> +		tail = READ_ONCE(r->cons_tail);

Problem occur here, as the producer need to access/read the consumers
tail, to determine if the queue is not already full (slots avail).
Thus, the next "consumer-CPU" will see the cacheline in wrong state
(Modified/Invalid or Shared).

> +
> +		slots = mask + tail - head;
> +		if (slots < 1)
> +			return -ENOMEM;
> +
> +		next = head + 1;
> +		ret = cmpxchg(&r->prod_head, head, next);
> +	} while (ret != head);
> +
> +	r->queue[head & mask] = ptr;
> +	smp_wmb();
> +
> +	while (r->prod_tail != head)
> +		cpu_relax();
> +
> +	r->prod_tail = next;
> +	return 0;
> +}
> +
> +static inline void *__ptr_ring_ll_consume(struct ptr_ring_ll *r)
> +{
> +	u32 ret, head, tail, next, slots, mask;
> +	void *ptr;
> +
> +	do {
> +		head = READ_ONCE(r->cons_head);
> +		mask = READ_ONCE(r->cons_mask);
> +		tail = READ_ONCE(r->prod_tail);

Like wise the consumer is reading the producer tail (for the empty check).

> +
> +		slots = tail - head;
> +		if (slots < 1)
> +			return ERR_PTR(-ENOMEM);
> +
> +		next = head + 1;
> +		ret = cmpxchg(&r->cons_head, head, next);
> +	} while (ret != head);
> +
> +	ptr = r->queue[head & mask];
> +	smp_rmb();
> +
> +	while (r->cons_tail != head)
> +		cpu_relax();
> +
> +	r->cons_tail = next;
> +	return ptr;
> +}
> +
> +static inline void **__ptr_ring_ll_init_queue_alloc(int size, gfp_t gfp)
> +{
> +	return kzalloc(ALIGN(size * sizeof(void *), SMP_CACHE_BYTES), gfp);
> +}
> +
> +static inline int ptr_ring_ll_init(struct ptr_ring_ll *r, int size, gfp_t gfp)
> +{
> +	r->queue = __ptr_ring_init_queue_alloc(size, gfp);
> +	if (!r->queue)
> +		return -ENOMEM;
> +
> +	r->prod_size = r->cons_size = size;
> +	r->prod_mask = r->cons_mask = size - 1;

Shouldn't we have some check like is_power_of_2(size), as this code
looks like it depend on this.

> +	r->prod_tail = r->prod_head = 0;
> +	r->cons_tail = r->prod_tail = 0;
> +
> +	return 0;
> +}
> +
[...]
> +#endif /* _LINUX_PTR_RING_LL_H  */
> diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h
> index f4dfade..9b43dfd 100644
> --- a/include/linux/skb_array.h
> +++ b/include/linux/skb_array.h
[...]
>  
> +static inline int skb_array_ll_produce(struct skb_array_ll *a, struct sk_buff *skb)
> +{
> +	return __ptr_ring_ll_produce(&a->ring, skb);
> +}
> +
[...]
>  
> +static inline struct sk_buff *skb_array_ll_consume(struct skb_array_ll *a)
> +{
> +	return __ptr_ring_ll_consume(&a->ring);
> +}
> +

Note in the Multi Producer Multi Consumer (MPMC) use-case this type of
queue can be faster than normal ptr_ring.  And in patch2 you implement
bulking, which is where the real benefit shows (in the MPMC case) for
this kind of queue.

What I would really like to see is a lock-free (locked cmpxchg) queue
implementation, what like ptr_ring use the array as empty/full check,
and still (somehow) support bulking.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH] Fixup packets with incorrect ethertype sent by ZTE MF821D
From: Bjørn Mork @ 2016-11-14 11:37 UTC (permalink / raw)
  To: Jussi Peltola; +Cc: netdev
In-Reply-To: <20161114002218.GW2745@pokute.pelzi.net>

Jussi Peltola <plz@plz.fi> writes:

> So here's another stab. The comments and the current implementation are
> not in sync: any non-multicast address starting with a null octet gets
> rewritten, while the comment specifically mentions 00:a0:c6:00:00:00. It
> is certainly not elegant but re-writing all unicast destinations with
> our address does come to mind instead of special cases.

The known bug is related to 00:a0:c6:00:00:00 only.  But the workaround
catches anything starting with 00 for simplicity.  It's a deliberate
trade-off.  Could probably be clearer from the comments, yes.

> This patch fails to handle the invalid destinations in either way so I
> will send another one if you think it's worthwhile to go on. And it
> seems I forgot htons but I need this device for work now so a better
> patch must wait :)
>
> commit 35d3a46b7f1ece70e24386acbdd16af4507cb5f3
> Author: Jussi Peltola <plz@plz.fi>
> Date:   Mon Nov 14 01:45:32 2016 +0200
>
>     Attempt to fix up packets with a broken ethernet header
>     
>     Signed-off-by: Jussi Peltola <plz@plz.fi>
>
> diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
> index 3ff76c6..7308d6b 100644
> --- a/drivers/net/usb/qmi_wwan.c
> +++ b/drivers/net/usb/qmi_wwan.c
> @@ -153,25 +153,57 @@ static const u8 default_modem_addr[ETH_ALEN] = {0x02, 0x50, 0xf3};
>  
>  static const u8 buggy_fw_addr[ETH_ALEN] = {0x00, 0xa0, 0xc6, 0x00, 0x00, 0x00};
>  
> -/* Make up an ethernet header if the packet doesn't have one.
> +/* Check if the ethernet header has an unknown ethertype, and return a
> + * guess of the correct one based on the L3 header, or zero if the type was
> + * known or detection failed.
> + */
> +static __be16 detect_bogus_header(struct sk_buff *skb) {
> +       struct ethhdr *eth_hdr = (struct ethhdr*) skb->data;
> +
> +       switch (eth_hdr->h_proto) {
> +       case ETH_P_IP:
> +       case ETH_P_IPV6:
> +       case ETH_P_ARP:
> +               return 0;
> +       default:
> +               switch (skb->data[14] & 0xf0) {
> +               case 0x40:
> +                       return htons(ETH_P_IP);
> +               case 0x60:
> +                       return htons(ETH_P_IPV6);
> +               default:
> +                       /* pass on undetectable packets */
> +                       return 0;
> +               }
> +       }
> +       /*NOTREACHED*/
> +       return 0;
> +}
> +
> +/* Make up an ethernet header if the packet doesn't have a correct one.
>   *
>   * A firmware bug common among several devices cause them to send raw
>   * IP packets under some circumstances.  There is no way for the
>   * driver/host to know when this will happen.  And even when the bug
>   * hits, some packets will still arrive with an intact header.
>   *
> - * The supported devices are only capably of sending IPv4, IPv6 and
> + * The supported devices are only capable of sending IPv4, IPv6 and
>   * ARP packets on a point-to-point link. Any packet with an ethernet
>   * header will have either our address or a broadcast/multicast
> - * address as destination.  ARP packets will always have a header.
> + * address as destination. ARP packets will always have a header.
>   *
>   * This means that this function will reliably add the appropriate
> - * header iff necessary, provided our hardware address does not start
> + * header if necessary, provided our hardware address does not start
>   * with 4 or 6.
>   *
>   * Another common firmware bug results in all packets being addressed
>   * to 00:a0:c6:00:00:00 despite the host address being different.
> - * This function will also fixup such packets.
> + *
> + * Some devices will send packets with garbage source/destination MACs and
> + * ethertypes.
> + *
> + * This function will try to fix up all such packets.
> + *
>   */
>  static int qmi_wwan_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
>  {
> @@ -179,8 +211,8 @@ static int qmi_wwan_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
>         bool rawip = info->flags & QMI_WWAN_FLAG_RAWIP;
>         __be16 proto;
>  
> -       /* This check is no longer done by usbnet */
> -       if (skb->len < dev->net->hard_header_len)
> +       /* Shorter is definitely invalid and breaks subsequent tests */
> +       if (skb->len < 15)
>                 return 0;
>  
>         switch (skb->data[0] & 0xf0) {


Makes sense, but could we please use some reasonable macro or something
instead of an arbitrary magic number?  Anything shorter than an IP
header will be bogus, for example.



> @@ -190,17 +222,17 @@ static int qmi_wwan_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
>         case 0x60:
>                 proto = htons(ETH_P_IPV6);
>                 break;
> -       case 0x00:
> +       default:
>                 if (rawip)
>                         return 0;
>                 if (is_multicast_ether_addr(skb->data))
>                         return 1;
> -               /* possibly bogus destination - rewrite just in case */
> -               skb_reset_mac_header(skb);
> -               goto fix_dest;
> -       default:
> -               if (rawip)
> -                       return 0;
> +               proto = detect_bogus_header(skb);
> +               if (proto) {
> +                       /* remove terminally broken header */
> +                       skb_pull(skb, ETH_HLEN);
> +                       break;
> +               }
>                 /* pass along other packets without modifications */
>                 return 1;
>         }


Am I missing somehting, or is this removing the bogus destination
address fixup?  In any case, I don't have any device with that bug
available or the time to go and test it.  So I want you to either leave
that part of the code alone, or verify the workaround on a device with
the "00:a0:c6:00:00:00" bug.


> @@ -208,17 +240,17 @@ static int qmi_wwan_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
>                 skb->dev = dev->net; /* normally set by eth_type_trans */
>                 skb->protocol = proto;
>                 return 1;
> +       } else {


Completely unnecessary "else". We return from the other branch.

In any case, this change is not related to the rest of the patch and is
just making review more confusing.  And why would you add more indenting
levels than strictly necessary?


But please: Try to make the device work with raw-ip first. Qualcomm
never managed to fix their fake ethernet mode, and have given up on it.
Newer devices don't have that mode at all.  The device you have does
still have the mode.  But as is pretty obvious from the observed
behaviour: It's completely untested and not really working.  If you look
at the Windows driver, I'm pretty sure you'll find that the only reason
it works is because they use raw-ip mode.

FWIW, I regret having chosen the fake ethernet mode as the default for
qmi_wwan.  I did not anticipate the number of firmware issues Qualcomm
could manage to create simply adding a made-up ethernet header.  And I
did not anticipate them finally dropping it on the floor, forcing us to
support raw-ip mode anyway.

But with raw-ip support, there is absolutely no reason to play around
with fixing up the ethernet header bugs anymore.  If you have an older
device where it sort of works, then by all means use it.  But otherwise:
Use raw-ip mode.  I don't think I would have wanted any of the existing
header fixups either if the driver had supported raw-ip when they were
proposed.


Bjørn

^ permalink raw reply

* Re: [PATCH 0/5] net: thunderx: Miscellaneous fixes
From: Matthias Brugger @ 2016-11-14 12:01 UTC (permalink / raw)
  To: sunil.kovvuri, netdev; +Cc: Sunil Goutham, linux-kernel, linux-arm-kernel
In-Reply-To: <1479120886-13425-1-git-send-email-sunil.kovvuri@gmail.com>



On 14/11/16 11:54, sunil.kovvuri@gmail.com wrote:
> From: Sunil Goutham <sgoutham@cavium.com>
>
> This patchset includes fixes for incorrect LMAC credits,
> unreliable driver statistics, memory leak upon interface
> down e.t.c
>

Are these fixes relevant to for older kernels as well?
If so, please add "Cc: stable@vger.kernel.org" to the Sigend-off list.

Thanks,
Matthias

> Radha Mohan Chintakuntla (1):
>   net: thunderx: Introduce BGX_ID_MASK macro to extract bgx_id
>
> Sunil Goutham (4):
>   net: thunderx: Program LMAC credits based on MTU
>   net: thunderx: Fix configuration of L3/L4 length checking
>   net: thunderx: Fix VF driver's interface statistics
>   net: thunderx: Fix memory leak and other issues upon interface toggle
>
>  drivers/net/ethernet/cavium/thunder/nic.h          |  64 +++++----
>  drivers/net/ethernet/cavium/thunder/nic_main.c     |  37 +++--
>  drivers/net/ethernet/cavium/thunder/nic_reg.h      |   1 +
>  .../net/ethernet/cavium/thunder/nicvf_ethtool.c    | 105 +++++++-------
>  drivers/net/ethernet/cavium/thunder/nicvf_main.c   | 153 +++++++++++----------
>  drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 118 +++++++++-------
>  drivers/net/ethernet/cavium/thunder/nicvf_queues.h |  24 +---
>  drivers/net/ethernet/cavium/thunder/thunder_bgx.c  |   4 +-
>  drivers/net/ethernet/cavium/thunder/thunder_bgx.h  |   2 +
>  9 files changed, 274 insertions(+), 234 deletions(-)
>

^ permalink raw reply

* Re: Debugging Ethernet issues
From: Mason @ 2016-11-14 12:13 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, Florian Fainelli, Mans Rullgard, Sergei Shtylyov,
	Tom Lendacky, Zach Brown, Shaohui Xie, Tim Beale, Brian Hill,
	Vince Bridgers, Balakumaran Kannan, David S. Miller,
	Sebastian Frias, Kirill Kapranov
In-Reply-To: <20161113030919.GA2892@lunn.ch>

On 13/11/2016 04:09, Andrew Lunn wrote:

> Mason wrote:
> 
>> When connected to a Gigabit switch
>> 3.4 negotiates a LAN DHCP setup instantly
>> 4.7 requires over 5 seconds to do so
> 
> When you run tcpdump on the DHCP server, are you noticing the first
> request is missing?
> 
> What can happen is the dhclient gets started immediately and sends out
> its first request before auto-negotiation has finished. So this first packet
> gets lost. The retransmit after a few seconds is then successful.

This is what happens on 3.4

# time udhcpc | while read LINE; do date; echo $LINE; done
Mon Nov 14 11:57:12 UTC 2016
udhcpc (v1.22.1) started
Mon Nov 14 11:57:12 UTC 2016
Sending discover...
[   50.150000] tangox-enet.0: link up (1000 Mbps - Full Duplex)
Mon Nov 14 11:57:15 UTC 2016
Sending discover...
Mon Nov 14 11:57:16 UTC 2016
Sending select for 172.27.64.58...
Mon Nov 14 11:57:17 UTC 2016
Lease of 172.27.64.58 obtained, lease time 604800
Mon Nov 14 11:57:17 UTC 2016
deleting routers
Mon Nov 14 11:57:17 UTC 2016
adding dns 172.27.0.17

real    0m4.704s
user    0m0.030s
sys     0m0.550s


The corresponding log on the DHCP server was

# tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 bytes
11:57:16.095474 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
11:57:16.095638 IP 172.27.64.1 > 172.27.64.58: ICMP echo request, id 29883, seq 0, length 28
11:57:17.096740 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
11:57:17.097182 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
11:57:17.202842 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
11:57:21.101946 ARP, Request who-has 172.27.64.58 tell 172.27.64.1, length 28
11:57:21.102182 ARP, Reply 172.27.64.58 is-at 00:16:e8:4b:b0:7d, length 46


This is a different log which I got earlier, but can no longer reproduce:

# tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 bytes
11:08:09.610662 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
11:08:10.642852 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
11:08:10.643276 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
11:08:10.790526 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
11:08:11.638146 ARP, Reply 172.27.64.58 is-at 00:16:e8:4b:b0:7d, length 46
11:08:11.638156 IP 172.27.64.1 > 172.27.64.58: ICMP echo request, id 29883, seq 0, length 28
11:08:11.638345 IP 172.27.64.58 > 172.27.64.1: ICMP echo reply, id 29883, seq 0, length 28
11:08:16.642811 ARP, Request who-has 172.27.64.1 tell 172.27.64.58, length 46
11:08:16.642822 ARP, Reply 172.27.64.1 is-at 00:15:17:24:e0:81, length 28



This is what happens on v4.7

# time udhcpc | while read LINE; do date; echo $LINE; done
Mon Nov 14 11:51:25 UTC 2016
udhcpc (v1.22.1) started
Mon Nov 14 11:51:25 UTC 2016
Sending discover...
Mon Nov 14 11:51:28 UTC 2016
Sending discover...
[  342.658572] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
Mon Nov 14 11:51:32 UTC 2016
Sending discover...
Mon Nov 14 11:51:33 UTC 2016
Sending select for 172.27.64.58...
Mon Nov 14 11:51:33 UTC 2016
Lease of 172.27.64.58 obtained, lease time 604800
Mon Nov 14 11:51:33 UTC 2016
deleting routers
Mon Nov 14 11:51:33 UTC 2016
adding dns 172.27.0.17

real    0m7.348s
user    0m0.053s
sys     0m0.077s


The corresponding log on the DHCP server was

# tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 bytes
11:51:31.957245 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
11:51:31.957409 IP 172.27.64.1 > 172.27.64.58: ICMP echo request, id 29883, seq 0, length 28
11:51:32.958514 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
11:51:32.970538 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
11:51:33.038205 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
11:51:36.957949 ARP, Request who-has 172.27.64.58 tell 172.27.64.1, length 28
11:51:36.958112 ARP, Reply 172.27.64.58 is-at 00:16:e8:4b:b0:7d, length 46


For reference, here are the ethtool statistics on 4.7 after the DHCP setup:

# ethtool -S eth0
NIC statistics:
     rx_bytes_ok: 1747
     rx_frames_ok: 9
     rx_undersize_frames: 0
     rx_fragment_frames: 0
     rx_64_byte_frames: 3
     rx_127_byte_frames: 2
     rx_255_byte_frames: 1
     rx_511_byte_frames: 3
     rx_1023_byte_frames: 0
     rx_max_size_frames: 0
     rx_oversize_frames: 0
     rx_bad_fcs_frames: 0
     rx_broadcast_frames: 4
     rx_multicast_frames: 1
     rx_control_frames: 0
     rx_pause_frames: 0
     rx_unsup_control_frames: 0
     rx_align_error_frames: 0
     rx_overrun_frames: 0
     rx_jabber_frames: 0
     rx_bytes: 1747
     rx_frames: 9
     tx_bytes_ok: 756
     tx_frames_ok: 3
     tx_64_byte_frames: 1
     tx_127_byte_frames: 0
     tx_255_byte_frames: 0
     tx_511_byte_frames: 2
     tx_1023_byte_frames: 0
     tx_max_size_frames: 0
     tx_oversize_frames: 0
     tx_broadcast_frames: 2
     tx_multicast_frames: 0
     tx_control_frames: 0
     tx_pause_frames: 0
     tx_underrun_frames: 0
     tx_single_collision_frames: 0
     tx_multi_collision_frames: 0
     tx_deferred_collision_frames: 0
     tx_late_collision_frames: 0
     tx_excessive_collision_frames: 0
     tx_bytes: 756
     tx_frames: 3
     tx_collisions: 0


Regards.

^ permalink raw reply

* Re: [PATCH 3/5] net: thunderx: Fix configuration of L3/L4 length checking
From: Corentin Labbe @ 2016-11-14 12:33 UTC (permalink / raw)
  To: sunil.kovvuri; +Cc: netdev, Sunil Goutham, linux-kernel, linux-arm-kernel
In-Reply-To: <1479120886-13425-4-git-send-email-sunil.kovvuri@gmail.com>

On Mon, Nov 14, 2016 at 04:24:44PM +0530, sunil.kovvuri@gmail.com wrote:
> From: Sunil Goutham <sgoutham@cavium.com>
> 
> This patch fixes enabling of HW verification of L3/L4 length and
> TCP/UDP checksum which is currently being cleared. Also fixed VLAN
> stripping config which is being cleared when multiqset is enabled.
> 
> Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
> ---
>  drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
> index f0e0ca6..3050177 100644
> --- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
> +++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
> @@ -538,9 +538,12 @@ static void nicvf_rcv_queue_config(struct nicvf *nic, struct queue_set *qs,
>  	mbx.rq.cfg = (1ULL << 62) | (RQ_CQ_DROP << 8);
>  	nicvf_send_msg_to_pf(nic, &mbx);
>  
> -	nicvf_queue_reg_write(nic, NIC_QSET_RQ_GEN_CFG, 0, 0x00);
> -	if (!nic->sqs_mode)
> +	if (!nic->sqs_mode && (qidx == 0)) {
> +		/* Enable checking L3/L4 length and TCP/UDP checksums */
> +		nicvf_queue_reg_write(nic, NIC_QSET_RQ_GEN_CFG, 0,
> +				      ((1 << 24) | (1 << 23) | (1 << 21)));

Hello

You could use the BIT() macro here

Regards

^ permalink raw reply

* Re: Debugging Ethernet issues
From: Mason @ 2016-11-14 12:45 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Florian Fainelli, Mans Rullgard, Sergei Shtylyov,
	Tom Lendacky, Zach Brown, Shaohui Xie, Tim Beale, Brian Hill,
	Vince Bridgers, Balakumaran Kannan, David S. Miller,
	Sebastian Frias, Kirill Kapranov
In-Reply-To: <5829AA80.2090102@free.fr>

On 14/11/2016 13:13, Mason wrote:

> This is a different log which I got earlier, but can no longer reproduce:
> 
> # tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 bytes
> 11:08:09.610662 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
> 11:08:10.642852 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
> 11:08:10.643276 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
> 11:08:10.790526 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
> 11:08:11.638146 ARP, Reply 172.27.64.58 is-at 00:16:e8:4b:b0:7d, length 46
> 11:08:11.638156 IP 172.27.64.1 > 172.27.64.58: ICMP echo request, id 29883, seq 0, length 28
> 11:08:11.638345 IP 172.27.64.58 > 172.27.64.1: ICMP echo reply, id 29883, seq 0, length 28
> 11:08:16.642811 ARP, Request who-has 172.27.64.1 tell 172.27.64.58, length 46
> 11:08:16.642822 ARP, Reply 172.27.64.1 is-at 00:15:17:24:e0:81, length 28

Additional tests on v4.7

If I set the link up *BEFORE* running the DHCP client, then I get:

# ip link set eth0 up /* Wait 4-5 seconds */
[   69.815303] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx

# time udhcpc | while read LINE; do date; echo $LINE; done
Mon Nov 14 13:27:30 UTC 2016
udhcpc (v1.22.1) started
Mon Nov 14 13:27:30 UTC 2016
Sending discover...
Mon Nov 14 13:27:31 UTC 2016
Sending select for 172.27.64.58...
Mon Nov 14 13:27:32 UTC 2016
Lease of 172.27.64.58 obtained, lease time 604800
Mon Nov 14 13:27:32 UTC 2016
deleting routers
Mon Nov 14 13:27:32 UTC 2016
adding dns 172.27.0.17

real    0m1.292s
user    0m0.037s
sys     0m0.087s

# tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 bytes
13:27:30.922880 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:27:30.923055 IP 172.27.64.1 > 172.27.64.58: ICMP echo request, id 29883, seq 0, length 28
13:27:31.924151 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:27:31.936221 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:27:32.061869 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:27:35.933946 ARP, Request who-has 172.27.64.58 tell 172.27.64.1, length 28
13:27:35.934079 ARP, Reply 172.27.64.58 is-at 00:16:e8:4b:b0:7d, length 46


I did see (once) the 9-packet trace (with the ping echo/reply)

# tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 bytes
13:17:31.494117 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:17:32.495374 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:17:32.510753 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:17:32.591262 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:17:33.494093 ARP, Reply 172.27.64.58 is-at 00:16:e8:4b:b0:7d, length 46
13:17:33.494107 IP 172.27.64.1 > 172.27.64.58: ICMP echo request, id 29883, seq 0, length 28
13:17:33.494242 IP 172.27.64.58 > 172.27.64.1: ICMP echo reply, id 29883, seq 0, length 28
13:17:38.500651 ARP, Request who-has 172.27.64.1 tell 172.27.64.58, length 46
13:17:38.500663 ARP, Reply 172.27.64.1 is-at 00:15:17:24:e0:81, length 28

Since the ping request is sent from the DHCP server, perhaps the server
is checking its ARP table. I think it is not an important difference.


Experiment #2

Set link up, then down, then up. Then send DHCP request.

# ip link set eth0 up
[   39.185326] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[root@buildroot ~]# ip link set eth0 down
/* WAIT A LONG TIME FOR "Link is Down" MESSAGE */
# ip link set eth0 up
[  102.818598] nb8800 26000.ethernet eth0: Link is Down
[  104.828632] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx

Note: isn't it weird that I have to set link up before "link down" message appears?


# time udhcpc | while read LINE; do date; echo $LINE; done
Mon Nov 14 13:40:08 UTC 2016
udhcpc (v1.22.1) started
Mon Nov 14 13:40:08 UTC 2016
Sending discover...
Mon Nov 14 13:40:11 UTC 2016
Sending discover...
Mon Nov 14 13:40:14 UTC 2016
Sending discover...
Mon Nov 14 13:40:37 UTC 2016
Sending discover...
Mon Nov 14 13:40:40 UTC 2016
Sending discover...
Mon Nov 14 13:40:43 UTC 2016
Sending discover...
Mon Nov 14 13:41:06 UTC 2016
Sending discover...
Mon Nov 14 13:41:09 UTC 2016
Sending discover...
Mon Nov 14 13:41:12 UTC 2016
Sending discover...
Mon Nov 14 13:41:35 UTC 2016
Sending discover...
Mon Nov 14 13:41:38 UTC 2016
Sending discover...
Mon Nov 14 13:41:42 UTC 2016
Sending discover...
\x03^C

real    1m37.623s
user    0m0.100s
sys     0m0.053s


# tcpdump -n -i eth1-boards ether host 00:16:e8:4b:b0:7d
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1-boards, link-type EN10MB (Ethernet), capture size 262144 bytes
13:40:08.593122 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:40:09.594365 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:40:11.619772 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:40:11.619925 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:40:14.646372 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:40:14.646535 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:40:37.706093 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:40:37.706257 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:40:40.732693 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:40:40.732827 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:40:43.759342 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:40:43.759475 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:41:06.819024 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:41:06.819201 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:41:09.845671 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:41:09.845807 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:41:12.872271 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:41:12.872396 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:41:35.931994 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:41:35.932162 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:41:38.958593 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:41:38.958742 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
13:41:41.985194 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:e8:4b:b0:7d, length 300
13:41:41.985359 IP 172.27.200.1.67 > 172.27.64.58.68: BOOTP/DHCP, Reply, length 391
^C
24 packets captured
24 packets received by filter
0 packets dropped by kernel


Link appears to be broken (not receiving) after up/down/up sequence.
Despite the rx_frames_ok: 8 ???
And why 8, server sent 12 replies...

# ethtool -S eth0
NIC statistics:
     rx_bytes_ok: 975
     rx_frames_ok: 8
     rx_undersize_frames: 0
     rx_fragment_frames: 0
     rx_64_byte_frames: 6
     rx_127_byte_frames: 0
     rx_255_byte_frames: 1
     rx_511_byte_frames: 1
     rx_1023_byte_frames: 0
     rx_max_size_frames: 0
     rx_oversize_frames: 0
     rx_bad_fcs_frames: 0
     rx_broadcast_frames: 3
     rx_multicast_frames: 1
     rx_control_frames: 0
     rx_pause_frames: 0
     rx_unsup_control_frames: 0
     rx_align_error_frames: 0
     rx_overrun_frames: 0
     rx_jabber_frames: 0
     rx_bytes: 975
     rx_frames: 8
     tx_bytes_ok: 4344
     tx_frames_ok: 15
     tx_64_byte_frames: 3
     tx_127_byte_frames: 0
     tx_255_byte_frames: 0
     tx_511_byte_frames: 12
     tx_1023_byte_frames: 0
     tx_max_size_frames: 0
     tx_oversize_frames: 0
     tx_broadcast_frames: 12
     tx_multicast_frames: 0
     tx_control_frames: 0
     tx_pause_frames: 0
     tx_underrun_frames: 0
     tx_single_collision_frames: 0
     tx_multi_collision_frames: 0
     tx_deferred_collision_frames: 0
     tx_late_collision_frames: 0
     tx_excessive_collision_frames: 0
     tx_bytes: 4344
     tx_frames: 15
     tx_collisions: 0


Will add a few traces, as suggested by Florian.

Regards.

^ permalink raw reply

* [net-next PATCH v2] net: dummy: Introduce dummy virtual functions
From: Phil Sutter @ 2016-11-14 13:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Sabrina Dubroca

The idea for this was born when testing VF support in iproute2 which was
impeded by hardware requirements. In fact, not every VF-capable hardware
driver implements all netdev ops, so testing the interface is still hard
to do even with a well-sorted hardware shelf.

To overcome this and allow for testing the user-kernel interface, this
patch allows to turn dummy into a PF with a configurable amount of VFs.

Due to the assumption that all PFs are PCI devices, this implementation
is not completely straightforward: In order to allow for
rtnl_fill_ifinfo() to see the dummy VFs, a fake PCI parent device is
attached to the dummy netdev. This has to happen at the right spot so
register_netdevice() does not get confused. This patch abuses
ndo_fix_features callback for that. In ndo_uninit callback, the fake
parent is removed again for the same purpose.

Joint work with Sabrina Dubroca.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
Changes since v1:
- Fixed issues reported by kbuild test robot:
  - pci_dev->sriov is only present if CONFIG_PCI_ATS is active.
  - pci_bus_type does not exist if CONFIG_PCI is not defined.
---
 drivers/net/dummy.c | 199 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 197 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 69fc8409a9733..a831537145bd9 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -34,6 +34,8 @@
 #include <linux/etherdevice.h>
 #include <linux/init.h>
 #include <linux/moduleparam.h>
+#include <linux/pci.h>
+#include "../pci/pci.h"		/* for struct pci_sriov */
 #include <linux/rtnetlink.h>
 #include <net/rtnetlink.h>
 #include <linux/u64_stats_sync.h>
@@ -42,6 +44,37 @@
 #define DRV_VERSION	"1.0"
 
 static int numdummies = 1;
+static int num_vfs;
+
+static struct pci_sriov pdev_sriov;
+
+static struct pci_dev pci_pdev = {
+	.is_physfn = 1,
+#ifdef CONFIG_PCI_ATS
+	.sriov = &pdev_sriov,
+#endif
+#ifdef CONFIG_PCI
+	.dev.bus = &pci_bus_type,
+#endif
+};
+
+struct vf_data_storage {
+	unsigned char vf_mac[ETH_ALEN];
+	u16 pf_vlan; /* When set, guest VLAN config not allowed. */
+	u16 pf_qos;
+	__be16 vlan_proto;
+	u16 min_tx_rate;
+	u16 max_tx_rate;
+	u8 spoofchk_enabled;
+	bool rss_query_enabled;
+	u8 trusted;
+	int link_state;
+};
+
+struct dummy_priv {
+	int num_vfs;
+	struct vf_data_storage *vfinfo;
+};
 
 /* fake multicast ability */
 static void set_multicast_list(struct net_device *dev)
@@ -91,15 +124,29 @@ static netdev_tx_t dummy_xmit(struct sk_buff *skb, struct net_device *dev)
 
 static int dummy_dev_init(struct net_device *dev)
 {
+	struct dummy_priv *priv = netdev_priv(dev);
+
 	dev->dstats = netdev_alloc_pcpu_stats(struct pcpu_dstats);
 	if (!dev->dstats)
 		return -ENOMEM;
 
+	priv->num_vfs = num_vfs;
+	priv->vfinfo = NULL;
+
+	if (!num_vfs)
+		return 0;
+
+	priv->vfinfo = kcalloc(num_vfs, sizeof(struct vf_data_storage),
+			       GFP_KERNEL);
+	if (!priv->vfinfo)
+		return -ENOMEM;
+
 	return 0;
 }
 
 static void dummy_dev_uninit(struct net_device *dev)
 {
+	dev->dev.parent = NULL;
 	free_percpu(dev->dstats);
 }
 
@@ -112,6 +159,129 @@ static int dummy_change_carrier(struct net_device *dev, bool new_carrier)
 	return 0;
 }
 
+/* fake, just to set fake PCI parent after netdev_register_kobject() */
+static netdev_features_t dummy_fix_features(struct net_device *dev,
+					    netdev_features_t features)
+{
+	struct dummy_priv *priv = netdev_priv(dev);
+
+	if (priv->num_vfs)
+		dev->dev.parent = &pci_pdev.dev;
+
+	return features;
+}
+
+static int dummy_set_vf_mac(struct net_device *dev, int vf, u8 *mac)
+{
+	struct dummy_priv *priv = netdev_priv(dev);
+
+	if (!is_valid_ether_addr(mac) || (vf >= priv->num_vfs))
+		return -EINVAL;
+
+	memcpy(priv->vfinfo[vf].vf_mac, mac, ETH_ALEN);
+
+	return 0;
+}
+
+static int dummy_set_vf_vlan(struct net_device *dev, int vf,
+			     u16 vlan, u8 qos, __be16 vlan_proto)
+{
+	struct dummy_priv *priv = netdev_priv(dev);
+
+	if ((vf >= priv->num_vfs) || (vlan > 4095) || (qos > 7))
+		return -EINVAL;
+
+	priv->vfinfo[vf].pf_vlan = vlan;
+	priv->vfinfo[vf].pf_qos = qos;
+	priv->vfinfo[vf].vlan_proto = vlan_proto;
+
+	return 0;
+}
+
+static int dummy_set_vf_rate(struct net_device *dev, int vf, int min, int max)
+{
+	struct dummy_priv *priv = netdev_priv(dev);
+
+	if (vf >= priv->num_vfs)
+		return -EINVAL;
+
+	priv->vfinfo[vf].min_tx_rate = min;
+	priv->vfinfo[vf].max_tx_rate = max;
+
+	return 0;
+}
+
+static int dummy_set_vf_spoofchk(struct net_device *dev, int vf, bool val)
+{
+	struct dummy_priv *priv = netdev_priv(dev);
+
+	if (vf >= priv->num_vfs)
+		return -EINVAL;
+
+	priv->vfinfo[vf].spoofchk_enabled = val;
+
+	return 0;
+}
+
+static int dummy_set_vf_rss_query_en(struct net_device *dev, int vf, bool val)
+{
+	struct dummy_priv *priv = netdev_priv(dev);
+
+	if (vf >= priv->num_vfs)
+		return -EINVAL;
+
+	priv->vfinfo[vf].rss_query_enabled = val;
+
+	return 0;
+}
+
+static int dummy_set_vf_trust(struct net_device *dev, int vf, bool val)
+{
+	struct dummy_priv *priv = netdev_priv(dev);
+
+	if (vf >= priv->num_vfs)
+		return -EINVAL;
+
+	priv->vfinfo[vf].trusted = val;
+
+	return 0;
+}
+
+static int dummy_get_vf_config(struct net_device *dev,
+			       int vf, struct ifla_vf_info *ivi)
+{
+	struct dummy_priv *priv = netdev_priv(dev);
+
+	if (vf >= priv->num_vfs)
+		return -EINVAL;
+
+	ivi->vf = vf;
+	memcpy(&ivi->mac, priv->vfinfo[vf].vf_mac, ETH_ALEN);
+	ivi->vlan = priv->vfinfo[vf].pf_vlan;
+	ivi->qos = priv->vfinfo[vf].pf_qos;
+	ivi->spoofchk = priv->vfinfo[vf].spoofchk_enabled;
+	ivi->linkstate = priv->vfinfo[vf].link_state;
+	ivi->min_tx_rate = priv->vfinfo[vf].min_tx_rate;
+	ivi->max_tx_rate = priv->vfinfo[vf].max_tx_rate;
+	ivi->rss_query_en = priv->vfinfo[vf].rss_query_enabled;
+	ivi->trusted = priv->vfinfo[vf].trusted;
+	ivi->vlan_proto = priv->vfinfo[vf].vlan_proto;
+
+	return 0;
+}
+
+static int dummy_set_vf_link_state(struct net_device *dev, int vf, int state)
+{
+	struct dummy_priv *priv = netdev_priv(dev);
+
+	if (vf >= priv->num_vfs)
+		return -EINVAL;
+
+	priv->vfinfo[vf].link_state = state;
+
+	return 0;
+}
+
 static const struct net_device_ops dummy_netdev_ops = {
 	.ndo_init		= dummy_dev_init,
 	.ndo_uninit		= dummy_dev_uninit,
@@ -121,6 +291,15 @@ static const struct net_device_ops dummy_netdev_ops = {
 	.ndo_set_mac_address	= eth_mac_addr,
 	.ndo_get_stats64	= dummy_get_stats64,
 	.ndo_change_carrier	= dummy_change_carrier,
+	.ndo_fix_features	= dummy_fix_features,
+	.ndo_set_vf_mac		= dummy_set_vf_mac,
+	.ndo_set_vf_vlan	= dummy_set_vf_vlan,
+	.ndo_set_vf_rate	= dummy_set_vf_rate,
+	.ndo_set_vf_spoofchk	= dummy_set_vf_spoofchk,
+	.ndo_set_vf_trust	= dummy_set_vf_trust,
+	.ndo_get_vf_config	= dummy_get_vf_config,
+	.ndo_set_vf_link_state	= dummy_set_vf_link_state,
+	.ndo_set_vf_rss_query_en = dummy_set_vf_rss_query_en,
 };
 
 static void dummy_get_drvinfo(struct net_device *dev,
@@ -134,6 +313,14 @@ static const struct ethtool_ops dummy_ethtool_ops = {
 	.get_drvinfo            = dummy_get_drvinfo,
 };
 
+static void dummy_free_netdev(struct net_device *dev)
+{
+	struct dummy_priv *priv = netdev_priv(dev);
+
+	kfree(priv->vfinfo);
+	free_netdev(dev);
+}
+
 static void dummy_setup(struct net_device *dev)
 {
 	ether_setup(dev);
@@ -141,7 +328,7 @@ static void dummy_setup(struct net_device *dev)
 	/* Initialize the device structure. */
 	dev->netdev_ops = &dummy_netdev_ops;
 	dev->ethtool_ops = &dummy_ethtool_ops;
-	dev->destructor = free_netdev;
+	dev->destructor = dummy_free_netdev;
 
 	/* Fill in device structure with ethernet-generic values. */
 	dev->flags |= IFF_NOARP;
@@ -169,6 +356,7 @@ static int dummy_validate(struct nlattr *tb[], struct nlattr *data[])
 
 static struct rtnl_link_ops dummy_link_ops __read_mostly = {
 	.kind		= DRV_NAME,
+	.priv_size	= sizeof(struct dummy_priv),
 	.setup		= dummy_setup,
 	.validate	= dummy_validate,
 };
@@ -177,12 +365,16 @@ static struct rtnl_link_ops dummy_link_ops __read_mostly = {
 module_param(numdummies, int, 0);
 MODULE_PARM_DESC(numdummies, "Number of dummy pseudo devices");
 
+module_param(num_vfs, int, 0);
+MODULE_PARM_DESC(num_vfs, "Number of dummy VFs per dummy device");
+
 static int __init dummy_init_one(void)
 {
 	struct net_device *dev_dummy;
 	int err;
 
-	dev_dummy = alloc_netdev(0, "dummy%d", NET_NAME_UNKNOWN, dummy_setup);
+	dev_dummy = alloc_netdev(sizeof(struct dummy_priv),
+				 "dummy%d", NET_NAME_UNKNOWN, dummy_setup);
 	if (!dev_dummy)
 		return -ENOMEM;
 
@@ -190,6 +382,7 @@ static int __init dummy_init_one(void)
 	err = register_netdevice(dev_dummy);
 	if (err < 0)
 		goto err;
+
 	return 0;
 
 err:
@@ -201,6 +394,8 @@ static int __init dummy_init_module(void)
 {
 	int i, err = 0;
 
+	pdev_sriov.num_VFs = num_vfs;
+
 	rtnl_lock();
 	err = __rtnl_link_register(&dummy_link_ops);
 	if (err < 0)
-- 
2.10.0

^ permalink raw reply related

* Re: Debugging Ethernet issues
From: Sebastian Frias @ 2016-11-14 13:03 UTC (permalink / raw)
  To: Florian Fainelli, Mason, Andrew Lunn
  Cc: netdev, Mans Rullgard, Sergei Shtylyov, Tom Lendacky, Zach Brown,
	Shaohui Xie, Tim Beale, Brian Hill, Vince Bridgers,
	Balakumaran Kannan, David S. Miller, Kirill Kapranov
In-Reply-To: <9d1f28a7-959b-fdde-3403-f6da5f521125@gmail.com>

On 11/13/2016 08:55 PM, Florian Fainelli wrote:
> Le 13/11/2016 à 11:51, Mason a écrit :
>> On 13/11/2016 04:09, Andrew Lunn wrote:
>>
>>> Mason wrote:
>>>
>>>> When connected to a Gigabit switch
>>>> 3.4 negotiates a LAN DHCP setup instantly
>>>> 4.7 requires over 5 seconds to do so
>>>
>>> When you run tcpdump on the DHCP server, are you noticing the first
>>> request is missing?
>>>
>>> What can happen is the dhclient gets started immediately and sends out
>>> its first request before auto-negotiation has finished. So this first packet
>>> gets lost. The retransmit after a few seconds is then successful.
>>
>> I will run tcpdump on the server as I run udhcpc on the client
>> for Linux 3.4 vs 4.7
>>
>> Do you know what would make auto-negotiation fail at 100 Mbps
>> on 4.7? (whereas it succeeds on 3.4)
>>
>> (Thinking out loud) If the problem were in auto-negotiation,
>> then if should work if I hard-code speed and duplex using
>> ethtool, right? (IIRC, hard-coding doesn't help.)
> 
> I would start with checking basic things:
> 
> - does your Ethernet driver get a link UP being reported correctly
> (netif_carrier_ok returns 1)?
> - if you let the bootloader configure the PHY and utilize the Generic
> PHY driver instead of the Atheros PHY driver, does the problem appear as
> well?

Would using a "fixed-link" serve the same?

It appears that using a fixed-link

&eth0 {
	#address-cells = <1>;
	#size-cells = <0>;

#ifdef WITH_FIXED_LINK
	phy-connection-type = "rgmii";

	fixed-link {
		   speed = <100>;
		   full-duplex;
	};
#else
	phy-connection-type = "rgmii";
	phy-handle = <&eth0_phy>;

	/* Atheros AR8035 */
	eth0_phy: ethernet-phy@4 {
		interrupt-parent = <&irq0>;
		compatible = "ethernet-phy-id004d.d072",
			     "ethernet-phy-ieee802.3-c22";
		interrupts = <37 IRQ_TYPE_EDGE_RISING>;
		reg = <4>;
	};
#endif
};

works.


----

For what is worth, the patch that Mason was talking about earlier
in the thread:

  "...After much hair-pulling, it turned out that *some* of the breakage
was caused by a local patch..."

was setting changing the following delay in 'drivers/net/phy/phy.c:phy_state_machine()'

	/* Only re-schedule a PHY state machine change if we are polling the
	 * PHY, if PHY_IGNORE_INTERRUPT is set, then we will be moving
	 * between states from phy_mac_interrupt()
	 */
	if (phydev->irq == PHY_POLL)
		queue_delayed_work(system_power_efficient_wq, &phydev->state_queue,
				   PHY_STATE_TIME * HZ);

from "PHY_STATE_TIME * HZ" to "0".

That caused 2 of 3 types of boards to fail, while one of them always worked
regardless of the delay.

In a nutshell:
- Board A, chip X: works with delay "PHY_STATE_TIME * HZ" or "0".
- Board B, chip X: does not work with delay "0"
- Board C, chip Y: does not work with delay "0"

Does board A works by "luck" when this delay is "0"?
(this delay has always been there, but it is not clear why)

> - what do transmit/receive counters on the Ethernet driver/MAC return?
> 

^ permalink raw reply

* Re: Is Documentation/networking/phy.txt still up-to-date?
From: Sebastian Frias @ 2016-11-14 13:18 UTC (permalink / raw)
  To: Florian Fainelli, afleming, jgarzik, Måns Rullgård
  Cc: netdev, LKML, David S. Miller, Mason
In-Reply-To: <24676d24-5b07-7d26-6ffa-14e857960a52@gmail.com>

On 11/09/2016 06:07 PM, Florian Fainelli wrote:
> On 11/09/2016 05:24 AM, Sebastian Frias wrote:
>> Hi,
>>
>> Documentation/networking/phy.txt discusses phy_connect and states that:
>>
>>  "...
>>
>>  interface is a u32 which specifies the connection type used
>>  between the controller and the PHY.  Examples are GMII, MII,
>>  RGMII, and SGMII.  For a full list, see include/linux/phy.h
>>
>>  Now just make sure that phydev->supported and phydev->advertising have any
>>  values pruned from them which don't make sense for your controller (a 10/100
>>  controller may be connected to a gigabit capable PHY, so you would need to
>>  mask off SUPPORTED_1000baseT*).  See include/linux/ethtool.h for definitions
>>  for these bitfields. Note that you should not SET any bits, or the PHY may
>>  get put into an unsupported state.
>>
>>  ..."
>>
>> However, 'drivers/net/ethernet/aurora/nb8800.c' for example, does SETs some
>> bits (in function 'nb8800_pause_adv').
> 
> All pause/flow control related bits should be set by the Ethernet MAC
> driver because this is an Ethernet MAC, not PHY, thing. See this
> discussion for some details:
> 
> https://www.mail-archive.com/netdev@vger.kernel.org/msg135347.html
> 
> So the nb8800 drivers does the correct thing here, but the documentation
> should be updated to reflect that this applies to all bits, except the
> Pause capabilities because these need to come from the Ethernet MAC.
> 

Ok, thanks.

>>
>> I checked 'drivers/net/ethernet/broadcom/genet/bcmmii.c' and that one CLEARs
>> bits (as per the documentation).
>>
>> Does anybody knows what is the correct/recommended approach?
> 
> Both drivers do correct things, they just don't set the same things here.
> 

Thanks!

^ permalink raw reply

* [PATCH 1/2] net: arc_emac: annonce IFF_MULTICAST support
From: Alexander Kochetkov @ 2016-11-14 13:20 UTC (permalink / raw)
  To: netdev, linux-kernel, davem, tremyfr, peter.chen, wxt, weiyj.lk
  Cc: Alexander Kochetkov

Multicast support was implemented by commit 775dd682e2b0ec7
('arc_emac: implement promiscuous mode and multicast filtering').

It can be enabled explicity using 'ifconfig eth0 multicast'.
The patch is needed in order to remove explicit configuration
as most devices has multicast mode enabled by default.
---
 drivers/net/ethernet/arc/emac_main.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/arc/emac_main.c b/drivers/net/ethernet/arc/emac_main.c
index b0da969..2e4ee86 100644
--- a/drivers/net/ethernet/arc/emac_main.c
+++ b/drivers/net/ethernet/arc/emac_main.c
@@ -764,8 +764,6 @@ int arc_emac_probe(struct net_device *ndev, int interface)
 	ndev->netdev_ops = &arc_emac_netdev_ops;
 	ndev->ethtool_ops = &arc_emac_ethtool_ops;
 	ndev->watchdog_timeo = TX_TIMEOUT;
-	/* FIXME :: no multicast support yet */
-	ndev->flags &= ~IFF_MULTICAST;
 
 	priv = netdev_priv(ndev);
 	priv->dev = dev;
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH 2/2] net: arc_emac: don't pass multicast packets to kernel in non-multicast mode
From: Alexander Kochetkov @ 2016-11-14 13:20 UTC (permalink / raw)
  To: netdev, linux-kernel, davem, tremyfr, peter.chen, wxt, weiyj.lk
  Cc: Alexander Kochetkov
In-Reply-To: <1479129627-27524-1-git-send-email-al.kochet@gmail.com>

The patch disable capturing multicast packets when multicast mode
disabled for ethernet ('ifconfig eth0 -multicast'). In that case
no multicast packet will be passed to kernel.
---
 drivers/net/ethernet/arc/emac_main.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/arc/emac_main.c b/drivers/net/ethernet/arc/emac_main.c
index 2e4ee86..30ad833 100644
--- a/drivers/net/ethernet/arc/emac_main.c
+++ b/drivers/net/ethernet/arc/emac_main.c
@@ -460,7 +460,7 @@ static void arc_emac_set_rx_mode(struct net_device *ndev)
 		if (ndev->flags & IFF_ALLMULTI) {
 			arc_reg_set(priv, R_LAFL, ~0);
 			arc_reg_set(priv, R_LAFH, ~0);
-		} else {
+		} else if (ndev->flags & IFF_MULTICAST) {
 			struct netdev_hw_addr *ha;
 			unsigned int filter[2] = { 0, 0 };
 			int bit;
@@ -469,9 +469,11 @@ static void arc_emac_set_rx_mode(struct net_device *ndev)
 				bit = ether_crc_le(ETH_ALEN, ha->addr) >> 26;
 				filter[bit >> 5] |= 1 << (bit & 31);
 			}
-
 			arc_reg_set(priv, R_LAFL, filter[0]);
 			arc_reg_set(priv, R_LAFH, filter[1]);
+		} else {
+			arc_reg_set(priv, R_LAFL, 0);
+			arc_reg_set(priv, R_LAFH, 0);
 		}
 	}
 }
-- 
1.7.9.5

^ permalink raw reply related

* Re: [PATCH 1/2] net: ethernet: nb8800: Do not apply TX delay at MAC level
From: Sebastian Frias @ 2016-11-14 13:22 UTC (permalink / raw)
  To: Florian Fainelli, Måns Rullgård
  Cc: David S. Miller, netdev, LKML, Mason, Andrew Lunn
In-Reply-To: <0fc519a7-6e42-de55-7a4a-c7dc9d64c5db@gmail.com>

On 11/09/2016 06:03 PM, Florian Fainelli wrote:
> On 11/09/2016 05:02 AM, Sebastian Frias wrote:
>> On 11/04/2016 05:49 PM, Måns Rullgård wrote:
>>>>> But when doing so, both the Atheros 8035 and the Aurora NB8800 drivers
>>>>> will apply the delay.
>>>>>
>>>>> I think a better way of dealing with this is that both, PHY and MAC
>>>>> drivers exchange information so that the delay is applied only once.
>>>>
>>>> Exchange what information? The PHY device interface (phydev->interface)
>>>> conveys the needed information for both entities.
>>>
>>> There doesn't seem to be any consensus among the drivers regarding where
>>> the delay should be applied.  Since only a few drivers, MAC or PHY, act
>>> on this property, most combinations still work by chance.  It is common
>>> for boards to set the delay at the PHY using external config pins so no
>>> software setup is required (although I have one Sigma based board that
>>> gets this wrong).  I suspect if drivers/net/ethernet/broadcom/genet were
>>> used with one of the four PHY drivers that also set the delay based on
>>> this DT property, things would go wrong.
>>>
>>
>> Exactly, what about a patch like (I can make a formal submission, even
>> merge it with the patch discussed in this thread, consider this a RFC):
> 
> I really don't see a point in doing this when we can just clarify what
> phydev->interface does and already have the knowledge that we need
> without introducing additional flags in the phy driver.
> 

Ok, so who can clarify what "phydev->interface" does, especially in the
context of this discussion?

What happens when a TX delay must be applied and:
  - both the PHY and the MAC support the delay
  - only the PHY supports the delay
  - only the MAC supports the delay

Best regards,

Sebastian

^ permalink raw reply

* Re: Debugging Ethernet issues
From: Mason @ 2016-11-14 13:28 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn
  Cc: netdev, Mans Rullgard, Sergei Shtylyov, Tom Lendacky, Zach Brown,
	Shaohui Xie, Tim Beale, Brian Hill, Vince Bridgers,
	Balakumaran Kannan, David S. Miller, Sebastian Frias,
	Kirill Kapranov
In-Reply-To: <9d1f28a7-959b-fdde-3403-f6da5f521125@gmail.com>

On 13/11/2016 20:55, Florian Fainelli wrote:
> Le 13/11/2016 à 11:51, Mason a écrit :
>> On 13/11/2016 04:09, Andrew Lunn wrote:
>>
>>> Mason wrote:
>>>
>>>> When connected to a Gigabit switch
>>>> 3.4 negotiates a LAN DHCP setup instantly
>>>> 4.7 requires over 5 seconds to do so
>>>
>>> When you run tcpdump on the DHCP server, are you noticing the first
>>> request is missing?
>>>
>>> What can happen is the dhclient gets started immediately and sends out
>>> its first request before auto-negotiation has finished. So this first packet
>>> gets lost. The retransmit after a few seconds is then successful.
>>
>> I will run tcpdump on the server as I run udhcpc on the client
>> for Linux 3.4 vs 4.7
>>
>> Do you know what would make auto-negotiation fail at 100 Mbps
>> on 4.7? (whereas it succeeds on 3.4)
>>
>> (Thinking out loud) If the problem were in auto-negotiation,
>> then if should work if I hard-code speed and duplex using
>> ethtool, right? (IIRC, hard-coding doesn't help.)
> 
> I would start with checking basic things:
> 
> - does your Ethernet driver get a link UP being reported correctly
> (netif_carrier_ok returns 1)?

I don't see any calls to netif_carrier_ok() in the network driver
( drivers/net/ethernet/aurora/nb8800.c )
Maybe it is using some generic infrastructure?

> - if you let the bootloader configure the PHY and utilize the Generic
> PHY driver instead of the Atheros PHY driver, does the problem appear as
> well?

How exactly does one use the generic PHY driver?
Do you mean the following?
Documentation/devicetree/bindings/net/fixed-link.txt

Regards.

^ permalink raw reply

* [PATCHv2 1/2] net: arc_emac: annonce IFF_MULTICAST support
From: Alexander Kochetkov @ 2016-11-14 13:32 UTC (permalink / raw)
  To: netdev, linux-kernel, davem, tremyfr, peter.chen, wxt, weiyj.lk
  Cc: Alexander Kochetkov
In-Reply-To: <1479129627-27524-2-git-send-email-al.kochet@gmail.com>

Multicast support was implemented by commit 775dd682e2b0ec7
('arc_emac: implement promiscuous mode and multicast filtering').

It can be enabled explicity using 'ifconfig eth0 multicast'.
The patch is needed in order to remove explicit configuration
as most devices has multicast mode enabled by default.

Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
---
Changes in v2:
Add Signed-off-by text.

 drivers/net/ethernet/arc/emac_main.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/arc/emac_main.c b/drivers/net/ethernet/arc/emac_main.c
index b0da969..2e4ee86 100644
--- a/drivers/net/ethernet/arc/emac_main.c
+++ b/drivers/net/ethernet/arc/emac_main.c
@@ -764,8 +764,6 @@ int arc_emac_probe(struct net_device *ndev, int interface)
 	ndev->netdev_ops = &arc_emac_netdev_ops;
 	ndev->ethtool_ops = &arc_emac_ethtool_ops;
 	ndev->watchdog_timeo = TX_TIMEOUT;
-	/* FIXME :: no multicast support yet */
-	ndev->flags &= ~IFF_MULTICAST;
 
 	priv = netdev_priv(ndev);
 	priv->dev = dev;
-- 
1.7.9.5

^ permalink raw reply related

* [PATCHv2 2/2] net: arc_emac: don't pass multicast packets to kernel in non-multicast mode
From: Alexander Kochetkov @ 2016-11-14 13:32 UTC (permalink / raw)
  To: netdev, linux-kernel, davem, tremyfr, peter.chen, wxt, weiyj.lk
  Cc: Alexander Kochetkov
In-Reply-To: <1479130373-28077-1-git-send-email-al.kochet@gmail.com>

The patch disable capturing multicast packets when multicast mode
disabled for ethernet ('ifconfig eth0 -multicast'). In that case
no multicast packet will be passed to kernel.

Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
---
Changes in v2:
Add Signed-off-by text.
Removed deleted line from patch.

 drivers/net/ethernet/arc/emac_main.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/arc/emac_main.c b/drivers/net/ethernet/arc/emac_main.c
index 2e4ee86..be865b4 100644
--- a/drivers/net/ethernet/arc/emac_main.c
+++ b/drivers/net/ethernet/arc/emac_main.c
@@ -460,7 +460,7 @@ static void arc_emac_set_rx_mode(struct net_device *ndev)
 		if (ndev->flags & IFF_ALLMULTI) {
 			arc_reg_set(priv, R_LAFL, ~0);
 			arc_reg_set(priv, R_LAFH, ~0);
-		} else {
+		} else if (ndev->flags & IFF_MULTICAST) {
 			struct netdev_hw_addr *ha;
 			unsigned int filter[2] = { 0, 0 };
 			int bit;
@@ -472,6 +472,9 @@ static void arc_emac_set_rx_mode(struct net_device *ndev)
 
 			arc_reg_set(priv, R_LAFL, filter[0]);
 			arc_reg_set(priv, R_LAFH, filter[1]);
+		} else {
+			arc_reg_set(priv, R_LAFL, 0);
+			arc_reg_set(priv, R_LAFH, 0);
 		}
 	}
 }
-- 
1.7.9.5

^ permalink raw reply related

* Re: Debugging Ethernet issues
From: Andrew Lunn @ 2016-11-14 13:34 UTC (permalink / raw)
  To: Mason
  Cc: Florian Fainelli, netdev, Mans Rullgard, Sergei Shtylyov,
	Tom Lendacky, Zach Brown, Shaohui Xie, Tim Beale, Brian Hill,
	Vince Bridgers, Balakumaran Kannan, David S. Miller,
	Sebastian Frias, Kirill Kapranov
In-Reply-To: <5829BBFC.7040800@free.fr>

> How exactly does one use the generic PHY driver?

If there is no specific driver available for the PHY, linux will fall
back to the generic driver. So just don't compile the specific driver.
You can see under /sys/bus/mdio_bus/devices which driver is being
used.

	Andrew

^ permalink raw reply

* Re: [PATCH net-next v2 1/5] ethtool: (uapi) Add ETHTOOL_PHY_GTUNABLE and ETHTOOL_PHY_STUNABLE
From: Andrew Lunn @ 2016-11-14 13:36 UTC (permalink / raw)
  To: Allan W. Nielsen; +Cc: netdev, raju.lakkaraju
In-Reply-To: <1479115760-1182-2-git-send-email-allan.nielsen@microsemi.com>

On Mon, Nov 14, 2016 at 10:29:16AM +0100, Allan W. Nielsen wrote:
> From: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>
> 
> Defines a generic API to get/set phy tunables. The API is using the
> existing ethtool_tunable/tunable_type_id types which is already being used
> for mac level tunables.
> 
> Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH net-next v2 2/5] ethtool: Implements ETHTOOL_PHY_GTUNABLE/ETHTOOL_PHY_STUNABLE
From: Andrew Lunn @ 2016-11-14 13:40 UTC (permalink / raw)
  To: Allan W. Nielsen; +Cc: netdev, raju.lakkaraju
In-Reply-To: <1479115760-1182-3-git-send-email-allan.nielsen@microsemi.com>

On Mon, Nov 14, 2016 at 10:29:17AM +0100, Allan W. Nielsen wrote:
> From: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>
> 
> Adding get_tunable/set_tunable function pointer to the phy_driver
> structure, and uses these function pointers to implement the
> ETHTOOL_PHY_GTUNABLE/ETHTOOL_PHY_STUNABLE ioctls.
> 
> Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH net-next v2 3/5] ethtool: (uapi) Add ETHTOOL_PHY_DOWNSHIFT to PHY tunables
From: Andrew Lunn @ 2016-11-14 13:42 UTC (permalink / raw)
  To: Allan W. Nielsen; +Cc: netdev, raju.lakkaraju
In-Reply-To: <1479115760-1182-4-git-send-email-allan.nielsen@microsemi.com>

On Mon, Nov 14, 2016 at 10:29:18AM +0100, Allan W. Nielsen wrote:
> From: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>
> 
> For operation in cabling environments that are incompatible with
> 1000BAST-T, PHY device may provide an automatic link speed downshift
> operation. When enabled, the device automatically changes its 1000BAST-T
> auto-negotiation to the next slower speed after a configured number of
> failed attempts at 1000BAST-T.  This feature is useful in setting up in
> networks using older cable installations that include only pairs A and B,
> and not pairs C and D.

Hi Allan

A nitpick:

s/BAST/BASE/g

Otherwise, this is good.

	   Andrew

^ permalink raw reply

* Re: [PATCH net-next v2 4/5] ethtool: Core impl for ETHTOOL_PHY_DOWNSHIFT tunable
From: Andrew Lunn @ 2016-11-14 13:43 UTC (permalink / raw)
  To: Allan W. Nielsen; +Cc: netdev, raju.lakkaraju
In-Reply-To: <1479115760-1182-5-git-send-email-allan.nielsen@microsemi.com>

On Mon, Nov 14, 2016 at 10:29:19AM +0100, Allan W. Nielsen wrote:
> From: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>
> 
> Adding validation support for the ETHTOOL_PHY_DOWNSHIFT. Functional
> implementation needs to be done in the individual PHY drivers.
> 
> Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH net-next v2 5/5] net: phy: Add downshift get/set support in Microsemi PHYs driver
From: Andrew Lunn @ 2016-11-14 13:48 UTC (permalink / raw)
  To: Allan W. Nielsen; +Cc: netdev, raju.lakkaraju
In-Reply-To: <1479115760-1182-6-git-send-email-allan.nielsen@microsemi.com>

On Mon, Nov 14, 2016 at 10:29:20AM +0100, Allan W. Nielsen wrote:
> From: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>
> 
> Implements the phy tunable function pointers and implement downshift
> functionality for MSCC PHYs.
> 
> Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH net-next v2 0/5] Adding PHY-Tunables and downshift support
From: Andrew Lunn @ 2016-11-14 13:49 UTC (permalink / raw)
  To: Allan W. Nielsen; +Cc: netdev, raju.lakkaraju
In-Reply-To: <1479115760-1182-1-git-send-email-allan.nielsen@microsemi.com>

1;4600;0cOn Mon, Nov 14, 2016 at 10:29:15AM +0100, Allan W. Nielsen wrote:
> Hi,
> 
> This is a follow-up on the patch series posted at Fri, 4 Nov 2016 10:55:25
> +0100 (old cover letter included below).

Hi Allan

Pretty good. Just one minor nitpick to fix.

       Andrew

^ permalink raw reply

* Re: [net-next PATCH v2] net: dummy: Introduce dummy virtual functions
From: Eric Dumazet @ 2016-11-14 14:23 UTC (permalink / raw)
  To: Phil Sutter; +Cc: David Miller, netdev, Sabrina Dubroca
In-Reply-To: <20161114130205.18333-1-phil@nwl.cc>

On Mon, 2016-11-14 at 14:02 +0100, Phil Sutter wrote:
> The idea for this was born when testing VF support in iproute2 which was
> impeded by hardware requirements. In fact, not every VF-capable hardware
> driver implements all netdev ops, so testing the interface is still hard
> to do even with a well-sorted hardware shelf.
> 
...
>  static int dummy_dev_init(struct net_device *dev)
>  {
> +	struct dummy_priv *priv = netdev_priv(dev);
> +
>  	dev->dstats = netdev_alloc_pcpu_stats(struct pcpu_dstats);
>  	if (!dev->dstats)
>  		return -ENOMEM;
>  
> +	priv->num_vfs = num_vfs;
> +	priv->vfinfo = NULL;
> +
> +	if (!num_vfs)
> +		return 0;
> +
> +	priv->vfinfo = kcalloc(num_vfs, sizeof(struct vf_data_storage),
> +			       GFP_KERNEL);
> +	if (!priv->vfinfo)
> +		return -ENOMEM;

You must free dev->dstats here, otherwise kernel memory will leak. 

> +
>  	return 0;
>  }
>  

^ permalink raw reply

* Re: [PATCH net-next v2] ipv6: sr: fix IPv6 initialization failure without lwtunnels
From: Roopa Prabhu @ 2016-11-14 14:22 UTC (permalink / raw)
  To: David Lebrun; +Cc: David Miller, netdev, lorenzo
In-Reply-To: <5828C619.2020008@uclouvain.be>

On 11/13/16, 11:59 AM, David Lebrun wrote:
> On 11/13/2016 06:23 AM, David Miller wrote:
>> This seems like such a huge mess, quite frankly.
>>
>> IPV6-SR has so many strange dependencies, a weird Kconfig option that is
>> simply controlling what a responsible sysadmin should be allow to do if
>> he chooses anyways.
>>
>> Every distribution is going to say "¯\_(ツ)_/¯" and just turn the thing
>> on in their builds.
> Indeed, the issue is that seg6_iptunnel.o was included in obj-y instead
> of ipv6-y, triggering the bug when CONFIG_IPV6=m. Fixed with the
> following modification to the patch (tested with allyesconfig and
> allmodconfig):
>
> diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
> index 8979d53..a233136 100644
> --- a/net/ipv6/Makefile
> +++ b/net/ipv6/Makefile
> @@ -53,6 +53,6 @@ obj-$(subst m,y,$(CONFIG_IPV6)) += inet6_hashtables.o
>
>  ifneq ($(CONFIG_IPV6),)
>  obj-$(CONFIG_NET_UDP_TUNNEL) += ip6_udp_tunnel.o
> -obj-$(CONFIG_LWTUNNEL) += seg6_iptunnel.o
> +ipv6-$(CONFIG_LWTUNNEL) += seg6_iptunnel.o
>  obj-y += mcast_snoop.o
>  endif


>
> I agree with you that the way to combine the dependencies is strange,
> even if they are very few. The part of the IPv6-SR patch that is enabled
> by default depends on two things: IPV6 and LWTUNNEL. The problem is that
> LWTUNNEL does not depend on IPV6 and is not necessarily enabled. To fix
> the bug reported by Lorenzo, I propose to select one the three following
> solutions:
>
> 1. Make LWTUNNEL always enabled (removing the option).
>    Pros: remove an option
>    Cons: add always-enabled code
>
> 2. Create an option IPV6_SEG6_LWTUNNEL, which would select LWTUNNEL and
> enable the compilation of seg6_iptunnel.o.
>    Pros: logically dissociate the part of IPv6-SR that depends on
> LWTUNNEL from the core patch and simplifies compilation
>    Cons: add an option

I prefer option b). most LWTUNNEL encaps are done this way.

seg6 and seg6_iptunnel is new segment routing code and can be under CONFIG_IPV6_SEG6 which depends on CONFIG_LWTUNNEL and CONFIG_IPV6. CONFIG_IPV6_SEG6_HMAC could then depend on CONFIG_IPV6_SEG6



>
> 3. Apply the proposed patch with the fix
>    Pros: do not modify options
>    Cons: weird conditional compilation
>
> What do you think ?
>
> David
>

^ permalink raw reply

* Re: [PATCH net-next v5] cadence: Add LSO support.
From: Eric Dumazet @ 2016-11-14 14:35 UTC (permalink / raw)
  To: Rafal Ozieblo
  Cc: David Miller, nicolas.ferre@atmel.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <BN3PR07MB2516C30913D2FB8D4788071AC9BC0@BN3PR07MB2516.namprd07.prod.outlook.com>

On Mon, 2016-11-14 at 09:32 +0000, Rafal Ozieblo wrote:
> From: David Miller [mailto:davem@davemloft.net] 
> Sent: 10 listopada 2016 18:01
> To: Rafal Ozieblo
> Cc: nicolas.ferre@atmel.com; netdev@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH net-next v5]] cadence: Add LSO support.
> 
> From: Rafal Ozieblo <rafalo@cadence.com>
> Date: Wed, 9 Nov 2016 13:41:02 +0000
> 
> > First, please remove the spurious closing bracket in your Subject line in future submittions.
> >
> >> +	if (is_udp) /* is_udp is only set when (is_lso) is checked */
> >> +		/* zero UDP checksum, not calculated by h/w for UFO */
> >> +		udp_hdr(skb)->check = 0;
> >
> > This is really not ok.
> > 
> > If UFO is in use it should not silently disable UDP checksums.
> > 
> > If you cannot support UFO with proper checksumming, then you cannot enable support for that feature.
> 
> According Cadence Gigabit Ethernet MAC documentation:
> 
> "Hardware will not calculate the UDP checksum or modify the UDP checksum field. Therefore software
> must set a value of zero in the checksum field in the UDP header (in the first payload buffer)
> to indicate to the receiver that the UDP datagram does not include a checksum."
> 
> It is hardware requirement.

Then do not claim NETIF_F_UFO suport in the driver, if hardware is
absolutely not capable of handling this.

Linux stack will provide proper udp checksum.

Almost no driver sets NETIF_F_UFO.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox