Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [RFT net-next v2 0/3] dwmac-meson8b: RGMII clock fixes for Meson8b
From: Emiliano Ingrassia @ 2017-12-28 17:51 UTC (permalink / raw)
  To: Martin Blumenstingl
  Cc: netdev, linus.luessing, khilman, linux-amlogic, jbrunet,
	Neil Armstrong, peppe.cavallaro, alexandre.torgue
In-Reply-To: <CAFBinCBMN4Kkai82ZaehytM11JR+TCcz2jUSLU5gbWz9w8PE6A@mail.gmail.com>

Hi Martin,

thank you for the quick response!

On Thu, Dec 28, 2017 at 05:58:34PM +0100, Martin Blumenstingl wrote:
> Hi Emiliano,
> 
> thank you for testing this!
> 
> On Thu, Dec 28, 2017 at 5:16 PM, Emiliano Ingrassia
> <ingrassia@epigenesys.com> wrote:
> > Hi Martin, Hi Dave,
> >
> > On Sun, Dec 24, 2017 at 12:40:57AM +0100, Martin Blumenstingl wrote:
> >> Hi Dave,
> >>
> >> please do not apply this series until it got a Tested-by from Emiliano.
> >>
> >>
> >> Hi Emiliano,
> >>
> >> you reported [0] that you couldn't get dwmac-meson8b to work on your
> >> Odroid-C1. With your findings (register dumps, clk_summary output, etc.)
> >> I think I was able to find a fix: it consists of two patches (which you
> >> find in this series)
> >>
> >> Unfortunately I don't have any Meson8b boards with RGMII PHY so I could
> >> only partially test this (I could only check if the clocks were
> >> calculated correctly when using a dummy 500002394Hz input clock instead
> >> of MPLL2).
> >>
> >> Could you please give this series a try and let me know about the
> >> results?
> >> You obviously still need your two "ARM: dts: meson8b" patches which
> >> - add the amlogic,meson8b-dwmac" compatible to meson8b.dtsi
> >> - enable Ethernet on the Odroid-C1
> >>
> >> I have tested this myself on a Khadas VIM (GXL SoC, internal RMII PHY)
> >> and a Khadas VIM2 (GXM SoC, external RGMII PHY). Both are still working
> >> fine (so let's hope that this also fixes your Meson8b issue :)).
> >>
> >>
> >> changes since v1 at [1]:
> >> - changed the subject of the cover-letter to indicate that this is all
> >>   about the RGMII clock
> >> - added PATCH #1 which ensures that we don't unnecessarily change the
> >>   parent clocks in RMII mode (and also makes the code easier to
> >>   understand)
> >> - changed subject of PATCH #2 (formerly PATCH #1) to state that this
> >>   is about the RGMII clock
> >> - added Jerome's Reviewed-by to PATCH #2 (formerly PATCH #1)
> >> - replaced PATCH #3 (formerly PATCH #2) with one that sets
> >>   CLK_SET_RATE_PARENT on the mux and thus re-configures the MPLL2 clock
> >>   on Meson8b correctly
> >>
> >
> > Really thank you for your help and effort. I tried your patch but
> > unfortunately it didn't solve the problem.
> this is probably my fault: I forgot to mention that it requires a fix
> for the 32-bit SoCs in the clock driver ("clk: meson: mpll: use 64-bit
> maths in params_from_rate", see [0]) to work properly
>

Ok, with that patch applied I got:

xtal                               1            1    24000000          0 0
 sys_pll                           0            0  1200000000          0 0
  cpu_clk                          0            0  1200000000          0 0
 vid_pll                           0            0   732000000          0 0
 fixed_pll                         2            2  2550000000          0 0
  mpll2                            1            1   124999851          0 0
   c9410000.ethernet#m250_sel      1            1   124999851          0 0
    c9410000.ethernet#m250_div     1            1   124999851          0 0
     c9410000.ethernet#m25_div     1            1    24999971          0 0

which is equal to your result. However, the ethernet is still not working.
The prg0 register is set to 0x70A1.

A problem that I see with this solution is that MPLL2 is set to ~125 MHz.
The S805 SoC manual reports that bits 9-7 should contain a value x such
that: MPLL2 = 250 MHz * x (with x >= 1).
In our case, bits 9-7 are set to 1 which is incorrect.
I think that MPLL2 should be 250 MHz at least.

> >
> > Here is the new clk_summary:
> >
> > xtal                            1            1    24000000          0 0
> >  sys_pll                        0            0  1200000000          0 0
> >   cpu_clk                       0            0  1200000000          0 0
> >  vid_pll                        0            0   732000000          0 0
> >  fixed_pll                      2            2  2550000000          0 0
> >   mpll2                         1            1   106250000          0 0
> >    c9410000.ethernet#m250_sel   1            1   106250000          0 0
> >     c9410000.ethernet#m250_div  1            1   106250000          0 0
> >      c9410000.ethernet#m25_div  1            1    21250000          0 0
> >
> > which leads to a value of 0x70a1 in the prg0 ethernet register.
> > As you can see, something is changed but the RGMII clock is not at 25 MHz.
> > In particular, the bit 10 of prg0, which enables the "generation of 25 MHz
> > clock for PHY" (see S805 manual), is 0.
> assuming that the description in the datasheet is correct
> after Kevin and Mike got updated information from Amlogic about the
> PRG_ETHERNET0 register documenation (see [1]) we thought that bit 10
> means "0 = divide by 5, 1 = divide by 10" (see [2]). I didn't question
> this so far, but I'll test this on a newer SoC later (by forcing
> m250_div to 125MHz, then m25_div will have register value 0 = divide
> by 5)
> 
> if the description from the documentation is correct then we need to
> replace m25_div with a fixed-factor clock (mult = 1, div = 5) and make
> it a m25_en gate clock instead
> the resulting clock path would look like this: mpll2 > m250_sel >
> m250_div > fixed_factor > m25_en
> 
> > Please, if you have other suggestions let me know.
> could you please re-test this with the patch from [0] applied? no
> other changes should be needed!
> 
> > Best regards,
> >
> > Emiliano
> >
> >>
> >> [0] http://lists.infradead.org/pipermail/linux-amlogic/2017-December/005596.html
> >> [1] http://lists.infradead.org/pipermail/linux-amlogic/2017-December/005848.html
> >>
> >>
> >> Martin Blumenstingl (3):
> >>   net: stmmac: dwmac-meson8b: only configure the clocks in RGMII mode
> >>   net: stmmac: dwmac-meson8b: fix setting the RGMII clock on Meson8b
> >>   net: stmmac: dwmac-meson8b: propagate rate changes to the parent clock
> >>
> >>  .../net/ethernet/stmicro/stmmac/dwmac-meson8b.c    | 55 +++++++++++-----------
> >>  1 file changed, 27 insertions(+), 28 deletions(-)
> >>
> >> --
> >> 2.15.1
> >>
> 
> Regards
> Martin
> 

Thank you,

Emiliano

> 
> [0] https://patchwork.kernel.org/patch/10131677/
> [1] https://www.mail-archive.com/netdev@vger.kernel.org/msg119290.html
> [2] https://www.mail-archive.com/netdev@vger.kernel.org/msg119293.html

^ permalink raw reply

* Re: [PATCH v2 bpf-next 06/11] bpf: Add sock_ops RTO callback
From: Yuchung Cheng @ 2017-12-28 17:57 UTC (permalink / raw)
  To: Lawrence Brakmo
  Cc: netdev, Kernel Team, Blake Matheny, Alexei Starovoitov,
	Daniel Borkmann, Eric Dumazet, Neal Cardwell
In-Reply-To: <20171222012101.3899534-7-brakmo@fb.com>

On Thu, Dec 21, 2017 at 5:20 PM, Lawrence Brakmo <brakmo@fb.com> wrote:
>
> Adds an optional call to sock_ops BPF program based on whether the
> BPF_SOCK_OPS_RTO_CB_FLAG is set in bpf_sock_ops_flags.
> The BPF program is passed 2 arguments: icsk_retransmits and whether the
> RTO has expired.
>
> Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
> ---
>  include/uapi/linux/bpf.h | 5 +++++
>  include/uapi/linux/tcp.h | 3 +++
>  net/ipv4/tcp_timer.c     | 9 +++++++++
>  3 files changed, 17 insertions(+)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 62b2c89..3cf9014 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -995,6 +995,11 @@ enum {
>                                          * a congestion threshold. RTTs above
>                                          * this indicate congestion
>                                          */
> +       BPF_SOCK_OPS_RTO_CB,            /* Called when an RTO has triggered.
> +                                        * Arg1: value of icsk_retransmits
> +                                        * Arg2: value of icsk_rto
> +                                        * Arg3: whether RTO has expired
> +                                        */
>  };
>
>  #define TCP_BPF_IW             1001    /* Set TCP initial congestion window */
> diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
> index b4a4f64..089c19e 100644
> --- a/include/uapi/linux/tcp.h
> +++ b/include/uapi/linux/tcp.h
> @@ -259,6 +259,9 @@ struct tcp_md5sig {
>         __u8    tcpm_key[TCP_MD5SIG_MAXKEYLEN];         /* key (binary) */
>  };
>
> +/* Definitions for bpf_sock_ops_flags */
> +#define BPF_SOCK_OPS_RTO_CB_FLAG       (1<<0)
> +
>  /* INET_DIAG_MD5SIG */
>  struct tcp_diag_md5sig {
>         __u8    tcpm_family;
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 6db3124..f9c57e2 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -215,9 +215,18 @@ static int tcp_write_timeout(struct sock *sk)
>         tcp_fastopen_active_detect_blackhole(sk, expired);
can't we just call it here once w/ 'expired' as a parameter, instead
of duplicating the code?

>         if (expired) {
>                 /* Has it gone just too far? */
> +               if (BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_RTO_CB_FLAG))
> +                       tcp_call_bpf_3arg(sk, BPF_SOCK_OPS_RTO_CB,
> +                                         icsk->icsk_retransmits,
> +                                         icsk->icsk_rto, 1);
>                 tcp_write_err(sk);
>                 return 1;
>         }
> +
> +       if (BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_RTO_CB_FLAG))
> +               tcp_call_bpf_3arg(sk, BPF_SOCK_OPS_RTO_CB,
> +                                 icsk->icsk_retransmits,
> +                                 icsk->icsk_rto, 0);
>         return 0;
>  }
>
> --
> 2.9.5
>

^ permalink raw reply

* Re: [PATCH iproute2 1/3] vxcan,veth: Forbid "type" for peer device
From: Stephen Hemminger @ 2017-12-28 17:40 UTC (permalink / raw)
  To: Serhey Popovych; +Cc: netdev
In-Reply-To: <1514458864-17640-1-git-send-email-serhe.popovych@gmail.com>

On Thu, 28 Dec 2017 13:01:04 +0200
Serhey Popovych <serhe.popovych@gmail.com> wrote:

> It is already given for original device we configure this
> peer for.
> 
> Results from following command before/after change applied
> are shown below:
> 
>   $ ip link add dev veth1a type veth peer name veth1b \
>                            type veth peer name veth1c
> 
> Before:
> -------
> 
> <no output, no netdevs created>
> 
> After:
> ------
> 
> Error: duplicate "type": "veth" is the second value.
> 
> Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>

Applied this one. The other util patches have some issues

^ permalink raw reply

* Re: [PATCH iproute2 3/3] ip/tunnel: Document "external" parameter
From: Stephen Hemminger @ 2017-12-28 17:41 UTC (permalink / raw)
  To: Serhey Popovych; +Cc: netdev
In-Reply-To: <1514459502-20011-1-git-send-email-serhe.popovych@gmail.com>

On Thu, 28 Dec 2017 13:11:42 +0200
Serhey Popovych <serhe.popovych@gmail.com> wrote:

> Add it to ip-link(8) "type gre" output help message
> as well as to ip-link(8) page.
> 
> Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>

Applied. Thanks

^ permalink raw reply

* Re: [PATCH iproute2 3/3] utils: ll_map: Make network device name fixed size array of char
From: Serhey Popovych @ 2017-12-28 18:17 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20171228094532.4a653cc0@xeon-e3>


[-- Attachment #1.1: Type: text/plain, Size: 3721 bytes --]

Stephen Hemminger wrote:
> On Wed, 20 Dec 2017 09:37:31 +0200
> Serhey Popovych <serhe.popovych@gmail.com> wrote:
> 
>> Network device names are fixed in size and never exceed
>> IFNAMSIZ (16 bytes).
>>
>> Make name fixed size array to always malloc() same size chunk
>> of memory and use memcpy()/memcmp() with constant IFNAMSIZ
>> to benefit from possible compiler optimizations replacing
>> call to a function with two/four load/store instructions
>> on 64/32 bit systems.
>>
>> Check if IFLA_IFNAME attribute present in netlink message
>> (should always) and use strncpy() to pad name with zeros.
>>
>> Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
>> ---
>>  lib/ll_map.c |   20 ++++++++++++--------
>>  1 file changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/lib/ll_map.c b/lib/ll_map.c
>> index abe7bdc..fcbf0fb 100644
>> --- a/lib/ll_map.c
>> +++ b/lib/ll_map.c
>> @@ -30,7 +30,7 @@ struct ll_cache {
>>  	unsigned	flags;
>>  	unsigned 	index;
>>  	unsigned short	type;
>> -	char		name[];
>> +	char		name[IFNAMSIZ];
>>  };
>>  
>>  #define IDXMAP_SIZE	1024
>> @@ -71,7 +71,7 @@ static struct ll_cache *ll_get_by_name(const char *name)
>>  		struct ll_cache *im
>>  			= container_of(n, struct ll_cache, name_hash);
>>  
>> -		if (strncmp(im->name, name, IFNAMSIZ) == 0)
>> +		if (!strcmp(im->name, name))
>>  			return im;
>>  	}
>>  
>> @@ -82,7 +82,7 @@ int ll_remember_index(const struct sockaddr_nl *who,
>>  		      struct nlmsghdr *n, void *arg)
>>  {
>>  	unsigned int h;
>> -	const char *ifname;
>> +	char ifname[IFNAMSIZ];
>>  	struct ifinfomsg *ifi = NLMSG_DATA(n);
>>  	struct ll_cache *im;
>>  	struct rtattr *tb[IFLA_MAX+1];
>> @@ -105,17 +105,21 @@ int ll_remember_index(const struct sockaddr_nl *who,
>>  	}
>>  
>>  	parse_rtattr(tb, IFLA_MAX, IFLA_RTA(ifi), IFLA_PAYLOAD(n));
>> -	ifname = rta_getattr_str(tb[IFLA_IFNAME]);
>> -	if (ifname == NULL)
>> +
>> +	if (!tb[IFLA_IFNAME])
>> +		return 0;
>> +	strncpy(ifname, rta_getattr_str(tb[IFLA_IFNAME]), IFNAMSIZ);
>> +	if (!ifname[0])
>>  		return 0;
>> +	ifname[IFNAMSIZ - 1] = '\0';
>>  
>>  	if (im) {
>>  		/* change to existing entry */
>> -		rehash = strcmp(im->name, ifname);
>> +		rehash = memcmp(im->name, ifname, IFNAMSIZ);
> 
> This is not safe. There is not guarantee that bytes after end of string are zero.

Sorry Stephen, correct if my assumptions are wrong:

  1. struct ll_cache entries are only modified in ll_remember_index().
     There are no places where we may modify ll_cache entries.

  2. strncpy() always pad with zeroes to the end of IFNAMSIZ sized
     buffer.

  3. strncpy() may not return null terminated string: this addressed
     with ifname[IFNAMSIZ - 1] = '\0' in the code above.

Assuming 1 and 2 we always have im->name[] initialized with string and
zero pads up to IFNAMSIZ. We prepare ifname using strncpy() to, so it
is zero padded and we can safely use memcmp() to compare byte by byte.

> And in your code, strncpy() will overwrite characters from the beginning to null,
> it will not overwrite after that. Then comparison with entry may not work because
> of the data after that.
strncpy() will not pad with zeroes up to IFNAMSIZ? I get from strncpy(3)
it will pad to the end of buf, so IFNAMSIZ is initialized.

Please correct me if I'm wrong at some point.

Thanks.

> 
> I really doubt this is critical path on anything. Probably just having a better
> hash table implementation would solve that.
> 

Well this is critical with thousands of interfaces. I guess. Didn't go
with tests, but can do that. Of course difference might not be so big as
I expect, but anyway.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* Re: [PATCH iproute2 3/3] utils: ll_map: Make network device name fixed size array of char
From: Stephen Hemminger @ 2017-12-28 17:45 UTC (permalink / raw)
  To: Serhey Popovych; +Cc: netdev
In-Reply-To: <1513755451-9800-4-git-send-email-serhe.popovych@gmail.com>

On Wed, 20 Dec 2017 09:37:31 +0200
Serhey Popovych <serhe.popovych@gmail.com> wrote:

> Network device names are fixed in size and never exceed
> IFNAMSIZ (16 bytes).
> 
> Make name fixed size array to always malloc() same size chunk
> of memory and use memcpy()/memcmp() with constant IFNAMSIZ
> to benefit from possible compiler optimizations replacing
> call to a function with two/four load/store instructions
> on 64/32 bit systems.
> 
> Check if IFLA_IFNAME attribute present in netlink message
> (should always) and use strncpy() to pad name with zeros.
> 
> Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
> ---
>  lib/ll_map.c |   20 ++++++++++++--------
>  1 file changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/lib/ll_map.c b/lib/ll_map.c
> index abe7bdc..fcbf0fb 100644
> --- a/lib/ll_map.c
> +++ b/lib/ll_map.c
> @@ -30,7 +30,7 @@ struct ll_cache {
>  	unsigned	flags;
>  	unsigned 	index;
>  	unsigned short	type;
> -	char		name[];
> +	char		name[IFNAMSIZ];
>  };
>  
>  #define IDXMAP_SIZE	1024
> @@ -71,7 +71,7 @@ static struct ll_cache *ll_get_by_name(const char *name)
>  		struct ll_cache *im
>  			= container_of(n, struct ll_cache, name_hash);
>  
> -		if (strncmp(im->name, name, IFNAMSIZ) == 0)
> +		if (!strcmp(im->name, name))
>  			return im;
>  	}
>  
> @@ -82,7 +82,7 @@ int ll_remember_index(const struct sockaddr_nl *who,
>  		      struct nlmsghdr *n, void *arg)
>  {
>  	unsigned int h;
> -	const char *ifname;
> +	char ifname[IFNAMSIZ];
>  	struct ifinfomsg *ifi = NLMSG_DATA(n);
>  	struct ll_cache *im;
>  	struct rtattr *tb[IFLA_MAX+1];
> @@ -105,17 +105,21 @@ int ll_remember_index(const struct sockaddr_nl *who,
>  	}
>  
>  	parse_rtattr(tb, IFLA_MAX, IFLA_RTA(ifi), IFLA_PAYLOAD(n));
> -	ifname = rta_getattr_str(tb[IFLA_IFNAME]);
> -	if (ifname == NULL)
> +
> +	if (!tb[IFLA_IFNAME])
> +		return 0;
> +	strncpy(ifname, rta_getattr_str(tb[IFLA_IFNAME]), IFNAMSIZ);
> +	if (!ifname[0])
>  		return 0;
> +	ifname[IFNAMSIZ - 1] = '\0';
>  
>  	if (im) {
>  		/* change to existing entry */
> -		rehash = strcmp(im->name, ifname);
> +		rehash = memcmp(im->name, ifname, IFNAMSIZ);

This is not safe. There is not guarantee that bytes after end of string are zero.
And in your code, strncpy() will overwrite characters from the beginning to null,
it will not overwrite after that. Then comparison with entry may not work because
of the data after that.

I really doubt this is critical path on anything. Probably just having a better
hash table implementation would solve that.

^ permalink raw reply

* Re: [PATCH iproute2 2/3] utils: ll_map: Update name and type for existing entry
From: Stephen Hemminger @ 2017-12-28 17:46 UTC (permalink / raw)
  To: Serhey Popovych; +Cc: netdev
In-Reply-To: <1513755451-9800-3-git-send-email-serhe.popovych@gmail.com>

On Wed, 20 Dec 2017 09:37:30 +0200
Serhey Popovych <serhe.popovych@gmail.com> wrote:

> In case of we update existing entry we need not only rehash
> but also update name in existing entry.
> 
> Need to update device type too since cached interface might
> be deleted and new with same index, but different type
> added (e.g. eth0 and ppp0).
> 
> Reuse new entry initialization path to avoid duplications.
> 
> Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>

Can you provide an example where this is an observed bug?
I suspect that unless you use a batch mode command the reload
of the cache on next invocation is solving this.

^ permalink raw reply

* Re: [PATCH iproute2 2/3] utils: ll_map: Update name and type for existing entry
From: Serhey Popovych @ 2017-12-28 18:22 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20171228094645.19549fcc@xeon-e3>


[-- Attachment #1.1: Type: text/plain, Size: 1128 bytes --]

Stephen Hemminger wrote:
> On Wed, 20 Dec 2017 09:37:30 +0200
> Serhey Popovych <serhe.popovych@gmail.com> wrote:
> 
>> In case of we update existing entry we need not only rehash
>> but also update name in existing entry.
>>
>> Need to update device type too since cached interface might
>> be deleted and new with same index, but different type
>> added (e.g. eth0 and ppp0).
>>
>> Reuse new entry initialization path to avoid duplications.
>>
>> Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
> 
> Can you provide an example where this is an observed bug?
> I suspect that unless you use a batch mode command the reload
> of the cache on next invocation is solving this.
> 
From my side example from description is pretty clear: eth0 -> ppp0
or rename eth0 -> eth1, etc.

According to ll_remember_index() code we might be called with
non-empty cache. If ll_get_by_index() returns an entry with
name that differs from current we need:

  1. Rehash in ->name_hash (done with current code)
  2. Update entry name     (not done with current code)

That's my point of view.

Thanks.



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* Re: [PATCH net-next 2/2] l2tp: add peer_offset parameter
From: Lorenzo Bianconi @ 2017-12-28 18:23 UTC (permalink / raw)
  To: Guillaume Nault; +Cc: davem, netdev, jchapman, liuhangbin
In-Reply-To: <20171228145340.GA1292@alphalink.fr>

On Dec 28, Guillaume Nault wrote:
> On Fri, Dec 22, 2017 at 03:10:18PM +0100, Lorenzo Bianconi wrote:
> > Introduce peer_offset parameter in order to add the capability
> > to specify two different values for payload offset on tx/rx side.
> > If just offset is provided by userspace use it for rx side as well
> > in order to maintain compatibility with older l2tp versions
> > 
> Sorry for being late on this, I originally missed this patchset and
> only noticed it yesterday.
> 
> Lorenzo, can you give some context around this new feature?
> Quite frankly I can't see the point of it. I've never heard of offsets
> in L2TPv3, and for L2TPv2, the offset value is already encoded in the
> header.

Hi Guillaume,

thanks for your feedback.

> 
> After a quick review of L2TPv3 and pseudowires RFCs, I still don't see
> how adding some padding between the L2TPv3 header and the payload could
> constitute a valid frame. Of course, the base feature is already there,
> but after a quick test, it looks like the padding bits aren't
> initialised and leak memory.

Do you mean for L2TPv2 or L2TPv3? For L2TPv3 offset/peer_offset are initialized
in l2tp_nl_cmd_session_create()

> 
> So, unless I missed something, setting offsets in L2TPv3 is
> non-compliant, the current implementation is buggy and most likely
> unused. I'd really prefer getting the implementation fixed, or even
> removed entirely. Extending it to allow asymmetrical offset values
> looks wrong to me, unless you have a use case in mind.
> 
> Regards,
> 
> Guillaume
> 
> PS: I also noticed that iproute2 has a "peer_offset" option, but it's a
> noop.

Setting session data offset is already supported in L2TP kernel module
(and could be already used by userspace applications);
for L2TPv2 there is an optional 16-bit value in the header while for L2TPv3
the offset is configured by userspace.
At the moment the kernel (for L2TPv3) uses offset for both tx and rx side.
Userspace (iproute2) allows to distinguish tx offset (offset) from rx one
(peer_offset) but since the rx part is not handled at the moment
(I fixed peer_offset support in iproute2, I have not sent the patch upstream yet, attached below)
this leads to a misalignment between tunnel endpoints.
You can easily reproduce the issue using this setup (and the below patch for iproute2):

ip l2tp add tunnel local <ip0> remote <ip1> tunnel_id <id0> peer_tunnel_id <id1> udp_sport <p0> udp_dport <p1>
ip l2tp add tunnel local <ip1> remote <ip0> tunnel_id <id1> peer_tunnel_id <id0> udp_sport <p1> udp_dport <p0>

ip l2tp add session name l2tp0 tunnel_id <id0> session_id <s0> peer_session_id <s1> offset 8 peer_offset 16
ip l2tp add session name l2tp0 tunnel_id <id1> session_id <s1> peer_session_id <s0> offset 16 peer_offset 8

commit ee1b976f22fbea530c94a5233ac8c7cd8d643ae9
Author: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Date:   Thu Dec 21 14:41:39 2017 +0100

    ip: l2tp: add peer_offset netlink callback

diff --git a/include/uapi/linux/l2tp.h b/include/uapi/linux/l2tp.h
index 472e9924..21223df7 100644
--- a/include/uapi/linux/l2tp.h
+++ b/include/uapi/linux/l2tp.h
@@ -127,6 +127,7 @@ enum {
 	L2TP_ATTR_UDP_ZERO_CSUM6_TX,	/* flag */
 	L2TP_ATTR_UDP_ZERO_CSUM6_RX,	/* flag */
 	L2TP_ATTR_PAD,
+	L2TP_ATTR_PEER_OFFSET,		/* u16 */
 	__L2TP_ATTR_MAX,
 };
 
diff --git a/ip/ipl2tp.c b/ip/ipl2tp.c
index 7c5ed313..a3220a8b 100644
--- a/ip/ipl2tp.c
+++ b/ip/ipl2tp.c
@@ -176,6 +176,8 @@ static int create_session(struct l2tp_parm *p)
 					  p->reorder_timeout);
 	if (p->offset)
 		addattr16(&req.n, 1024, L2TP_ATTR_OFFSET, p->offset);
+	if (p->peer_offset)
+		addattr16(&req.n, 1024, L2TP_ATTR_PEER_OFFSET, p->peer_offset);
 	if (p->cookie_len)
 		addattr_l(&req.n, 1024, L2TP_ATTR_COOKIE,
 			  p->cookie, p->cookie_len);
@@ -316,6 +318,8 @@ static int get_response(struct nlmsghdr *n, void *arg)
 		p->encap = rta_getattr_u16(attrs[L2TP_ATTR_ENCAP_TYPE]);
 	if (attrs[L2TP_ATTR_OFFSET])
 		p->offset = rta_getattr_u16(attrs[L2TP_ATTR_OFFSET]);
+	if (attrs[L2TP_ATTR_PEER_OFFSET])
+		p->peer_offset = rta_getattr_u16(attrs[L2TP_ATTR_PEER_OFFSET]);
 	if (attrs[L2TP_ATTR_DATA_SEQ])
 		p->data_seq = rta_getattr_u16(attrs[L2TP_ATTR_DATA_SEQ]);
 	if (attrs[L2TP_ATTR_CONN_ID])


Regards,
Lorenzo

^ permalink raw reply related

* Re: [PATCH net-next 1/6] phy: add 2.5G SGMII mode to the phy_mode enum
From: Antoine Tenart @ 2017-12-28 18:24 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Antoine Tenart, Andrew Lunn, thomas.petazzoni, ymarkman, jason,
	netdev, linux-kernel, linux, kishon, nadavh, miquel.raynal,
	gregory.clement, stefanc, mw, davem, linux-arm-kernel,
	sebastian.hesselbarth
In-Reply-To: <91838ce5-a1a8-c41a-36e8-bef7adaf82fd@gmail.com>

Hi Florian,

On Thu, Dec 28, 2017 at 06:16:51AM -0800, Florian Fainelli wrote:
> 
> And since you are respinning, please make sure you update phy_modes() in
> the same header file as well as
> Documentation/devicetree/bindings/net/ethernet.txt with the newly added
> PHY interface mode.

You're right. Thanks for pointing this out!

Antoine

-- 
Antoine Ténart, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply

* Re: [PATCH net-next 5/6] arm64: dts: marvell: mcbin: enable the fourth network interface
From: Antoine Tenart @ 2017-12-28 18:27 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Antoine Tenart, Andrew Lunn, thomas.petazzoni, ymarkman, jason,
	netdev, linux-kernel, Russell King - ARM Linux, kishon, nadavh,
	miquel.raynal, gregory.clement, stefanc, mw, davem,
	linux-arm-kernel, sebastian.hesselbarth
In-Reply-To: <462da70b-ba7d-6299-3e21-b619d3c4c7e6@gmail.com>

Hi Florian,

On Thu, Dec 28, 2017 at 07:02:09AM -0800, Florian Fainelli wrote:
> On 12/28/2017 02:05 AM, Antoine Tenart wrote:
> > On Thu, Dec 28, 2017 at 08:46:23AM +0100, Andrew Lunn wrote:
> >> On Wed, Dec 27, 2017 at 10:24:01PM +0000, Russell King - ARM Linux wrote:
> >>> On Wed, Dec 27, 2017 at 11:14:45PM +0100, Antoine Tenart wrote:
> >>>>  
> >>>> +&cps_eth2 {
> >>>> +	/* CPS Lane 5 */
> >>>> +	status = "okay";
> >>>> +	phy-mode = "2500base-x";
> >>>> +	/* Generic PHY, providing serdes lanes */
> >>>> +	phys = <&cps_comphy5 2>;
> >>>> +};
> >>>> +
> >>>
> >>> This is wrong.  This lane is connected to a SFP cage which can support
> >>> more than just 2500base-X.  Tying it in this way to 2500base-X means
> >>> that this port does not support conenctions at 1000base-X, despite
> >>> that's one of the most popular and more standardised speeds.
> >>>
> >>
> >> I agree with Russell here. SFP modules are hot pluggable, and support
> >> a range of interface modes. You need to query what the SFP module is
> >> in order to know how to configure the SERDES interface. The phylink
> >> infrastructure does that for you.
> > 
> > Sure, I understand. We'll be able to support such interfaces only when
> > the phylink PPv2 support lands in.
> 
> Should we expect PHYLINK support to make it as the first patch in your
> v2 of this patch series, or is someone else doing that?

No, the phylink patch conflicts with Marcin's ACPI series and we agreed
to let him get his series merged first. And I will probably work on a
few other topics before having the chance to work on it. So it'll
probably be me doing that, but not right now.

This isn't an issue regarding the PPv2 and PHY patches of this series,
but yes we probably won't get the fourth network interface supported on
the mcbin during this release.

Thanks!
Antoine

-- 
Antoine Ténart, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply

* Re: WARNING in strp_data_ready
From: Dmitry Vyukov @ 2017-12-28 18:35 UTC (permalink / raw)
  To: Ozgur
  Cc: Tom Herbert, John Fastabend, syzbot, David S. Miller,
	Eric Biggers, LKML, Linux Kernel Network Developers,
	syzkaller-bugs@googlegroups.com, Tom Herbert, Cong Wang
In-Reply-To: <2278061514485284@web47j.yandex.ru>

On Thu, Dec 28, 2017 at 7:21 PM, Ozgur <ozgur@goosey.org> wrote:
>
>
> 28.12.2017, 19:33, "Dmitry Vyukov" <dvyukov@google.com>:
>> On Thu, Dec 28, 2017 at 5:14 PM, Tom Herbert <tom@herbertland.com> wrote:
>>>  On Thu, Dec 28, 2017 at 12:59 AM, Ozgur <ozgur@goosey.org> wrote:
>>>>  28.12.2017, 04:19, "Tom Herbert" <tom@herbertland.com>:
>>>>>  On Wed, Dec 27, 2017 at 12:20 PM, Ozgur <ozgur@goosey.org> wrote:
>>>>>>   27.12.2017, 23:14, "Dmitry Vyukov" <dvyukov@google.com>:
>>>>>>>   On Wed, Dec 27, 2017 at 9:08 PM, Ozgur <ozgur@goosey.org> wrote:
>>>>>>>>    27.12.2017, 22:21, "Dmitry Vyukov" <dvyukov@google.com>:
>>>>>>>>>    On Wed, Dec 27, 2017 at 8:09 PM, Tom Herbert <tom@herbertland.com> wrote:
>>>>>>>>>>     Did you try the patch I posted?
>>>>>>>>>
>>>>>>>>>    Hi Tom,
>>>>>>>>
>>>>>>>>    Hello Dmitry,
>>>>>>>>
>>>>>>>>>    No. And I didn't know I need to. Why?
>>>>>>>>>    If you think the patch needs additional testing, you can ask syzbot to
>>>>>>>>>    test it. See https://github.com/google/syzkaller/blob/master/docs/syzbot.md#communication-with-syzbot
>>>>>>>>>    Otherwise proceed with committing it. Or what are we waiting for?
>>>>>>>>>
>>>>>>>>>    Thanks
>>>>>>>>
>>>>>>>>    I think we need to fixed patch for crash, in fact check to patch code and test solve the bug.
>>>>>>>>    How do test it because there is no patch in the following bug?
>>>>>>>
>>>>>>>   Hi Ozgur,
>>>>>>>
>>>>>>>   I am not sure I completely understand what you mean. But the
>>>>>>>   reproducer for this bug (which one can use for testing) is here:
>>>>>>>   https://groups.google.com/forum/#!topic/syzkaller-bugs/Kxs05ziCpgY
>>>>>>>   Tom also mentions there is some patch for this, but I don't know where
>>>>>>>   it is, it doesn't seem to be referenced from this thread.
>>>>>>
>>>>>>   Hello Dmitry,
>>>>>>
>>>>>>   Ah, I'm sorry I don't seen Tom mail and I don't have a patch not tested :)
>>>>>>   I think Tom send patch to only you and are you tested?
>>>>>>
>>>>>>   kcmsock.c will change and strp_data_ready I think locked.
>>>>>>
>>>>>>   Tom, please send a patch for me? I can test and inform you.
>>>>>
>>>>>  Hi Ozgur,
>>>>>
>>>>>  I reposted the patches as RFC "kcm: Fix lockdep issue". Please test if you can!
>>>>>
>>>>>  Thanks,
>>>>>  Tom
>>>>
>>>>  Hello Tom,
>>>>
>>>>  Which are you use the repos? I pulled but I don't seen this patches.
>>>  They are not in any public repo yet. I posted the patches to netdev
>>>  list so they can be reviewed and tested by third parties. Posting
>>>  patches to the list a normal path to get patches into the kernel
>>>  (http://nickdesaulniers.github.io/blog/2017/05/16/submitting-your-first-patch-to-the-linux-kernel-and-responding-to-feedback/).
>>>
>>>  These patches were applied to net-next but are simple enough that they
>>>  should apply to other branches. I will repost and target to net per
>>>  Dave's directive once they are verified to fix the issue.
>
> Hello,
>
> thanks Tom and I have tested the fixed patch for linux-next builds and don't have to kernel panic. when nocheck funcs call sk_lock.owned and kernel doesn't give a panic.  I have compiled and uploaded next-kernel.
>
> Dmitry,
> could you test it on linux-next?

If you are trying to test how many times I can repeat this, I can
repeat this lots of times:

If you think the patch needs additional testing, you can ask syzbot to
test it. See https://github.com/google/syzkaller/blob/master/docs/syzbot.md#communication-with-syzbot

^ permalink raw reply

* Re: [PATCH net-next 5/6] arm64: dts: marvell: mcbin: enable the fourth network interface
From: Russell King - ARM Linux @ 2017-12-28 18:46 UTC (permalink / raw)
  To: Antoine Tenart
  Cc: Florian Fainelli, Andrew Lunn, thomas.petazzoni, ymarkman, jason,
	netdev, linux-kernel, kishon, nadavh, miquel.raynal,
	gregory.clement, stefanc, mw, davem, linux-arm-kernel,
	sebastian.hesselbarth
In-Reply-To: <20171228182739.GH2626@kwain>

On Thu, Dec 28, 2017 at 07:27:39PM +0100, Antoine Tenart wrote:
> Hi Florian,
> 
> On Thu, Dec 28, 2017 at 07:02:09AM -0800, Florian Fainelli wrote:
> > On 12/28/2017 02:05 AM, Antoine Tenart wrote:
> > > On Thu, Dec 28, 2017 at 08:46:23AM +0100, Andrew Lunn wrote:
> > >> On Wed, Dec 27, 2017 at 10:24:01PM +0000, Russell King - ARM Linux wrote:
> > >>> On Wed, Dec 27, 2017 at 11:14:45PM +0100, Antoine Tenart wrote:
> > >>>>  
> > >>>> +&cps_eth2 {
> > >>>> +	/* CPS Lane 5 */
> > >>>> +	status = "okay";
> > >>>> +	phy-mode = "2500base-x";
> > >>>> +	/* Generic PHY, providing serdes lanes */
> > >>>> +	phys = <&cps_comphy5 2>;
> > >>>> +};
> > >>>> +
> > >>>
> > >>> This is wrong.  This lane is connected to a SFP cage which can support
> > >>> more than just 2500base-X.  Tying it in this way to 2500base-X means
> > >>> that this port does not support conenctions at 1000base-X, despite
> > >>> that's one of the most popular and more standardised speeds.
> > >>>
> > >>
> > >> I agree with Russell here. SFP modules are hot pluggable, and support
> > >> a range of interface modes. You need to query what the SFP module is
> > >> in order to know how to configure the SERDES interface. The phylink
> > >> infrastructure does that for you.
> > > 
> > > Sure, I understand. We'll be able to support such interfaces only when
> > > the phylink PPv2 support lands in.
> > 
> > Should we expect PHYLINK support to make it as the first patch in your
> > v2 of this patch series, or is someone else doing that?
> 
> No, the phylink patch conflicts with Marcin's ACPI series and we agreed
> to let him get his series merged first. And I will probably work on a
> few other topics before having the chance to work on it. So it'll
> probably be me doing that, but not right now.

ACPI is going to be a problem with phylink for a while.  There's patches
queued in net-next which convert phylink and SFP mostly to the fwnode
and property based systems, but phylib and i2c do not seem to have the
necessary bits to be able to deal with those.

Specifically, in DT we have "of_find_i2c_adapter_by_node()" but afaics
there is no equivalent in ACPI - which means in an ACPI based system
we have no way to determine the I2C bus associated with a SFP socket,
which is a rather fundamental issue for SFP modules.

For phylib side, there's "of_phy_attach()" but again there is no
equivalent in ACPI. This should not be that much of a problem, because
network drivers using the DT phylib calls (eg, "of_phy_connect()") are
already restricted by this. That may have been solved by Marcin's
series, but I've not seen it to know.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply

* Re: [PATCH net-next 5/6] arm64: dts: marvell: mcbin: enable the fourth network interface
From: Russell King - ARM Linux @ 2017-12-28 18:59 UTC (permalink / raw)
  To: Antoine Tenart
  Cc: davem, kishon, andrew, jason, sebastian.hesselbarth,
	gregory.clement, mw, stefanc, ymarkman, thomas.petazzoni,
	miquel.raynal, nadavh, netdev, linux-arm-kernel, linux-kernel,
	Jon Nettleton
In-Reply-To: <20171228100416.GD2626@kwain>

On Thu, Dec 28, 2017 at 11:04:16AM +0100, Antoine Tenart wrote:
> Hi Russell,
> 
> On Wed, Dec 27, 2017 at 11:20:00PM +0000, Russell King - ARM Linux wrote:
> > On Wed, Dec 27, 2017 at 11:42:52PM +0100, Antoine Tenart wrote:
> > > 
> > > What do you suggest to describe this in the dt, to enable a port using
> > > the current PPv2 driver?
> > 
> > I don't - I'm merely pointing out that you're bodging support for the
> > SFP cage rather than productively discussing phylink for mvpp2.
> > 
> > As far as I remember, the discussion stalled at this point:
> > 
> > - You think there's modes that mvpp2 supports that are not supportable
> >   if you use phylink.
> > 
> > - I've described what phylink supports, and I've asked you for details
> >   about what you can't support.
> 
> That's not what I remembered. You had some valid points, and others
> related to PHY modes the driver wasn't supporting before the phylink
> transition. My understanding of this was that you wanted a full
> featured support while I only wanted to convert the already supported
> modes.

You are mistaken - you can get a full refresher on where things were
at via https://patchwork.kernel.org/patch/9963971/

There are two points in that thread where discussion stopped awaiting
input:

1. I asked for details about what mvpp2.c supports that phylink does
   not (as you indicated that there were certain things that mvpp2
   supports that phylink does not.)  I'm still awaiting a response.

2. 25th Sept, you indicated that you would get someone to test
   an issue related to in-band AN. No results of that testing have
   been forthcoming.

Consequently, the ball is in your court on both these issues.

I am not after a full featured support, what I'm after is ensuring
that phylink is (a) used correctly and (b) implementations using it
are correct.  Part of that is ensuring that users don't introduce
unexpected failure conditions.

So, when you do this in the validate() callback:

 +       phylink_set(mask, 1000baseX_Full);

and then do this in the mac_config() callback:

 +       if (!phy_interface_mode_is_rgmii(port->phy_interface) &&
 +           port->phy_interface != PHY_INTERFACE_MODE_SGMII)
 +               return;

and this in the link_state() callback:

 +       if (!phy_interface_mode_is_rgmii(port->phy_interface) &&
 +           port->phy_interface != PHY_INTERFACE_MODE_SGMII)
 +               return 0;

the result is that phylink thinks that you support 1000base-X modes,
and it will call mac_config() asking for 1000base-X, but you silently
ignore that, leaving the hardware configured in whatever state it was.
That leads to a silent failure as far as the user is concerned.

So, if you do not intend to support 1000base-X initially, don't
allow it in the validate callback until you do.

It gets worse, because the return in link_state() means that phylink
thinks that the link is up if it has requested 1000base-X, which it
won't be unless you've properly configured it.

It's this kind of unreliability that I was concerned about in your
patch.  I'm not demanding "full featured implementation" but I do
want you to use it correctly.

> You're probably right about not wanting this dt patch. The non-dt
> patches still are relevant regardless of phylink being supported in the
> PPv2 driver. I'll send a v2 without the dt parts.

Thanks.

> > What I'm most concerned about, given the bindings for comphy that
> > have been merged, is that Free Electrons is pushing forward seemingly
> > with no regard to the requirement that the serdes lanes are dynamically
> > reconfigurable, and that's a basic requirement for SFP, and for the
> > 88x3310 PHYs on the Macchiatobin platform.
> 
> The main idea behind the comphy driver is to provide a way to
> reconfigure the serdes lanes at runtime. Could you develop what are
> blocking points to properly support SFP, regarding the current comphy
> support?

If it supports serdes lane mode reconfiguration (iow, switching between
1000base-X, 2500base-X, SGMII, 10G-KR), then that's all that's required.
Note that you need comphy to switch between SGMII / 10G-KR to support
the 88x3310 fully too.

Having looked deeper at this, it seems it does have the capability of
doing what's required, sorry for the noise.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply

* [PATCH net 0/2] strparser: Fix lockdep issue
From: Tom Herbert @ 2017-12-28 19:00 UTC (permalink / raw)
  To: davem; +Cc: netdev, dvyukov, ozgur, rohit, Tom Herbert

When sock_owned_by_user returns true in strparser. Fix is to add and
call sock_owned_by_user_nocheck since the check for owned by user is
not an error condition in this case.

Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com>

Tom Herbert (2):
  sock: Add sock_owned_by_user_nocheck
  strparser: Call sock_owned_by_user_nocheck

 include/net/sock.h        | 5 +++++
 net/strparser/strparser.c | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

-- 
2.11.0

^ permalink raw reply

* [PATCH net 1/2] sock: Add sock_owned_by_user_nocheck
From: Tom Herbert @ 2017-12-28 19:00 UTC (permalink / raw)
  To: davem; +Cc: netdev, dvyukov, ozgur, rohit, Tom Herbert
In-Reply-To: <20171228190044.3748-1-tom@quantonium.net>

This allows checking socket lock ownership with producing lockdep
warnings.

Signed-off-by: Tom Herbert <tom@quantonium.net>
---
 include/net/sock.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index 9155da422692..7a7b14e9628a 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1514,6 +1514,11 @@ static inline bool sock_owned_by_user(const struct sock *sk)
 	return sk->sk_lock.owned;
 }
 
+static inline bool sock_owned_by_user_nocheck(const struct sock *sk)
+{
+	return sk->sk_lock.owned;
+}
+
 /* no reclassification while locks are held */
 static inline bool sock_allow_reclassification(const struct sock *csk)
 {
-- 
2.11.0

^ permalink raw reply related

* [PATCH net 2/2] strparser: Call sock_owned_by_user_nocheck
From: Tom Herbert @ 2017-12-28 19:00 UTC (permalink / raw)
  To: davem; +Cc: netdev, dvyukov, ozgur, rohit, Tom Herbert
In-Reply-To: <20171228190044.3748-1-tom@quantonium.net>

strparser wants to check socket ownership without producing any
warnings. As indicated by the comment in the code, it is permissible
for owned_by_user to return true.

Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com>
Signed-off-by: Tom Herbert <tom@quantonium.net>
---
 net/strparser/strparser.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/strparser/strparser.c b/net/strparser/strparser.c
index c5fda15ba319..1fdab5c4eda8 100644
--- a/net/strparser/strparser.c
+++ b/net/strparser/strparser.c
@@ -401,7 +401,7 @@ void strp_data_ready(struct strparser *strp)
 	 * allows a thread in BH context to safely check if the process
 	 * lock is held. In this case, if the lock is held, queue work.
 	 */
-	if (sock_owned_by_user(strp->sk)) {
+	if (sock_owned_by_user_nocheck(strp->sk)) {
 		queue_work(strp_wq, &strp->work);
 		return;
 	}
-- 
2.11.0

^ permalink raw reply related

* Re: [PATCH net-next] virtio_net: implement VIRTIO_CONFIG_S_NEEDS_RESET
From: Willem de Bruijn @ 2017-12-28 19:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Network Development, David Miller, virtualization,
	Willem de Bruijn
In-Reply-To: <20171017064254-mutt-send-email-mst@kernel.org>

On Mon, Oct 16, 2017 at 11:44 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Tue, Oct 17, 2017 at 11:05:07AM +0800, Jason Wang wrote:
>>
>>
>> On 2017年10月17日 06:34, Willem de Bruijn wrote:
>> > On Mon, Oct 16, 2017 at 12:38 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > > On Mon, Oct 16, 2017 at 12:04:57PM -0400, Willem de Bruijn wrote:
>> > > > On Mon, Oct 16, 2017 at 11:31 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > > > > On Mon, Oct 16, 2017 at 11:03:18AM -0400, Willem de Bruijn wrote:
>> > > > > > > > +static int virtnet_reset(struct virtnet_info *vi)
>> > > > > > > > +{
>> > > > > > > > +     struct virtio_device *dev = vi->vdev;
>> > > > > > > > +     int ret;
>> > > > > > > > +
>> > > > > > > > +     virtio_config_disable(dev);
>> > > > > > > > +     dev->failed = dev->config->get_status(dev) & VIRTIO_CONFIG_S_FAILED;
>> > > > > > > > +     virtnet_freeze_down(dev, true);
>> > > > > > > > +     remove_vq_common(vi);
>> > > > > > > > +
>> > > > > > > > +     virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
>> > > > > > > > +     virtio_add_status(dev, VIRTIO_CONFIG_S_DRIVER);
>> > > > > > > > +
>> > > > > > > > +     ret = virtio_finalize_features(dev);
>> > > > > > > > +     if (ret)
>> > > > > > > > +             goto err;
>> > > > > > > > +
>> > > > > > > > +     ret = virtnet_restore_up(dev);
>> > > > > > > > +     if (ret)
>> > > > > > > > +             goto err;
>> > > > > > > > +
>> > > > > > > > +     ret = virtnet_set_queues(vi, vi->curr_queue_pairs);
>> > > > > > > > +     if (ret)
>> > > > > > > > +             goto err;
>> > > > > > > > +
>> > > > > > > > +     virtio_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
>> > > > > > > > +     virtio_config_enable(dev);
>> > > > > > > > +     return 0;
>> > > > > > > > +
>> > > > > > > > +err:
>> > > > > > > > +     virtio_add_status(dev, VIRTIO_CONFIG_S_FAILED);
>> > > > > > > > +     return ret;
>> > > > > > > > +}
>> > > > > > > > +
>> > > > > > > >   static int virtnet_set_guest_offloads(struct virtnet_info *vi, u64 offloads)
>> > > > > > > >   {
>> > > > > > > >        struct scatterlist sg;
>> > > > > > > I have a question here though. How do things like MAC address
>> > > > > > > get restored?
>> > > > > > >
>> > > > > > > What about the rx mode?
>> > > > > > >
>> > > > > > > vlans?
>> > > > > > The function as is releases and reinitializes only ring state.
>> > > > > > Device configuration such as mac and vlan persist across
>> > > > > > the reset.
>> > > > > What gave you this impression? Take a look at e.g. this
>> > > > > code in qemu:
>> > > > >
>> > > > > static void virtio_net_reset(VirtIODevice *vdev)
>> > > > > {
>> > > > >      VirtIONet *n = VIRTIO_NET(vdev);
>> > > > >
>> > > > >      /* Reset back to compatibility mode */
>> > > > >      n->promisc = 1;
>> > > > >      n->allmulti = 0;
>> > > > >      n->alluni = 0;
>> > > > >      n->nomulti = 0;
>> > > > >      n->nouni = 0;
>> > > > >      n->nobcast = 0;
>> > > > >      /* multiqueue is disabled by default */
>> > > > >      n->curr_queues = 1;
>> > > > >      timer_del(n->announce_timer);
>> > > > >      n->announce_counter = 0;
>> > > > >      n->status &= ~VIRTIO_NET_S_ANNOUNCE;
>> > > > >
>> > > > >      /* Flush any MAC and VLAN filter table state */
>> > > > >      n->mac_table.in_use = 0;
>> > > > >      n->mac_table.first_multi = 0;
>> > > > >      n->mac_table.multi_overflow = 0;
>> > > > >      n->mac_table.uni_overflow = 0;
>> > > > >      memset(n->mac_table.macs, 0, MAC_TABLE_ENTRIES * ETH_ALEN);
>> > > > >      memcpy(&n->mac[0], &n->nic->conf->macaddr, sizeof(n->mac));
>> > > > >      qemu_format_nic_info_str(qemu_get_queue(n->nic), n->mac);
>> > > > >      memset(n->vlans, 0, MAX_VLAN >> 3);
>> > > > > }
>> > > > >
>> > > > > So device seems to lose all state, you have to re-program it.
>> > > > Oh, indeed! The guest does not reset its state, so it might
>> > > > be out of sync with the host after the operation. Was this not
>> > > > an issue when previously resetting in the context of xdp?
>> > > I suspect it was broken back then, too.
>> > Okay. I guess that in principle this is all programmable through
>> > virtnet_set_rx_mode, virtnet_vlan_rx_add_vid, etc. But it's a
>> > lot more complex than just restoring virtnet_reset. Will need to
>> > be careful about concurrency issues at the least. Similar to the
>> > ones you point out below.
>> >
>>
>> The problem has been pointed out during developing virtio-net XDP. But it
>> may not be a big issue since vhost_net ignores all kinds of the filters now.
>>
>> Thanks
>
> It might not keep doing that in the future though.
> And virtio-net in userspace doesn't ignore the filters.

How about the guest honor the request only if no state has been
offloaded to the host?

This is the common case for vhost_net, and not expected to change
soon.

Even when it does, we have a graceful degradation strategy. Guest
revert state prior to reset and reapply. Though for the time being,
solving this only in the case without state offload would be solve my
use case.

^ permalink raw reply

* Re: [RFT net-next v2 0/3] dwmac-meson8b: RGMII clock fixes for Meson8b
From: Martin Blumenstingl @ 2017-12-28 19:28 UTC (permalink / raw)
  To: Emiliano Ingrassia
  Cc: netdev, linus.luessing, khilman, linux-amlogic, jbrunet,
	Neil Armstrong, peppe.cavallaro, alexandre.torgue
In-Reply-To: <20171228175128.GA22111@ingrassia.epigenesys.com>

Hi Emiliano,

On Thu, Dec 28, 2017 at 6:51 PM, Emiliano Ingrassia
<ingrassia@epigenesys.com> wrote:
> Hi Martin,
>
> thank you for the quick response!
>
> On Thu, Dec 28, 2017 at 05:58:34PM +0100, Martin Blumenstingl wrote:
>> Hi Emiliano,
>>
>> thank you for testing this!
>>
>> On Thu, Dec 28, 2017 at 5:16 PM, Emiliano Ingrassia
>> <ingrassia@epigenesys.com> wrote:
>> > Hi Martin, Hi Dave,
>> >
>> > On Sun, Dec 24, 2017 at 12:40:57AM +0100, Martin Blumenstingl wrote:
>> >> Hi Dave,
>> >>
>> >> please do not apply this series until it got a Tested-by from Emiliano.
>> >>
>> >>
>> >> Hi Emiliano,
>> >>
>> >> you reported [0] that you couldn't get dwmac-meson8b to work on your
>> >> Odroid-C1. With your findings (register dumps, clk_summary output, etc.)
>> >> I think I was able to find a fix: it consists of two patches (which you
>> >> find in this series)
>> >>
>> >> Unfortunately I don't have any Meson8b boards with RGMII PHY so I could
>> >> only partially test this (I could only check if the clocks were
>> >> calculated correctly when using a dummy 500002394Hz input clock instead
>> >> of MPLL2).
>> >>
>> >> Could you please give this series a try and let me know about the
>> >> results?
>> >> You obviously still need your two "ARM: dts: meson8b" patches which
>> >> - add the amlogic,meson8b-dwmac" compatible to meson8b.dtsi
>> >> - enable Ethernet on the Odroid-C1
>> >>
>> >> I have tested this myself on a Khadas VIM (GXL SoC, internal RMII PHY)
>> >> and a Khadas VIM2 (GXM SoC, external RGMII PHY). Both are still working
>> >> fine (so let's hope that this also fixes your Meson8b issue :)).
>> >>
>> >>
>> >> changes since v1 at [1]:
>> >> - changed the subject of the cover-letter to indicate that this is all
>> >>   about the RGMII clock
>> >> - added PATCH #1 which ensures that we don't unnecessarily change the
>> >>   parent clocks in RMII mode (and also makes the code easier to
>> >>   understand)
>> >> - changed subject of PATCH #2 (formerly PATCH #1) to state that this
>> >>   is about the RGMII clock
>> >> - added Jerome's Reviewed-by to PATCH #2 (formerly PATCH #1)
>> >> - replaced PATCH #3 (formerly PATCH #2) with one that sets
>> >>   CLK_SET_RATE_PARENT on the mux and thus re-configures the MPLL2 clock
>> >>   on Meson8b correctly
>> >>
>> >
>> > Really thank you for your help and effort. I tried your patch but
>> > unfortunately it didn't solve the problem.
>> this is probably my fault: I forgot to mention that it requires a fix
>> for the 32-bit SoCs in the clock driver ("clk: meson: mpll: use 64-bit
>> maths in params_from_rate", see [0]) to work properly
>>
>
> Ok, with that patch applied I got:
>
> xtal                               1            1    24000000          0 0
>  sys_pll                           0            0  1200000000          0 0
>   cpu_clk                          0            0  1200000000          0 0
>  vid_pll                           0            0   732000000          0 0
>  fixed_pll                         2            2  2550000000          0 0
>   mpll2                            1            1   124999851          0 0
>    c9410000.ethernet#m250_sel      1            1   124999851          0 0
>     c9410000.ethernet#m250_div     1            1   124999851          0 0
>      c9410000.ethernet#m25_div     1            1    24999971          0 0
in theory this looks good...!

> which is equal to your result. However, the ethernet is still not working.
OK, I'll send an updated version later (or tomorrow, depending on how
much time I have left today) which adds a fixed divider and converts
bit 10 to a gate
on GXBB and newer bit 10 is always true since the m250_div is only
3-bit wide (= max divider of 7, 0 is invalid according to the
datasheet). with a 1000MHz input clock (fclk_div2) m250_div will
divide this by 4 and m25_div divided this by 10.

> The prg0 register is set to 0x70A1.
>
> A problem that I see with this solution is that MPLL2 is set to ~125 MHz.
> The S805 SoC manual reports that bits 9-7 should contain a value x such
> that: MPLL2 = 250 MHz * x (with x >= 1).
> In our case, bits 9-7 are set to 1 which is incorrect.
> I think that MPLL2 should be 250 MHz at least.
when looking at the GXBB clock tree we need a fixed divide by 10.
this also means that the mpll2 clock will probably be set to ~250MHz,
the m250_div to 1 and with the fixed divider "10" we get close to our
desired 25MHz

>> >
>> > Here is the new clk_summary:
>> >
>> > xtal                            1            1    24000000          0 0
>> >  sys_pll                        0            0  1200000000          0 0
>> >   cpu_clk                       0            0  1200000000          0 0
>> >  vid_pll                        0            0   732000000          0 0
>> >  fixed_pll                      2            2  2550000000          0 0
>> >   mpll2                         1            1   106250000          0 0
>> >    c9410000.ethernet#m250_sel   1            1   106250000          0 0
>> >     c9410000.ethernet#m250_div  1            1   106250000          0 0
>> >      c9410000.ethernet#m25_div  1            1    21250000          0 0
>> >
>> > which leads to a value of 0x70a1 in the prg0 ethernet register.
>> > As you can see, something is changed but the RGMII clock is not at 25 MHz.
>> > In particular, the bit 10 of prg0, which enables the "generation of 25 MHz
>> > clock for PHY" (see S805 manual), is 0.
>> assuming that the description in the datasheet is correct
>> after Kevin and Mike got updated information from Amlogic about the
>> PRG_ETHERNET0 register documenation (see [1]) we thought that bit 10
>> means "0 = divide by 5, 1 = divide by 10" (see [2]). I didn't question
>> this so far, but I'll test this on a newer SoC later (by forcing
>> m250_div to 125MHz, then m25_div will have register value 0 = divide
>> by 5)
>>
>> if the description from the documentation is correct then we need to
>> replace m25_div with a fixed-factor clock (mult = 1, div = 5) and make
>> it a m25_en gate clock instead
>> the resulting clock path would look like this: mpll2 > m250_sel >
>> m250_div > fixed_factor > m25_en
>>
>> > Please, if you have other suggestions let me know.
>> could you please re-test this with the patch from [0] applied? no
>> other changes should be needed!
>>
>> > Best regards,
>> >
>> > Emiliano
>> >
>> >>
>> >> [0] http://lists.infradead.org/pipermail/linux-amlogic/2017-December/005596.html
>> >> [1] http://lists.infradead.org/pipermail/linux-amlogic/2017-December/005848.html
>> >>
>> >>
>> >> Martin Blumenstingl (3):
>> >>   net: stmmac: dwmac-meson8b: only configure the clocks in RGMII mode
>> >>   net: stmmac: dwmac-meson8b: fix setting the RGMII clock on Meson8b
>> >>   net: stmmac: dwmac-meson8b: propagate rate changes to the parent clock
>> >>
>> >>  .../net/ethernet/stmicro/stmmac/dwmac-meson8b.c    | 55 +++++++++++-----------
>> >>  1 file changed, 27 insertions(+), 28 deletions(-)
>> >>
>> >> --
>> >> 2.15.1
>> >>
>>
>> Regards
>> Martin
>>
>
> Thank you,
>
> Emiliano
>
>>
>> [0] https://patchwork.kernel.org/patch/10131677/
>> [1] https://www.mail-archive.com/netdev@vger.kernel.org/msg119290.html
>> [2] https://www.mail-archive.com/netdev@vger.kernel.org/msg119293.html


Regards
Martin

^ permalink raw reply

* Re: [PATCH net 0/2] strparser: Fix lockdep issue
From: David Miller @ 2017-12-28 19:29 UTC (permalink / raw)
  To: tom; +Cc: netdev, dvyukov, ozgur, rohit
In-Reply-To: <20171228190044.3748-1-tom@quantonium.net>

From: Tom Herbert <tom@quantonium.net>
Date: Thu, 28 Dec 2017 11:00:42 -0800

> When sock_owned_by_user returns true in strparser. Fix is to add and
> call sock_owned_by_user_nocheck since the check for owned by user is
> not an error condition in this case.
> 
> Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages")
> Reported-by: syzbot <syzkaller@googlegroups.com>
> Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com>

Series applied.

^ permalink raw reply

* Re: [PATCH net] skbuff: in skb_copy_ubufs unclone before releasing zerocopy
From: David Miller @ 2017-12-28 19:30 UTC (permalink / raw)
  To: willemdebruijn.kernel; +Cc: netdev, willemb
In-Reply-To: <20171228173813.19010-1-willemdebruijn.kernel@gmail.com>

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Thu, 28 Dec 2017 12:38:13 -0500

> From: Willem de Bruijn <willemb@google.com>
> 
> skb_copy_ubufs must unclone before it is safe to modify its
> skb_shared_info with skb_zcopy_clear.
> 
> Commit b90ddd568792 ("skbuff: skb_copy_ubufs must release uarg even
> without user frags") ensures that all skbs release their zerocopy
> state, even those without frags.
> 
> But I forgot an edge case where such an skb arrives that is cloned.
> 
> The stack does not build such packets. Vhost/tun skbs have their
> frags orphaned before cloning. TCP skbs only attach zerocopy state
> when a frag is added.
> 
> But if TCP packets can be trimmed or linearized, this might occur.
> Tracing the code I found no instance so far (e.g., skb_linearize
> ends up calling skb_zcopy_clear if !skb->data_len).
> 
> Still, it is non-obvious that no path exists. And it is fragile to
> rely on this.
> 
> Fixes: b90ddd568792 ("skbuff: skb_copy_ubufs must release uarg even without user frags")
> Signed-off-by: Willem de Bruijn <willemb@google.com>

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH net-next 2/2] l2tp: add peer_offset parameter
From: Guillaume Nault @ 2017-12-28 19:45 UTC (permalink / raw)
  To: Lorenzo Bianconi; +Cc: davem, netdev, jchapman, liuhangbin
In-Reply-To: <20171228182347.GA8732@localhost.localdomain>

On Thu, Dec 28, 2017 at 07:23:48PM +0100, Lorenzo Bianconi wrote:
> On Dec 28, Guillaume Nault wrote:
> > After a quick review of L2TPv3 and pseudowires RFCs, I still don't see
> > how adding some padding between the L2TPv3 header and the payload could
> > constitute a valid frame. Of course, the base feature is already there,
> > but after a quick test, it looks like the padding bits aren't
> > initialised and leak memory.
> 
> Do you mean for L2TPv2 or L2TPv3? For L2TPv3 offset/peer_offset are initialized
> in l2tp_nl_cmd_session_create()
>
That's the offsets stored in the l2tp_session_cfg structure. But I was
talking about the xmit path: l2tp_build_l2tpv3_header() doesn't
initialise the padding between the header and the payload. So when
someone activates this option, then every transmitted packet leaks
memory on the wire.

> Setting session data offset is already supported in L2TP kernel module
> (and could be already used by userspace applications);
> for L2TPv2 there is an optional 16-bit value in the header while for L2TPv3
> the offset is configured by userspace.
> At the moment the kernel (for L2TPv3) uses offset for both tx and rx side.
> Userspace (iproute2) allows to distinguish tx offset (offset) from rx one
> (peer_offset) but since the rx part is not handled at the moment
> (I fixed peer_offset support in iproute2, I have not sent the patch upstream yet, attached below)
> this leads to a misalignment between tunnel endpoints.
> You can easily reproduce the issue using this setup (and the below patch for iproute2):
> 
> ip l2tp add tunnel local <ip0> remote <ip1> tunnel_id <id0> peer_tunnel_id <id1> udp_sport <p0> udp_dport <p1>
> ip l2tp add tunnel local <ip1> remote <ip0> tunnel_id <id1> peer_tunnel_id <id0> udp_sport <p1> udp_dport <p0>
> 
> ip l2tp add session name l2tp0 tunnel_id <id0> session_id <s0> peer_session_id <s1> offset 8 peer_offset 16
> ip l2tp add session name l2tp0 tunnel_id <id1> session_id <s1> peer_session_id <s0> offset 16 peer_offset 8
> 
Yes, I'm well aware of that. And thanks for having worked on a full
solution including iproute2. But does one really need to set
asymetrical offset values? It doesn't look wrong to require setting the
same value on both sides. Other options need this, like "l2spec_type".

Here we have an option that:
  * creates invalid packets (AFAIK),
  * is buggy and leaks memory on the network,
  * doesn't seem to have any use case (even the manpage
    says "This is hardly ever used").

So I'm sorry, but I don't see the point in expanding this option to
allow even stranger setups. If there's a use case, then fine.
Otherwise, let's just acknowledge that the "peer_offset" option of
iproute2 is a noop (and maybe remove it from the manpage).

And the kernel "offset" option needs to be fixed. Actually, I wouldn't
mind if it was converted to be a noop, or even rejected. L2TP already
has its share of unused features that complicate the code and hamper
evolution and bug fixing. As I said earlier, if it's buggy, unused and
can't even produce valid packets, then why bothering with it?

But that's just my point of view. James, do you have an opinion on
this?

^ permalink raw reply

* Re: pull-request: bpf-next 2017-12-28
From: Daniel Borkmann @ 2017-12-28 20:19 UTC (permalink / raw)
  To: David Miller; +Cc: ast, netdev
In-Reply-To: <20171227.204108.334274235776263929.davem@davemloft.net>

On 12/28/2017 02:41 AM, David Miller wrote:
> From: Daniel Borkmann <daniel@iogearbox.net>
> Date: Thu, 28 Dec 2017 01:18:21 +0100
> 
>> The following pull-request contains BPF updates for your *net-next*
>> tree.
> 
> Pulled.

Thanks!

> Any progress on those tests failing on strict alignment architectures?

Sorry for the delay, was swamped right before Christmas break; I'm back
from vacation 2nd week of Jan, so I'll get right to it then.

Thanks,
Daniel

^ permalink raw reply

* [PATCH net 0/3] bug fixes for ENA Ethernet driver
From: netanel @ 2017-12-28 21:30 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys, gtzalik

From: Netanel Belgazal <netanel@amazon.com>



This patchset contains 3 bug fixes:
* handle rare race condition during MSI-X initialization
* fix error processing in ena_down()
* call netif_carrier_off() only after netdev is registered

Netanel Belgazal (3):
  net: ena: unmask MSI-X only after device initialization is completed
  net: ena: fix error handling in ena_down() sequence
  eet: ena: invoke netif_carrier_off() only after netdev registered

 drivers/net/ethernet/amazon/ena/ena_netdev.c | 49 ++++++++++++++++++----------
 1 file changed, 32 insertions(+), 17 deletions(-)

-- 
2.7.3.AMZN

^ permalink raw reply

* [PATCH net 1/3] net: ena: unmask MSI-X only after device initialization is completed
From: netanel @ 2017-12-28 21:30 UTC (permalink / raw)
  To: davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, matua, saeedb, msw, aliguori,
	nafea, evgenys, gtzalik
In-Reply-To: <1514496620-69953-1-git-send-email-netanel@amazon.com>

From: Netanel Belgazal <netanel@amazon.com>

Under certain conditions MSI-X interrupt might arrive right after it
was unmasked in ena_up(). There is a chance it would be processed by
the driver before device ENA_FLAG_DEV_UP flag is set. In such a case
the interrupt is ignored.
ENA device operates in auto-masked mode, therefore ignoring
interrupt leaves it masked for good.
Moving unmask of interrupt to be the last step in ena_up().

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 97c5a89a9cf7..6fb28fd43eb3 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1565,7 +1565,7 @@ static int ena_rss_configure(struct ena_adapter *adapter)
 
 static int ena_up_complete(struct ena_adapter *adapter)
 {
-	int rc, i;
+	int rc;
 
 	rc = ena_rss_configure(adapter);
 	if (rc)
@@ -1584,17 +1584,6 @@ static int ena_up_complete(struct ena_adapter *adapter)
 
 	ena_napi_enable_all(adapter);
 
-	/* Enable completion queues interrupt */
-	for (i = 0; i < adapter->num_queues; i++)
-		ena_unmask_interrupt(&adapter->tx_ring[i],
-				     &adapter->rx_ring[i]);
-
-	/* schedule napi in case we had pending packets
-	 * from the last time we disable napi
-	 */
-	for (i = 0; i < adapter->num_queues; i++)
-		napi_schedule(&adapter->ena_napi[i].napi);
-
 	return 0;
 }
 
@@ -1731,7 +1720,7 @@ static int ena_create_all_io_rx_queues(struct ena_adapter *adapter)
 
 static int ena_up(struct ena_adapter *adapter)
 {
-	int rc;
+	int rc, i;
 
 	netdev_dbg(adapter->netdev, "%s\n", __func__);
 
@@ -1774,6 +1763,17 @@ static int ena_up(struct ena_adapter *adapter)
 
 	set_bit(ENA_FLAG_DEV_UP, &adapter->flags);
 
+	/* Enable completion queues interrupt */
+	for (i = 0; i < adapter->num_queues; i++)
+		ena_unmask_interrupt(&adapter->tx_ring[i],
+				     &adapter->rx_ring[i]);
+
+	/* schedule napi in case we had pending packets
+	 * from the last time we disable napi
+	 */
+	for (i = 0; i < adapter->num_queues; i++)
+		napi_schedule(&adapter->ena_napi[i].napi);
+
 	return rc;
 
 err_up:
-- 
2.7.3.AMZN

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox