Netdev List
 help / color / mirror / Atom feed
* Start working
From: Julie @ 2019-01-31 13:39 UTC (permalink / raw)
  To: netdev

Want to make white background for your images?

We can add clipping path, or give retouching for your photos if needed.

Let's start testing for your photos.

Thanks,
Julie



















Wesel


Pirna


^ permalink raw reply

* Start working
From: Julie @ 2019-01-31 11:55 UTC (permalink / raw)
  To: netdev

Want to make white background for your images?

We can add clipping path, or give retouching for your photos if needed.

Let's start testing for your photos.

Thanks,
Julie



















Pforzheim


Pirna


^ permalink raw reply

* linux-next: Fixes tag needs some work in the net-next tree
From: Stephen Rothwell @ 2019-01-31 20:35 UTC (permalink / raw)
  To: David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List, Huazhong Tan,
	Peng Li

[-- Attachment #1: Type: text/plain, Size: 356 bytes --]

Hi all,

In commit

  9fc55413270f ("net: hns3: fix improper error handling in the hclge_init_ae_dev()")

Fixes tag

  Fixes: 288475b2ad01 ("{topost} net: hns3: refine umv space allocation")

has these problem(s):

  - Target SHA1 does not exist

I can't even find it by simple searches for the subject ...

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH bpf-next v4 5/7] samples/bpf: Add a "force" flag to XDP samples
From: Jesper Dangaard Brouer @ 2019-01-31 20:19 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Maciej Fijalkowski, daniel, ast, netdev, brouer
In-Reply-To: <20190130182653.0f3bb01e@cakuba.hsd1.ca.comcast.net>

On Wed, 30 Jan 2019 18:26:53 -0800
Jakub Kicinski <jakub.kicinski@netronome.com> wrote:

> On Wed, 30 Jan 2019 20:08:59 +0100, Jesper Dangaard Brouer wrote:
> > > I'll post a v5 with libbpf_strerror() usage when bpf_set_link_xdp_fd failed in
> > > samples but at this point it will only give us a standard "device or resource
> > > busy" error    
> > 
> > That is a good first iteration improvement.  And if QA complains, I can
> > quickly diagnose and explain the issue, based on this generic message,
> > without digging further and wasting more time.
> >   
> > > , so if we need some more meaningful message that libbpf will give
> > > us then I guess we need to define a new libbpf_errno entry (as well as entry in
> > > libbpf_strerror_table for this new errno value) and set the errno in
> > > bpf_set_link_xdp_fd in case of a failure?    
> > 
> > It likely require more work do provide more meaningful messages, and I
> > guess it is out-of-scope for your patchset.  
> 
> Perhaps we could put the error message in extack instead?

That is actually a good idea, for providing these more meaningful
messages.

--Jesper

^ permalink raw reply

* Re: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames
From: Jesper Dangaard Brouer @ 2019-01-31 20:15 UTC (permalink / raw)
  To: David Miller
  Cc: mst, makita.toshiaki, jasowang, netdev, virtualization, dsahern,
	hawk, Toke Høiland-Jørgensen
In-Reply-To: <20190131.094523.2248120325911339180.davem@davemloft.net>

On Thu, 31 Jan 2019 09:45:23 -0800 (PST)
David Miller <davem@davemloft.net> wrote:

> From: "Michael S. Tsirkin" <mst@redhat.com>
> Date: Thu, 31 Jan 2019 10:25:17 -0500
> 
> > On Thu, Jan 31, 2019 at 08:40:30PM +0900, Toshiaki Makita wrote:  
> >> Previously virtnet_xdp_xmit() did not account for device tx counters,
> >> which caused confusions.
> >> To be consistent with SKBs, account them on freeing xdp_frames.
> >> 
> >> Reported-by: David Ahern <dsahern@gmail.com>
> >> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>  
> > 
> > Well we count them on receive so I guess it makes sense for consistency
> > 
> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > however, I really wonder whether adding more and more standard net stack
> > things like this will end up costing most of XDP its speed.
> > 
> > Should we instead make sure *not* to account XDP packets
> > in any counters at all? XDP programs can use maps
> > to do their own counting...  
> 
> This has been definitely a discussion point, and something we should
> develop a clear, strong, policy on.
> 
> David, Jesper, care to chime in where we ended up in that last thread
> discussion this?

IHMO packets RX and TX on a device need to be accounted, in standard
counters, regardless of XDP.  For XDP RX the packet is counted as RX,
regardless if XDP choose to XDP_DROP.  On XDP TX which is via
XDP_REDIRECT or XDP_TX, the driver that transmit the packet need to
account the packet in a TX counter (this if often delayed to DMA TX
completion handling).  We cannot break the expectation that RX and TX
counter are visible to userspace stats tools. XDP should not make these
packets invisible.

Performance wise, I don't see an issue. As updating these counters
(packets and bytes) can be done as a bulk, either when driver NAPI RX
func ends, or in TX DMA completion, like most drivers already do today.
Further more, most drivers save this in per RX ring data-area, which
are only summed when userspace read these.


A separate question (and project) raised by David Ahern, was if we
should have more detailed stats on the different XDP action return
codes, as an easy means for sysadms to diagnose running XDP programs.
That is something that require more discussions, as it can impact
performance, and likely need to be opt-in.  My opinion is yes we should
do this for the sake of better User eXperience, BUT *only* if we can find
a technical solution that does not hurt performance.

--Jesper

^ permalink raw reply

* Re: [PATCH v2 bpf-next] bpf: add optional memory accounting for maps
From: Jakub Kicinski @ 2019-01-31 20:04 UTC (permalink / raw)
  To: Martynas Pumputis; +Cc: netdev, ys114321, ast, daniel, Yonghong Song
In-Reply-To: <20190131093801.32220-1-m@lambda.lt>

On Thu, 31 Jan 2019 10:38:01 +0100, Martynas Pumputis wrote:
> Previously, memory allocated for a map was not accounted. Therefore,
> this memory could not be taken into consideration by the cgroups
> memory controller.
> 
> This patch introduces the "BPF_F_ACCOUNT_MEM" flag which enables
> the memory accounting for a map, and it can be set during
> the map creation ("BPF_MAP_CREATE") in "map_flags".

What should happen for no-prealloc maps?  Would it make some sense to
charge the max map size to the user and not each allocation?  Or
perhaps remember the owner to be able to charge the data path
allocations which don't happen in process context as well?

^ permalink raw reply

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
From: Eric Dumazet @ 2019-01-31 19:42 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Eran Ben Elisha, davem@davemloft.net, netdev@vger.kernel.org,
	Tariq Toukan
In-Reply-To: <7520daf2c35b485660474c6e76a9ad4e24dc817e.camel@mellanox.com>

On Thu, Jan 31, 2019 at 11:27 AM Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> Are you sure ? you are claiming that the hardware will skip csum
> complete i.e cqe->checksum will be 0xffff for padded short IP frames.
> i don't think this is the case, the whole bug is that the hw does
> provide a partial cqe->checksum (i.e doesn't included the padding
> bytes) even for short eth frames.

If the padding is not included, then cqe->checksum is 0xFFFF for
correctly received frames.

Otherwise, what would be cqe->checksum in this case ? A random value ?

^ permalink raw reply

* Re: [PATCH bpf-next v3 1/3] libbpf: move pr_*() functions to common header file
From: Y Song @ 2019-01-31 19:35 UTC (permalink / raw)
  To: Magnus Karlsson
  Cc: Björn Töpel, Alexei Starovoitov, Daniel Borkmann,
	netdev, Jakub Kicinski, Björn Töpel, qi.z.zhang,
	Jesper Dangaard Brouer
In-Reply-To: <1548774737-16579-2-git-send-email-magnus.karlsson@intel.com>

On Tue, Jan 29, 2019 at 7:13 AM Magnus Karlsson
<magnus.karlsson@intel.com> wrote:
>
> Move the pr_*() functions in libbpf.c to a common header file called
> libbpf_internal.h. This so that the later libbpf AF_XDP helper library
> code in xsk.c can use these printing functions too.
>
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> ---
>  tools/lib/bpf/libbpf.c          | 30 +-----------------------------
>  tools/lib/bpf/libbpf_internal.h | 41 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 42 insertions(+), 29 deletions(-)
>  create mode 100644 tools/lib/bpf/libbpf_internal.h
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 2ccde17..1d7fe26 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -39,6 +39,7 @@
>  #include <gelf.h>
>
>  #include "libbpf.h"
> +#include "libbpf_internal.h"
>  #include "bpf.h"
>  #include "btf.h"
>  #include "str_error.h"
> @@ -51,34 +52,6 @@
>  #define BPF_FS_MAGIC           0xcafe4a11
>  #endif
>
> -#define __printf(a, b) __attribute__((format(printf, a, b)))
> -
> -__printf(1, 2)
> -static int __base_pr(const char *format, ...)
> -{
> -       va_list args;
> -       int err;
> -
> -       va_start(args, format);
> -       err = vfprintf(stderr, format, args);
> -       va_end(args);
> -       return err;
> -}
> -
> -static __printf(1, 2) libbpf_print_fn_t __pr_warning = __base_pr;
> -static __printf(1, 2) libbpf_print_fn_t __pr_info = __base_pr;
> -static __printf(1, 2) libbpf_print_fn_t __pr_debug;
> -
> -#define __pr(func, fmt, ...)   \
> -do {                           \
> -       if ((func))             \
> -               (func)("libbpf: " fmt, ##__VA_ARGS__); \
> -} while (0)
> -
> -#define pr_warning(fmt, ...)   __pr(__pr_warning, fmt, ##__VA_ARGS__)
> -#define pr_info(fmt, ...)      __pr(__pr_info, fmt, ##__VA_ARGS__)
> -#define pr_debug(fmt, ...)     __pr(__pr_debug, fmt, ##__VA_ARGS__)
> -
>  void libbpf_set_print(libbpf_print_fn_t warn,
>                       libbpf_print_fn_t info,
>                       libbpf_print_fn_t debug)
> @@ -96,7 +69,6 @@ void libbpf_set_print(libbpf_print_fn_t warn,
>                 goto out;               \
>  } while(0)
>
> -
>  /* Copied from tools/perf/util/util.h */
>  #ifndef zfree
>  # define zfree(ptr) ({ free(*ptr); *ptr = NULL; })
> diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
> new file mode 100644
> index 0000000..951c078
> --- /dev/null
> +++ b/tools/lib/bpf/libbpf_internal.h
> @@ -0,0 +1,41 @@
> +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
> +
> +/*
> + * Common internal libbpf functions and definitions.
> + *
> + * Copyright (C) 2013-2015 Alexei Starovoitov <ast@kernel.org>
> + * Copyright (C) 2015 Wang Nan <wangnan0@huawei.com>
> + * Copyright (C) 2015 Huawei Inc.
> + */
> +#ifndef __LIBBPF_LIBBPF_INTERNAL_H
> +#define __LIBBPF_LIBBPF_INTERNAL_H
> +
> +#define __printf(a, b) __attribute__((format(printf, a, b)))
> +
> +__printf(1, 2)
> +static int __base_pr(const char *format, ...)
> +{
> +       va_list args;
> +       int err;
> +
> +       va_start(args, format);
> +       err = vfprintf(stderr, format, args);
> +       va_end(args);
> +       return err;
> +}

Hmm. This will introduce this static function into every .c file.

Maybe we could define a global function in libbpf.c like
  libbpf_debug_pr(level, ...)
This function will be internal to the library and implemented in libbpf.c.
All files can call this function to print out the debug information
based on level?

> +
> +static __maybe_unused __printf(1, 2) libbpf_print_fn_t __pr_warning = __base_pr;
> +static __maybe_unused __printf(1, 2) libbpf_print_fn_t __pr_info = __base_pr;
> +static __maybe_unused __printf(1, 2) libbpf_print_fn_t __pr_debug;
> +
> +#define __pr(func, fmt, ...)   \
> +do {                           \
> +       if ((func))             \
> +               (func)("libbpf: " fmt, ##__VA_ARGS__); \
> +} while (0)
> +
> +#define pr_warning(fmt, ...)   __pr(__pr_warning, fmt, ##__VA_ARGS__)
> +#define pr_info(fmt, ...)      __pr(__pr_info, fmt, ##__VA_ARGS__)
> +#define pr_debug(fmt, ...)     __pr(__pr_debug, fmt, ##__VA_ARGS__)
> +
> +#endif /* __LIBBPF_LIBBPF_INTERNAL_H */
> --
> 2.7.4
>

^ permalink raw reply

* [PATCH net] bnxt_en: Disable interrupts when allocating CP rings or NQs.
From: Michael Chan @ 2019-01-31 19:31 UTC (permalink / raw)
  To: davem; +Cc: netdev

When calling firmware to allocate a CP ring or NQ, an interrupt associated
with that ring may be generated immediately before the doorbell is even
setup after the firmware call returns.  When servicing the interrupt, the
driver may crash when trying to access the doorbell.

Fix it by disabling interrupt on that vector until the doorbell is
set up.

Fixes: 697197e5a173 ("bnxt_en: Re-structure doorbells.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---

Please queue this for 4.20 stable as well.  Thanks.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 6a51287..8bc7e49 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4973,12 +4973,18 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
 		struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
 		struct bnxt_ring_struct *ring = &cpr->cp_ring_struct;
 		u32 map_idx = ring->map_idx;
+		unsigned int vector;
 
+		vector = bp->irq_tbl[map_idx].vector;
+		disable_irq_nosync(vector);
 		rc = hwrm_ring_alloc_send_msg(bp, ring, type, map_idx);
-		if (rc)
+		if (rc) {
+			enable_irq(vector);
 			goto err_out;
+		}
 		bnxt_set_db(bp, &cpr->cp_db, type, map_idx, ring->fw_ring_id);
 		bnxt_db_nq(bp, &cpr->cp_db, cpr->cp_raw_cons);
+		enable_irq(vector);
 		bp->grp_info[i].cp_fw_ring_id = ring->fw_ring_id;
 
 		if (!i) {
-- 
2.5.1


^ permalink raw reply related

* Re: [RFC 00/14] netlink/hierarchical stats
From: Jakub Kicinski @ 2019-01-31 19:30 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: David Miller, oss-drivers, netdev, Jiří Pírko,
	Florian Fainelli, Andrew Lunn, Michal Kubecek, David Ahern,
	Simon Horman, Brandeburg, Jesse, maciejromanfijalkowski,
	vasundhara-v.volam, Michael Chan, shalomt, Ido Schimmel
In-Reply-To: <CAJieiUhJMF3QQ75xU-qGPYRiVNjJ09sfRqFx9fVKF3yKUQUAYA@mail.gmail.com>

On Thu, 31 Jan 2019 08:31:51 -0800, Roopa Prabhu wrote:
> On Thu, Jan 31, 2019 at 8:16 AM Roopa Prabhu wrote:
> > On Wed, Jan 30, 2019 at 4:24 PM Jakub Kicinski wrote:  
> > > On Wed, 30 Jan 2019 14:14:34 -0800, Roopa Prabhu wrote:  
> > >
> > > My thinking was that we should leave truly custom/strange stats to
> > > ethtool API which works quite well for that and at the same time be
> > > very accepting of people adding new IDs to HSTAT (only requirement is
> > > basically defining the meaning very clearly).  
> >
> > that sounds reasonable. But the 'defining meaning clearly' gets tricky
> > sometimes.
> > The vendor who gets their ID or meaning first wins :) and the rest
> > will have to live with
> > ethtool and explain to rest of the world that ethtool is more reliable
> > for their hardware :)

Right, that's the trade off inherent to standardization.  I don't see
any way to work around the fact that the definition may not fit all.

What I want as a end user and what I want for my customers is the
ability to switch the NIC on their system and not spend two months
"integrating" into their automation :(  If the definition of statistics
is not solid we're back to square one.

> > I am also concerned that this getting the ID into common HSTAT ID
> > space will  slow down the process of adding new counters
> > for vendors. Which will lead to vendors sticking with ethtool API. 

I feel like whatever we did here will end up looking much like the
ethtool interface, which is why I decided to leave that part out.
Ethtool -S works pretty well for custom stats.  Standard and structured
stats don't fit with it in any way, the two seem best left separate.

> > It would be great if people can get all stats in one place and not
> > rely on another API for 'more'.

One place in the driver or for the user?  I'm happy to add the code to
ethtool to also dump hstats and render them in a standard way.  In fact
the tool I have for testing has a "simplified" output format which
looks exactly like ethtool -S.

One place for the driver to report is hard, as I said I think the
custom stats are best left with ethtool.  Adding an extra incentive to
standardize.

> > > For the first stab I looked at two drivers and added all the stats that
> > > were common.
> > >
> > > Given this set is identifying statistics by ID - how would we make that
> > > extensible to drivers?  Would we go back to strings or have some
> > > "driver specific" ID space?  
> >
> > I was looking for ideas from you really, to see if you had considered
> > this. agree per driver ID space seems ugly.
> > ethtool strings are great today...if we can control the duplication.
> > But thinking some more..., i did see some
> > patches recently for vendor specific parameter (with ID) space in
> > devlink. maybe something like that will be
> > reasonable ?

I thought about this for a year and I basically came to the conclusion
I can't find any perfect solution, if there is one.

The devlink parameters are useful, but as anticipated they became the
laziest excuse of an ABI... Don't get me started ;)

> > > Is there any particular type of statistic you'd expect drivers to want
> > > to add?  For NICs I think IEEE/RMON should pretty much cover the
> > > silicon ones, but I don't know much about switches :)  
> >
> > I will have to go through the list. But switch asics do support
> > flexible stats/counters that can be attached at various points.
> > And new chip versions come with more support. Having that flexibility
> > to expose/extend such stats incrementally is very valuable on a per
> > hardware/vendor basis.  

Yes, I'm not too familiar with those counters.  Do they need to be
enabled to start counting?  Do they have performance impact?  Can the
"sample" events perf-style?  How is the condition on which they trigger
defined?  Is it maybe just "match a packet and increment a counter"?
Would such counters benefit from hierarchical structure?

I was trying to cover the long standing use cases - namely the
IEEE/RMON stats which all MAC have had for years and per queue stats
which all drivers have had for years.  But if we can cater to more
cases I'm open.

> Just want to clarify that I am suggesting a nested HSTATS extension
> infra for drivers (just like ethtool).
> 'Common stats' stays at the top-level.

I got a concept of groups here.  The dump generally looks like this:

[root group A (say MAC stats)]
  [sub group RX]
  [sub group TX]
[root group B (say PCIe stats)]
  [sub group RX]
  [sub group TX]
[root group C (say per-q driver stats]
  [sub group RX]
    [q1 group]
    [q2 group]
    [q3 group]
  [sub group TX]
    [q1 group]
    [q2 group]
    [q3 group]

Each root group representing a "point in the pipeline".

So it's not too hard to add a root group with whatever, the questions
are move how would it benefit over existing ethtool if the stats are
custom anyway?  Hm..

^ permalink raw reply

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
From: Saeed Mahameed @ 2019-01-31 19:27 UTC (permalink / raw)
  To: edumazet@google.com
  Cc: Eran Ben Elisha, davem@davemloft.net, netdev@vger.kernel.org,
	Tariq Toukan
In-Reply-To: <CANn89iKPmQ0=zWt69KHWi3+64GdX4S9JBXuQMG+6VR1TwRH_FA@mail.gmail.com>

On Thu, 2019-01-31 at 11:07 -0800, Eric Dumazet wrote:
> On Thu, Jan 31, 2019 at 11:04 AM Saeed Mahameed <saeedm@mellanox.com>
> wrote:
> > On Thu, 2019-01-31 at 07:02 -0800, Eric Dumazet wrote:
> > > On Thu, Jan 31, 2019 at 5:04 AM Tariq Toukan <tariqt@mellanox.com
> > > >
> > > wrote:
> > > > From: Saeed Mahameed <saeedm@mellanox.com>
> > > > 
> > > > When an ethernet frame is padded to meet the minimum ethernet
> > > > frame
> > > > size, the padding octets are not covered by the hardware
> > > > checksum.
> > > > Fortunately the padding octets are usually zero's, which don't
> > > > affect
> > > > checksum. However, it is not guaranteed. For example, switches
> > > > might
> > > > choose to make other use of these octets.
> > > > This repeatedly causes kernel hardware checksum fault.
> > > > 
> > > > Prior to the cited commit below, skb checksum was forced to be
> > > > CHECKSUM_NONE when padding is detected. After it, we need to
> > > > keep
> > > > skb->csum updated. However, fixing up CHECKSUM_COMPLETE
> > > > requires to
> > > > verify and parse IP headers, it does not worth the effort as
> > > > the
> > > > packets
> > > > are so small that CHECKSUM_COMPLETE has no significant
> > > > advantage.
> > > > 
> > > > Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and
> > > > CHECKSUM_COMPLETE
> > > > are friends")
> > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> > > > Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> > > > ---
> > > >  drivers/net/ethernet/mellanox/mlx4/en_rx.c | 19
> > > > +++++++++++++++++-
> > > > -
> > > >  1 file changed, 17 insertions(+), 2 deletions(-)
> > > > 
> > > > Hi Dave,
> > > > Please queue for -stable >= v4.18.
> > > > 
> > > > Thanks,
> > > > Tariq
> > > > 
> > > > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > > b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > > index 9a0881cb7f51..fc8a11444e1a 100644
> > > > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > > @@ -617,6 +617,8 @@ static int get_fixed_ipv6_csum(__wsum
> > > > hw_checksum, struct sk_buff *skb,
> > > >  }
> > > >  #endif
> > > > 
> > > > +#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
> > > > +
> > > >  /* We reach this function only after checking that any of
> > > >   * the (IPv4 | IPv6) bits are set in cqe->status.
> > > >   */
> > > > @@ -624,9 +626,22 @@ static int check_csum(struct mlx4_cqe
> > > > *cqe,
> > > > struct sk_buff *skb, void *va,
> > > >                       netdev_features_t dev_features)
> > > >  {
> > > >         __wsum hw_checksum = 0;
> > > > +       void *hdr;
> > > > +
> > > > +       /* CQE csum doesn't cover padding octets in short
> > > > ethernet
> > > > +        * frames. And the pad field is appended prior to
> > > > calculating
> > > > +        * and appending the FCS field.
> > > > +        *
> > > > +        * Detecting these padded frames requires to verify and
> > > > parse
> > > > +        * IP headers, so we simply force all those small
> > > > frames to
> > > > be
> > > > +        * CHECKSUM_NONE even if they are not padded.
> > > > +        * TODO: better if we report CHECKSUM_UNNECESSARY but
> > > > this
> > > > +        * demands some refactroing.
> > > 
> > > This TODO: looks bogus to me.
> > > 
> > > mlx4 driver first tries to use CHECKSUM_UNNECESSARY.
> > > 
> > > if ((cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP |
> > >                                                   MLX4_CQE_STATUS
> > > _UDP
> > > )) &&
> > >      (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPOK)) &&
> > >       cqe->checksum == cpu_to_be16(0xffff)) {
> > >      ...
> > >       ip_summed = CHECKSUM_UNNECESSARY;
> > >       ...
> > > }
> > > 
> > > Then if this attempt failed, it tries to use CHECKSUM_COMPLETE
> > > (and
> > > calls this check_sum() function)
> > > 
> > > So there is no refactoring to be done for mlx4 : short
> > > IPv[46]+{UDP|TCP} frames get CHECKSUM_UNNECESSARY already
> > 
> > not exactly, considering mlx4 weird condition to decide to do
> > CHECKSUM_UNNECESSARY:
> > 
> > if ((cqe csum fields are valid) && cqe->checksum ==
> > cpu_to_be16(0xffff))
> >      { do CHECKSUM_UNNECESSARY }
> > else { do CHECKSUM_COMPLETE }
> > 
> > CHECKSUM_UNNECESSARY will be skipped if csum complete field is
> > valid
> > i.e (cqe->checksum != 0xffff)
> > 
> > but if checksum complete is to be skipped due to short_frame we
> > want to
> > go back to CHECKSUM_UNNECESSARY, this is the refactoring i am
> > talking
> > about :).
> > 
> > 
> > I hope now it is clear..
> 
> Not really, since in case of short frame, cqe->checksum will be
> 0xffff, since mlx4 does not include the padding bytes in the checksum
> in the first place.
> 

Are you sure ? you are claiming that the hardware will skip csum
complete i.e cqe->checksum will be 0xffff for padded short IP frames.
i don't think this is the case, the whole bug is that the hw does
provide a partial cqe->checksum (i.e doesn't included the padding
bytes) even for short eth frames.


> (For native IPv4/IPV6 TCP/UDP frames that is)
> 
> > 
> > > > +        */
> > > > +       if (short_frame(skb->len))
> > > > +               return -EINVAL;
> > > > 
> > > > -       void *hdr = (u8 *)va + sizeof(struct ethhdr);
> > > > -
> > > > +       hdr = (u8 *)va + sizeof(struct ethhdr);
> > > >         hw_checksum = csum_unfold((__force __sum16)cqe-
> > > > >checksum);
> > > > 
> > > >         if (cqe->vlan_my_qpn &
> > > > cpu_to_be32(MLX4_CQE_CVLAN_PRESENT_MASK) &&
> > > > --
> > > > 1.8.3.1
> > > > 

^ permalink raw reply

* Re: [PATCH bpf-next v3 1/3] libbpf: move pr_*() functions to common header file
From: Yonghong Song @ 2019-01-31 19:12 UTC (permalink / raw)
  To: Alexei Starovoitov, Magnus Karlsson
  Cc: bjorn.topel@intel.com, ast@kernel.org, daniel@iogearbox.net,
	netdev@vger.kernel.org, jakub.kicinski@netronome.com,
	bjorn.topel@gmail.com, qi.z.zhang@intel.com, brouer@redhat.com,
	acme@kernel.org
In-Reply-To: <20190131185206.6rvqxlvutpedwnqc@ast-mbp.dhcp.thefacebook.com>



On 1/31/19 10:52 AM, Alexei Starovoitov wrote:
> On Tue, Jan 29, 2019 at 04:12:15PM +0100, Magnus Karlsson wrote:
>> Move the pr_*() functions in libbpf.c to a common header file called
>> libbpf_internal.h. This so that the later libbpf AF_XDP helper library
>> code in xsk.c can use these printing functions too.
>>
>> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
>> ---
>>   tools/lib/bpf/libbpf.c          | 30 +-----------------------------
>>   tools/lib/bpf/libbpf_internal.h | 41 +++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 42 insertions(+), 29 deletions(-)
>>   create mode 100644 tools/lib/bpf/libbpf_internal.h
>>
>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>> index 2ccde17..1d7fe26 100644
>> --- a/tools/lib/bpf/libbpf.c
>> +++ b/tools/lib/bpf/libbpf.c
>> @@ -39,6 +39,7 @@
>>   #include <gelf.h>
>>   
>>   #include "libbpf.h"
>> +#include "libbpf_internal.h"
>>   #include "bpf.h"
>>   #include "btf.h"
>>   #include "str_error.h"
>> @@ -51,34 +52,6 @@
>>   #define BPF_FS_MAGIC		0xcafe4a11
>>   #endif
>>   
>> -#define __printf(a, b)	__attribute__((format(printf, a, b)))
>> -
>> -__printf(1, 2)
>> -static int __base_pr(const char *format, ...)
>> -{
>> -	va_list args;
>> -	int err;
>> -
>> -	va_start(args, format);
>> -	err = vfprintf(stderr, format, args);
>> -	va_end(args);
>> -	return err;
>> -}
>> -
>> -static __printf(1, 2) libbpf_print_fn_t __pr_warning = __base_pr;
>> -static __printf(1, 2) libbpf_print_fn_t __pr_info = __base_pr;
>> -static __printf(1, 2) libbpf_print_fn_t __pr_debug;
>> -
>> -#define __pr(func, fmt, ...)	\
>> -do {				\
>> -	if ((func))		\
>> -		(func)("libbpf: " fmt, ##__VA_ARGS__); \
>> -} while (0)
>> -
>> -#define pr_warning(fmt, ...)	__pr(__pr_warning, fmt, ##__VA_ARGS__)
>> -#define pr_info(fmt, ...)	__pr(__pr_info, fmt, ##__VA_ARGS__)
>> -#define pr_debug(fmt, ...)	__pr(__pr_debug, fmt, ##__VA_ARGS__)
> 
> since these funcs are about to be used more widely
> let's clean this api while we still can.
> How about we convert it to single pr_log callback function
> with verbosity flag instead of three callbacks ?

Another possible change related to the API function
   libbpf_set_print

Currently the function takes three parameters,

LIBBPF_API void libbpf_set_print(libbpf_print_fn_t warn,
                                  libbpf_print_fn_t info,
                                  libbpf_print_fn_t debug);

So it currently supports three level of debugging output.
Is it possible in the future more debugging output level
may be supported? if this is the case, maybe we could
change the API function libbpf_set_print to something like
the below before the library version bumps into 1.0.0?

  LIBBPF_API void libbpf_set_print(libbpf_print_fn_t dprint);
and
   typedef int (*libbpf_print_fn_t)(enum libbpf_debug_level level,
                                    const char *, ...)
         __attribute__((format(printf, 1, 2)));
   enum libbpf_debug_level {
     LIBBPF_WARN,
     LIBBPF_INFO,
     LIBBPF_DEBUG,
   };

Basically, the user provided callback function must have
the first parameters as the level.

Any comments? Arnaldo?


^ permalink raw reply

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
From: Eric Dumazet @ 2019-01-31 19:07 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Tariq Toukan, Eran Ben Elisha, davem@davemloft.net,
	netdev@vger.kernel.org
In-Reply-To: <b18f88b222920fc980d4085682f4ac6a2d19e832.camel@mellanox.com>

On Thu, Jan 31, 2019 at 11:04 AM Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> On Thu, 2019-01-31 at 07:02 -0800, Eric Dumazet wrote:
> > On Thu, Jan 31, 2019 at 5:04 AM Tariq Toukan <tariqt@mellanox.com>
> > wrote:
> > > From: Saeed Mahameed <saeedm@mellanox.com>
> > >
> > > When an ethernet frame is padded to meet the minimum ethernet frame
> > > size, the padding octets are not covered by the hardware checksum.
> > > Fortunately the padding octets are usually zero's, which don't
> > > affect
> > > checksum. However, it is not guaranteed. For example, switches
> > > might
> > > choose to make other use of these octets.
> > > This repeatedly causes kernel hardware checksum fault.
> > >
> > > Prior to the cited commit below, skb checksum was forced to be
> > > CHECKSUM_NONE when padding is detected. After it, we need to keep
> > > skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to
> > > verify and parse IP headers, it does not worth the effort as the
> > > packets
> > > are so small that CHECKSUM_COMPLETE has no significant advantage.
> > >
> > > Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
> > > are friends")
> > > Cc: Eric Dumazet <edumazet@google.com>
> > > Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> > > Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> > > ---
> > >  drivers/net/ethernet/mellanox/mlx4/en_rx.c | 19 +++++++++++++++++-
> > > -
> > >  1 file changed, 17 insertions(+), 2 deletions(-)
> > >
> > > Hi Dave,
> > > Please queue for -stable >= v4.18.
> > >
> > > Thanks,
> > > Tariq
> > >
> > > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > index 9a0881cb7f51..fc8a11444e1a 100644
> > > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > @@ -617,6 +617,8 @@ static int get_fixed_ipv6_csum(__wsum
> > > hw_checksum, struct sk_buff *skb,
> > >  }
> > >  #endif
> > >
> > > +#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
> > > +
> > >  /* We reach this function only after checking that any of
> > >   * the (IPv4 | IPv6) bits are set in cqe->status.
> > >   */
> > > @@ -624,9 +626,22 @@ static int check_csum(struct mlx4_cqe *cqe,
> > > struct sk_buff *skb, void *va,
> > >                       netdev_features_t dev_features)
> > >  {
> > >         __wsum hw_checksum = 0;
> > > +       void *hdr;
> > > +
> > > +       /* CQE csum doesn't cover padding octets in short ethernet
> > > +        * frames. And the pad field is appended prior to
> > > calculating
> > > +        * and appending the FCS field.
> > > +        *
> > > +        * Detecting these padded frames requires to verify and
> > > parse
> > > +        * IP headers, so we simply force all those small frames to
> > > be
> > > +        * CHECKSUM_NONE even if they are not padded.
> > > +        * TODO: better if we report CHECKSUM_UNNECESSARY but this
> > > +        * demands some refactroing.
> >
> > This TODO: looks bogus to me.
> >
> > mlx4 driver first tries to use CHECKSUM_UNNECESSARY.
> >
> > if ((cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP |
> >                                                   MLX4_CQE_STATUS_UDP
> > )) &&
> >      (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPOK)) &&
> >       cqe->checksum == cpu_to_be16(0xffff)) {
> >      ...
> >       ip_summed = CHECKSUM_UNNECESSARY;
> >       ...
> > }
> >
> > Then if this attempt failed, it tries to use CHECKSUM_COMPLETE (and
> > calls this check_sum() function)
> >
> > So there is no refactoring to be done for mlx4 : short
> > IPv[46]+{UDP|TCP} frames get CHECKSUM_UNNECESSARY already
>
> not exactly, considering mlx4 weird condition to decide to do
> CHECKSUM_UNNECESSARY:
>
> if ((cqe csum fields are valid) && cqe->checksum ==
> cpu_to_be16(0xffff))
>      { do CHECKSUM_UNNECESSARY }
> else { do CHECKSUM_COMPLETE }
>
> CHECKSUM_UNNECESSARY will be skipped if csum complete field is valid
> i.e (cqe->checksum != 0xffff)
>
> but if checksum complete is to be skipped due to short_frame we want to
> go back to CHECKSUM_UNNECESSARY, this is the refactoring i am talking
> about :).
>
>
> I hope now it is clear..

Not really, since in case of short frame, cqe->checksum will be
0xffff, since mlx4 does not include the padding bytes in the checksum
in the first place.

(For native IPv4/IPV6 TCP/UDP frames that is)

>
>
> >
> > > +        */
> > > +       if (short_frame(skb->len))
> > > +               return -EINVAL;
> > >
> > > -       void *hdr = (u8 *)va + sizeof(struct ethhdr);
> > > -
> > > +       hdr = (u8 *)va + sizeof(struct ethhdr);
> > >         hw_checksum = csum_unfold((__force __sum16)cqe->checksum);
> > >
> > >         if (cqe->vlan_my_qpn &
> > > cpu_to_be32(MLX4_CQE_CVLAN_PRESENT_MASK) &&
> > > --
> > > 1.8.3.1
> > >

^ permalink raw reply

* Re: [PATCH net-next v2 7/7] ethtool: add compat for devlink info
From: kbuild test robot @ 2019-01-31 19:05 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: kbuild-all, davem, netdev, oss-drivers, jiri, andrew, f.fainelli,
	mkubecek, eugenem, jonathan.lemon, Jakub Kicinski
In-Reply-To: <20190130190513.25718-8-jakub.kicinski@netronome.com>

[-- Attachment #1: Type: text/plain, Size: 2583 bytes --]

Hi Jakub,

I love your patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Jakub-Kicinski/devlink-add-device-driver-information-API/20190131-222221
config: i386-randconfig-n0-01300126 (attached as .config)
compiler: gcc-8 (Debian 8.2.0-15) 8.2.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   ld: net/core/ethtool.o: in function `ethtool_get_drvinfo':
>> net/core/ethtool.c:809: undefined reference to `devlink_compat_running_versions'

vim +809 net/core/ethtool.c

   760	
   761	static noinline_for_stack int ethtool_get_drvinfo(struct net_device *dev,
   762							  void __user *useraddr)
   763	{
   764		struct ethtool_drvinfo info;
   765		const struct ethtool_ops *ops = dev->ethtool_ops;
   766	
   767		memset(&info, 0, sizeof(info));
   768		info.cmd = ETHTOOL_GDRVINFO;
   769		if (ops->get_drvinfo) {
   770			ops->get_drvinfo(dev, &info);
   771		} else if (dev->dev.parent && dev->dev.parent->driver) {
   772			strlcpy(info.bus_info, dev_name(dev->dev.parent),
   773				sizeof(info.bus_info));
   774			strlcpy(info.driver, dev->dev.parent->driver->name,
   775				sizeof(info.driver));
   776		} else {
   777			return -EOPNOTSUPP;
   778		}
   779	
   780		/*
   781		 * this method of obtaining string set info is deprecated;
   782		 * Use ETHTOOL_GSSET_INFO instead.
   783		 */
   784		if (ops->get_sset_count) {
   785			int rc;
   786	
   787			rc = ops->get_sset_count(dev, ETH_SS_TEST);
   788			if (rc >= 0)
   789				info.testinfo_len = rc;
   790			rc = ops->get_sset_count(dev, ETH_SS_STATS);
   791			if (rc >= 0)
   792				info.n_stats = rc;
   793			rc = ops->get_sset_count(dev, ETH_SS_PRIV_FLAGS);
   794			if (rc >= 0)
   795				info.n_priv_flags = rc;
   796		}
   797		if (ops->get_regs_len) {
   798			int ret = ops->get_regs_len(dev);
   799	
   800			if (ret > 0)
   801				info.regdump_len = ret;
   802		}
   803	
   804		if (ops->get_eeprom_len)
   805			info.eedump_len = ops->get_eeprom_len(dev);
   806	
   807		rtnl_unlock();
   808		if (!info.fw_version[0])
 > 809			devlink_compat_running_versions(dev, info.fw_version,
   810							ARRAY_SIZE(info.fw_version));
   811		rtnl_lock();
   812	
   813		if (copy_to_user(useraddr, &info, sizeof(info)))
   814			return -EFAULT;
   815		return 0;
   816	}
   817	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26890 bytes --]

^ permalink raw reply

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
From: Saeed Mahameed @ 2019-01-31 19:04 UTC (permalink / raw)
  To: Tariq Toukan, edumazet@google.com
  Cc: Eran Ben Elisha, davem@davemloft.net, netdev@vger.kernel.org
In-Reply-To: <CANn89iJs+auP-p4PFhO3uyMjwEgxJh7N3SqhtgaZJQJobOL4Bw@mail.gmail.com>

On Thu, 2019-01-31 at 07:02 -0800, Eric Dumazet wrote:
> On Thu, Jan 31, 2019 at 5:04 AM Tariq Toukan <tariqt@mellanox.com>
> wrote:
> > From: Saeed Mahameed <saeedm@mellanox.com>
> > 
> > When an ethernet frame is padded to meet the minimum ethernet frame
> > size, the padding octets are not covered by the hardware checksum.
> > Fortunately the padding octets are usually zero's, which don't
> > affect
> > checksum. However, it is not guaranteed. For example, switches
> > might
> > choose to make other use of these octets.
> > This repeatedly causes kernel hardware checksum fault.
> > 
> > Prior to the cited commit below, skb checksum was forced to be
> > CHECKSUM_NONE when padding is detected. After it, we need to keep
> > skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to
> > verify and parse IP headers, it does not worth the effort as the
> > packets
> > are so small that CHECKSUM_COMPLETE has no significant advantage.
> > 
> > Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
> > are friends")
> > Cc: Eric Dumazet <edumazet@google.com>
> > Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> > Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> > ---
> >  drivers/net/ethernet/mellanox/mlx4/en_rx.c | 19 +++++++++++++++++-
> > -
> >  1 file changed, 17 insertions(+), 2 deletions(-)
> > 
> > Hi Dave,
> > Please queue for -stable >= v4.18.
> > 
> > Thanks,
> > Tariq
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > index 9a0881cb7f51..fc8a11444e1a 100644
> > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > @@ -617,6 +617,8 @@ static int get_fixed_ipv6_csum(__wsum
> > hw_checksum, struct sk_buff *skb,
> >  }
> >  #endif
> > 
> > +#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
> > +
> >  /* We reach this function only after checking that any of
> >   * the (IPv4 | IPv6) bits are set in cqe->status.
> >   */
> > @@ -624,9 +626,22 @@ static int check_csum(struct mlx4_cqe *cqe,
> > struct sk_buff *skb, void *va,
> >                       netdev_features_t dev_features)
> >  {
> >         __wsum hw_checksum = 0;
> > +       void *hdr;
> > +
> > +       /* CQE csum doesn't cover padding octets in short ethernet
> > +        * frames. And the pad field is appended prior to
> > calculating
> > +        * and appending the FCS field.
> > +        *
> > +        * Detecting these padded frames requires to verify and
> > parse
> > +        * IP headers, so we simply force all those small frames to
> > be
> > +        * CHECKSUM_NONE even if they are not padded.
> > +        * TODO: better if we report CHECKSUM_UNNECESSARY but this
> > +        * demands some refactroing.
> 
> This TODO: looks bogus to me.
> 
> mlx4 driver first tries to use CHECKSUM_UNNECESSARY.
> 
> if ((cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP |
>                                                   MLX4_CQE_STATUS_UDP
> )) &&
>      (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPOK)) &&
>       cqe->checksum == cpu_to_be16(0xffff)) {
>      ...
>       ip_summed = CHECKSUM_UNNECESSARY;
>       ...
> }
> 
> Then if this attempt failed, it tries to use CHECKSUM_COMPLETE (and
> calls this check_sum() function)
> 
> So there is no refactoring to be done for mlx4 : short
> IPv[46]+{UDP|TCP} frames get CHECKSUM_UNNECESSARY already

not exactly, considering mlx4 weird condition to decide to do
CHECKSUM_UNNECESSARY:

if ((cqe csum fields are valid) && cqe->checksum ==
cpu_to_be16(0xffff)) 
     { do CHECKSUM_UNNECESSARY }
else { do CHECKSUM_COMPLETE }

CHECKSUM_UNNECESSARY will be skipped if csum complete field is valid 
i.e (cqe->checksum != 0xffff)

but if checksum complete is to be skipped due to short_frame we want to
go back to CHECKSUM_UNNECESSARY, this is the refactoring i am talking
about :).


I hope now it is clear.. 


> 
> > +        */
> > +       if (short_frame(skb->len))
> > +               return -EINVAL;
> > 
> > -       void *hdr = (u8 *)va + sizeof(struct ethhdr);
> > -
> > +       hdr = (u8 *)va + sizeof(struct ethhdr);
> >         hw_checksum = csum_unfold((__force __sum16)cqe->checksum);
> > 
> >         if (cqe->vlan_my_qpn &
> > cpu_to_be32(MLX4_CQE_CVLAN_PRESENT_MASK) &&
> > --
> > 1.8.3.1
> > 

^ permalink raw reply

* [PATCH v3 net-next] r8169: improve WoL handling
From: Heiner Kallweit @ 2019-01-31 18:57 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org

WoL handling for the RTL8168 family is a little bit tricky because of
different types of broken BIOS and/or chip quirks.

Two known issues:
1. Network properly resumes from suspend only if WoL is enabled in the chip.
2. Some notebooks wake up immediately if system is suspended and network
   device is wakeup-enabled.

Few patches tried to deal with this:
7edf6d314cd0 ("r8169: disable WOL per default")
18041b523692 ("r8169: restore previous behavior to accept BIOS WoL
settings")

Currently we have the situation that the chip WoL settings as set by
the BIOS are respected (to prevent issue 1), but the device doesn't get
wakeup-enabled (to prevent issue 2).

This leads to another issue:
If systemd is told to set WoL it first checks whether the requested
settings are active already (and does nothing if yes). Due to the chip
WoL flags being set properly systemd assumes that WoL is configured
properly in our case. Result is that device doesn't get wakeup-enabled
and WoL doesn't work (until it's set e.g. by ethtool).

This patch now:
- leaves the chip WoL settings as is (to prevent issue 1)
- keeps the behavior to not wakeup-enable the device initially
  (to prevent issue 2)
- In addition we report WoL as being disabled in get_wol, matching
  that device isn't wakeup-enabled. If systemd is told to enable WoL,
  it will therefore detect that it has to do something and will
  call set_wol.

Of course the user still has the option to override this with
e.g. ethtool.

v2:
- Don't just exclude __rtl8169_get_wol() from compiling, remove it.
v3:
- adjust commit message

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169.c | 39 +---------------------------
 1 file changed, 1 insertion(+), 38 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 3e650bd9e..9dc689817 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1371,41 +1371,6 @@ static void rtl_link_chg_patch(struct rtl8169_private *tp)
 
 #define WAKE_ANY (WAKE_PHY | WAKE_MAGIC | WAKE_UCAST | WAKE_BCAST | WAKE_MCAST)
 
-static u32 __rtl8169_get_wol(struct rtl8169_private *tp)
-{
-	u8 options;
-	u32 wolopts = 0;
-
-	options = RTL_R8(tp, Config1);
-	if (!(options & PMEnable))
-		return 0;
-
-	options = RTL_R8(tp, Config3);
-	if (options & LinkUp)
-		wolopts |= WAKE_PHY;
-	switch (tp->mac_version) {
-	case RTL_GIGA_MAC_VER_34 ... RTL_GIGA_MAC_VER_38:
-	case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_51:
-		if (rtl_eri_read(tp, 0xdc, ERIAR_EXGMAC) & MagicPacket_v2)
-			wolopts |= WAKE_MAGIC;
-		break;
-	default:
-		if (options & MagicPacket)
-			wolopts |= WAKE_MAGIC;
-		break;
-	}
-
-	options = RTL_R8(tp, Config5);
-	if (options & UWF)
-		wolopts |= WAKE_UCAST;
-	if (options & BWF)
-		wolopts |= WAKE_BCAST;
-	if (options & MWF)
-		wolopts |= WAKE_MCAST;
-
-	return wolopts;
-}
-
 static void rtl8169_get_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
@@ -4284,7 +4249,7 @@ static void rtl_wol_suspend_quirk(struct rtl8169_private *tp)
 
 static bool rtl_wol_pll_power_down(struct rtl8169_private *tp)
 {
-	if (!__rtl8169_get_wol(tp))
+	if (!device_may_wakeup(tp_to_dev(tp)))
 		return false;
 
 	phy_speed_down(tp->phydev, false);
@@ -7441,8 +7406,6 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		return rc;
 	}
 
-	tp->saved_wolopts = __rtl8169_get_wol(tp);
-
 	mutex_init(&tp->wk.mutex);
 	INIT_WORK(&tp->wk.work, rtl_task);
 	u64_stats_init(&tp->rx_stats.syncp);
-- 
2.20.1


^ permalink raw reply related

* [PATCH v2 net-next] r8169: improve WoL handling
From: Heiner Kallweit @ 2019-01-31 18:52 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org

WoL handling for the RTL8168 family is a little bit tricky because of
different types of broken BIOS and/or chip quirks.

Two known issues:
1. Network properly resumes from suspend only if WoL is enabled in the chip.
2. Some notebooks wake up immediately if system is suspended and network
   device is wakeup-enabled.

Few patches tried to deal with this:
7edf6d314cd0 ("r8169: disable WOL per default")
18041b523692 ("r8169: restore previous behavior to accept BIOS WoL
settings")

Currently we have the situation that the chip WoL settings as set by
the BIOS are respected (to prevent issue 1), but the device doesn't get
wakeup-enabled (to prevent issue 2).

This leads to another issue:
If systemd is told to set WoL it first checks whether the requested
settings are active already (and does nothing if yes). Due to the chip
WoL flags being set properly systemd assumes that WoL is configured
properly in our case. Result is that device doesn't get wakeup-enabled
and WoL doesn't work (until it's set e.g. by ethtool).

This patch now:
- leaves the chip WoL settings as is (to prevent issue 1)
- keeps the behavior to not wakeup-enable the device initially
  (to prevent issue 2)
- In addition we report WoL as being disabled in get_wol, matching
  that device isn't wakeup-enabled. If systemd is told to enable WoL,
  it will therefore detect that it has to do something and will
  call set_wol.

Of course the user still has the option to override this with
e.g. ethtool.

Because there are lots of consumer mainboards with members of the
RTL8168 family as onboard network, there's a certain chance that also
the new approach may trigger a BIOS bug and cause issues. Therefore
don't remove __rtl8169_get_wol() completely.

v2:
- Don't just exclude __rtl8169_get_wol() from compiling, remove it.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169.c | 39 +---------------------------
 1 file changed, 1 insertion(+), 38 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 3e650bd9e..9dc689817 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1371,41 +1371,6 @@ static void rtl_link_chg_patch(struct rtl8169_private *tp)
 
 #define WAKE_ANY (WAKE_PHY | WAKE_MAGIC | WAKE_UCAST | WAKE_BCAST | WAKE_MCAST)
 
-static u32 __rtl8169_get_wol(struct rtl8169_private *tp)
-{
-	u8 options;
-	u32 wolopts = 0;
-
-	options = RTL_R8(tp, Config1);
-	if (!(options & PMEnable))
-		return 0;
-
-	options = RTL_R8(tp, Config3);
-	if (options & LinkUp)
-		wolopts |= WAKE_PHY;
-	switch (tp->mac_version) {
-	case RTL_GIGA_MAC_VER_34 ... RTL_GIGA_MAC_VER_38:
-	case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_51:
-		if (rtl_eri_read(tp, 0xdc, ERIAR_EXGMAC) & MagicPacket_v2)
-			wolopts |= WAKE_MAGIC;
-		break;
-	default:
-		if (options & MagicPacket)
-			wolopts |= WAKE_MAGIC;
-		break;
-	}
-
-	options = RTL_R8(tp, Config5);
-	if (options & UWF)
-		wolopts |= WAKE_UCAST;
-	if (options & BWF)
-		wolopts |= WAKE_BCAST;
-	if (options & MWF)
-		wolopts |= WAKE_MCAST;
-
-	return wolopts;
-}
-
 static void rtl8169_get_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
@@ -4284,7 +4249,7 @@ static void rtl_wol_suspend_quirk(struct rtl8169_private *tp)
 
 static bool rtl_wol_pll_power_down(struct rtl8169_private *tp)
 {
-	if (!__rtl8169_get_wol(tp))
+	if (!device_may_wakeup(tp_to_dev(tp)))
 		return false;
 
 	phy_speed_down(tp->phydev, false);
@@ -7441,8 +7406,6 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		return rc;
 	}
 
-	tp->saved_wolopts = __rtl8169_get_wol(tp);
-
 	mutex_init(&tp->wk.mutex);
 	INIT_WORK(&tp->wk.work, rtl_task);
 	u64_stats_init(&tp->rx_stats.syncp);
-- 
2.20.1


^ permalink raw reply related

* [PATCH net-next v4 5/8] nfp: devlink: report fixed versions
From: Jakub Kicinski @ 2019-01-31 18:50 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, jiri, andrew, f.fainelli, mkubecek, eugenem,
	jonathan.lemon, Jakub Kicinski
In-Reply-To: <20190131185047.27685-1-jakub.kicinski@netronome.com>

Report information about the hardware.

RFCv2:
 - add defines for board IDs which are likely to be reusable for
   other drivers (Jiri).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 .../net/ethernet/netronome/nfp/nfp_devlink.c  | 36 ++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
index 2ba3b3891d99..75eda34fc1b4 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
@@ -172,6 +172,40 @@ static int nfp_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode,
 	return ret;
 }
 
+static const struct nfp_devlink_versions_simple {
+	const char *key;
+	const char *hwinfo;
+} nfp_devlink_versions_hwinfo[] = {
+	{ DEVLINK_INFO_VERSION_GENERIC_BOARD_ID,	"assembly.partno", },
+	{ DEVLINK_INFO_VERSION_GENERIC_BOARD_REV,	"assembly.revision", },
+	{ "board.vendor", /* fab */			"assembly.vendor", },
+	{ "board.model", /* code name */		"assembly.model", },
+};
+
+static int
+nfp_devlink_versions_get_hwinfo(struct nfp_pf *pf, struct devlink_info_req *req)
+{
+	unsigned int i;
+	int err;
+
+	for (i = 0; i < ARRAY_SIZE(nfp_devlink_versions_hwinfo); i++) {
+		const struct nfp_devlink_versions_simple *info;
+		const char *val;
+
+		info = &nfp_devlink_versions_hwinfo[i];
+
+		val = nfp_hwinfo_lookup(pf->hwinfo, info->hwinfo);
+		if (!val)
+			continue;
+
+		err = devlink_info_version_fixed_put(req, info->key, val);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 static int
 nfp_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
 		     struct netlink_ext_ack *extack)
@@ -191,7 +225,7 @@ nfp_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
 			return err;
 	}
 
-	return 0;
+	return nfp_devlink_versions_get_hwinfo(pf, req);
 }
 
 const struct devlink_ops nfp_devlink_ops = {
-- 
2.19.2


^ permalink raw reply related

* [PATCH net-next v4 8/8] ethtool: add compat for devlink info
From: Jakub Kicinski @ 2019-01-31 18:50 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, jiri, andrew, f.fainelli, mkubecek, eugenem,
	jonathan.lemon, Jakub Kicinski
In-Reply-To: <20190131185047.27685-1-jakub.kicinski@netronome.com>

If driver did not fill the fw_version field, try to call into
the new devlink get_info op and collect the versions that way.
We assume ethtool was always reporting running versions.

v4:
 - use IS_REACHABLE() to avoid problems with DEVLINK=m (kbuildbot).
v3 (Jiri):
 - do a dump and then parse it instead of special handling;
 - concatenate all versions (well, all that fit :)).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 include/net/devlink.h | 10 +++++++
 net/core/devlink.c    | 63 +++++++++++++++++++++++++++++++++++++++++++
 net/core/ethtool.c    |  7 +++++
 3 files changed, 80 insertions(+)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index 6b417f141fd6..1c8523920f66 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -972,4 +972,14 @@ devlink_info_version_running_put(struct devlink_info_req *req,
 }
 #endif
 
+#if IS_REACHABLE(CONFIG_NET_DEVLINK)
+void devlink_compat_running_version(struct net_device *dev,
+				    char *buf, size_t len);
+#else
+static inline void
+devlink_compat_running_version(struct net_device *dev, char *buf, size_t len)
+{
+}
+#endif
+
 #endif /* _NET_DEVLINK_H_ */
diff --git a/net/core/devlink.c b/net/core/devlink.c
index e31b6d617837..eb839d74bcc0 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -5278,6 +5278,69 @@ int devlink_region_snapshot_create(struct devlink_region *region, u64 data_len,
 }
 EXPORT_SYMBOL_GPL(devlink_region_snapshot_create);
 
+static void __devlink_compat_running_version(struct devlink *devlink,
+					     char *buf, size_t len)
+{
+	const struct nlattr *nlattr;
+	struct devlink_info_req req;
+	struct sk_buff *msg;
+	int rem, err;
+
+	if (!devlink->ops->info_get)
+		return;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return;
+
+	req.msg = msg;
+	err = devlink->ops->info_get(devlink, &req, NULL);
+	if (err)
+		goto free_msg;
+
+	nla_for_each_attr(nlattr, (void *)msg->data, msg->len, rem) {
+		const struct nlattr *kv;
+		int rem_kv;
+
+		if (nla_type(nlattr) != DEVLINK_ATTR_INFO_VERSION_RUNNING)
+			continue;
+
+		nla_for_each_nested(kv, nlattr, rem_kv) {
+			if (nla_type(kv) != DEVLINK_ATTR_INFO_VERSION_VALUE)
+				continue;
+
+			strlcat(buf, nla_data(kv), len);
+			strlcat(buf, " ", len);
+		}
+	}
+free_msg:
+	nlmsg_free(msg);
+}
+
+void devlink_compat_running_version(struct net_device *dev,
+				    char *buf, size_t len)
+{
+	struct devlink_port *devlink_port;
+	struct devlink *devlink;
+
+	mutex_lock(&devlink_mutex);
+	list_for_each_entry(devlink, &devlink_list, list) {
+		mutex_lock(&devlink->lock);
+		list_for_each_entry(devlink_port, &devlink->port_list, list) {
+			if (devlink_port->type == DEVLINK_PORT_TYPE_ETH ||
+			    devlink_port->type_dev == dev) {
+				__devlink_compat_running_version(devlink,
+								 buf, len);
+				mutex_unlock(&devlink->lock);
+				goto out;
+			}
+		}
+		mutex_unlock(&devlink->lock);
+	}
+out:
+	mutex_unlock(&devlink_mutex);
+}
+
 static int __init devlink_module_init(void)
 {
 	return genl_register_family(&devlink_nl_family);
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 158264f7cfaf..197a4dfb712d 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -27,6 +27,7 @@
 #include <linux/rtnetlink.h>
 #include <linux/sched/signal.h>
 #include <linux/net.h>
+#include <net/devlink.h>
 #include <net/xdp_sock.h>
 
 /*
@@ -803,6 +804,12 @@ static noinline_for_stack int ethtool_get_drvinfo(struct net_device *dev,
 	if (ops->get_eeprom_len)
 		info.eedump_len = ops->get_eeprom_len(dev);
 
+	rtnl_unlock();
+	if (!info.fw_version[0])
+		devlink_compat_running_version(dev, info.fw_version,
+					       sizeof(info.fw_version));
+	rtnl_lock();
+
 	if (copy_to_user(useraddr, &info, sizeof(info)))
 		return -EFAULT;
 	return 0;
-- 
2.19.2


^ permalink raw reply related

* [PATCH net-next v4 6/8] nfp: nsp: add support for versions command
From: Jakub Kicinski @ 2019-01-31 18:50 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, jiri, andrew, f.fainelli, mkubecek, eugenem,
	jonathan.lemon, Jakub Kicinski
In-Reply-To: <20190131185047.27685-1-jakub.kicinski@netronome.com>

Retrieve the FW versions with the new command.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 .../ethernet/netronome/nfp/nfpcore/nfp_nsp.c  | 61 +++++++++++++++++++
 .../ethernet/netronome/nfp/nfpcore/nfp_nsp.h  | 20 ++++++
 2 files changed, 81 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp.c b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp.c
index ce1577bbbd2a..a9d53df0070c 100644
--- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp.c
+++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp.c
@@ -7,6 +7,7 @@
  *         Jason McMullan <jason.mcmullan@netronome.com>
  */
 
+#include <asm/unaligned.h>
 #include <linux/bitfield.h>
 #include <linux/delay.h>
 #include <linux/firmware.h>
@@ -62,6 +63,16 @@
 
 #define NFP_HWINFO_LOOKUP_SIZE	GENMASK(11, 0)
 
+#define NFP_VERSIONS_SIZE	GENMASK(11, 0)
+#define NFP_VERSIONS_CNT_OFF	0
+#define NFP_VERSIONS_BSP_OFF	2
+#define NFP_VERSIONS_CPLD_OFF	6
+#define NFP_VERSIONS_APP_OFF	10
+#define NFP_VERSIONS_BUNDLE_OFF	14
+#define NFP_VERSIONS_UNDI_OFF	18
+#define NFP_VERSIONS_NCSI_OFF	22
+#define NFP_VERSIONS_CFGR_OFF	26
+
 enum nfp_nsp_cmd {
 	SPCODE_NOOP		= 0, /* No operation */
 	SPCODE_SOFT_RESET	= 1, /* Soft reset the NFP */
@@ -77,6 +88,7 @@ enum nfp_nsp_cmd {
 	SPCODE_NSP_IDENTIFY	= 13, /* Read NSP version */
 	SPCODE_FW_STORED	= 16, /* If no FW loaded, load flash app FW */
 	SPCODE_HWINFO_LOOKUP	= 17, /* Lookup HWinfo with overwrites etc. */
+	SPCODE_VERSIONS		= 21, /* Report FW versions */
 };
 
 static const struct {
@@ -711,3 +723,52 @@ int nfp_nsp_hwinfo_lookup(struct nfp_nsp *state, void *buf, unsigned int size)
 
 	return 0;
 }
+
+int nfp_nsp_versions(struct nfp_nsp *state, void *buf, unsigned int size)
+{
+	struct nfp_nsp_command_buf_arg versions = {
+		{
+			.code		= SPCODE_VERSIONS,
+			.option		= min_t(u32, size, NFP_VERSIONS_SIZE),
+		},
+		.out_buf	= buf,
+		.out_size	= min_t(u32, size, NFP_VERSIONS_SIZE),
+	};
+
+	return nfp_nsp_command_buf(state, &versions);
+}
+
+const char *nfp_nsp_versions_get(enum nfp_nsp_versions id, bool flash,
+				 const u8 *buf, unsigned int size)
+{
+	static const u32 id2off[] = {
+		[NFP_VERSIONS_BSP] =	NFP_VERSIONS_BSP_OFF,
+		[NFP_VERSIONS_CPLD] =	NFP_VERSIONS_CPLD_OFF,
+		[NFP_VERSIONS_APP] =	NFP_VERSIONS_APP_OFF,
+		[NFP_VERSIONS_BUNDLE] =	NFP_VERSIONS_BUNDLE_OFF,
+		[NFP_VERSIONS_UNDI] =	NFP_VERSIONS_UNDI_OFF,
+		[NFP_VERSIONS_NCSI] =	NFP_VERSIONS_NCSI_OFF,
+		[NFP_VERSIONS_CFGR] =	NFP_VERSIONS_CFGR_OFF,
+	};
+	unsigned int field, buf_field_cnt, buf_off;
+
+	if (id >= ARRAY_SIZE(id2off) || !id2off[id])
+		return ERR_PTR(-EINVAL);
+
+	field = id * 2 + flash;
+
+	buf_field_cnt = get_unaligned_le16(buf);
+	if (buf_field_cnt <= field)
+		return ERR_PTR(-ENOENT);
+
+	buf_off = get_unaligned_le16(buf + id2off[id] + flash * 2);
+	if (!buf_off)
+		return ERR_PTR(-ENOENT);
+
+	if (buf_off >= size)
+		return ERR_PTR(-EINVAL);
+	if (strnlen(&buf[buf_off], size - buf_off) == size - buf_off)
+		return ERR_PTR(-EINVAL);
+
+	return (const char *)&buf[buf_off];
+}
diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp.h b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp.h
index ff33ac54097a..246e213f1514 100644
--- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp.h
+++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp.h
@@ -38,6 +38,11 @@ static inline bool nfp_nsp_has_hwinfo_lookup(struct nfp_nsp *state)
 	return nfp_nsp_get_abi_ver_minor(state) > 24;
 }
 
+static inline bool nfp_nsp_has_versions(struct nfp_nsp *state)
+{
+	return nfp_nsp_get_abi_ver_minor(state) > 27;
+}
+
 enum nfp_eth_interface {
 	NFP_INTERFACE_NONE	= 0,
 	NFP_INTERFACE_SFP	= 1,
@@ -208,4 +213,19 @@ enum nfp_nsp_sensor_id {
 int nfp_hwmon_read_sensor(struct nfp_cpp *cpp, enum nfp_nsp_sensor_id id,
 			  long *val);
 
+#define NFP_NSP_VERSION_BUFSZ	1024 /* reasonable size, not in the ABI */
+
+enum nfp_nsp_versions {
+	NFP_VERSIONS_BSP,
+	NFP_VERSIONS_CPLD,
+	NFP_VERSIONS_APP,
+	NFP_VERSIONS_BUNDLE,
+	NFP_VERSIONS_UNDI,
+	NFP_VERSIONS_NCSI,
+	NFP_VERSIONS_CFGR,
+};
+
+int nfp_nsp_versions(struct nfp_nsp *state, void *buf, unsigned int size);
+const char *nfp_nsp_versions_get(enum nfp_nsp_versions id, bool flash,
+				 const u8 *buf, unsigned int size);
 #endif
-- 
2.19.2


^ permalink raw reply related

* [PATCH net-next v4 7/8] nfp: devlink: report the running and flashed versions
From: Jakub Kicinski @ 2019-01-31 18:50 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, jiri, andrew, f.fainelli, mkubecek, eugenem,
	jonathan.lemon, Jakub Kicinski
In-Reply-To: <20190131185047.27685-1-jakub.kicinski@netronome.com>

Report versions of firmware components using the new NSP command.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 .../net/ethernet/netronome/nfp/nfp_devlink.c  | 87 +++++++++++++++++++
 1 file changed, 87 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
index 75eda34fc1b4..dddbb0575be9 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
@@ -206,11 +206,60 @@ nfp_devlink_versions_get_hwinfo(struct nfp_pf *pf, struct devlink_info_req *req)
 	return 0;
 }
 
+static const struct nfp_devlink_versions {
+	enum nfp_nsp_versions id;
+	const char *key;
+} nfp_devlink_versions_nsp[] = {
+	{ NFP_VERSIONS_BUNDLE,	"fw.bundle_id", },
+	{ NFP_VERSIONS_BSP,	DEVLINK_INFO_VERSION_GENERIC_FW_MGMT, },
+	{ NFP_VERSIONS_CPLD,	"fw.cpld", },
+	{ NFP_VERSIONS_APP,	DEVLINK_INFO_VERSION_GENERIC_FW_APP, },
+	{ NFP_VERSIONS_UNDI,	DEVLINK_INFO_VERSION_GENERIC_FW_UNDI, },
+	{ NFP_VERSIONS_NCSI,	DEVLINK_INFO_VERSION_GENERIC_FW_NCSI, },
+	{ NFP_VERSIONS_CFGR,	"chip.init", },
+};
+
+static int
+nfp_devlink_versions_get_nsp(struct devlink_info_req *req, bool flash,
+			     const u8 *buf, unsigned int size)
+{
+	unsigned int i;
+	int err;
+
+	for (i = 0; i < ARRAY_SIZE(nfp_devlink_versions_nsp); i++) {
+		const struct nfp_devlink_versions *info;
+		const char *version;
+
+		info = &nfp_devlink_versions_nsp[i];
+
+		version = nfp_nsp_versions_get(info->id, flash, buf, size);
+		if (IS_ERR(version)) {
+			if (PTR_ERR(version) == -ENOENT)
+				continue;
+			else
+				return PTR_ERR(version);
+		}
+
+		if (flash)
+			err = devlink_info_version_stored_put(req, info->key,
+							      version);
+		else
+			err = devlink_info_version_running_put(req, info->key,
+							       version);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 static int
 nfp_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
 		     struct netlink_ext_ack *extack)
 {
 	struct nfp_pf *pf = devlink_priv(devlink);
+	struct nfp_nsp *nsp;
+	char *buf = NULL;
 	const char *sn;
 	int err;
 
@@ -225,7 +274,45 @@ nfp_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
 			return err;
 	}
 
+	nsp = nfp_nsp_open(pf->cpp);
+	if (IS_ERR(nsp)) {
+		NL_SET_ERR_MSG_MOD(extack, "can't access NSP");
+		return PTR_ERR(nsp);
+	}
+
+	if (nfp_nsp_has_versions(nsp)) {
+		buf = kzalloc(NFP_NSP_VERSION_BUFSZ, GFP_KERNEL);
+		if (!buf) {
+			err = -ENOMEM;
+			goto err_close_nsp;
+		}
+
+		err = nfp_nsp_versions(nsp, buf, NFP_NSP_VERSION_BUFSZ);
+		if (err)
+			goto err_free_buf;
+
+		err = nfp_devlink_versions_get_nsp(req, false,
+						   buf, NFP_NSP_VERSION_BUFSZ);
+		if (err)
+			goto err_free_buf;
+
+		err = nfp_devlink_versions_get_nsp(req, true,
+						   buf, NFP_NSP_VERSION_BUFSZ);
+		if (err)
+			goto err_free_buf;
+
+		kfree(buf);
+	}
+
+	nfp_nsp_close(nsp);
+
 	return nfp_devlink_versions_get_hwinfo(pf, req);
+
+err_free_buf:
+	kfree(buf);
+err_close_nsp:
+	nfp_nsp_close(nsp);
+	return err;
 }
 
 const struct devlink_ops nfp_devlink_ops = {
-- 
2.19.2


^ permalink raw reply related

* [PATCH net-next v4 2/8] devlink: add version reporting to devlink info API
From: Jakub Kicinski @ 2019-01-31 18:50 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, jiri, andrew, f.fainelli, mkubecek, eugenem,
	jonathan.lemon, Jakub Kicinski
In-Reply-To: <20190131185047.27685-1-jakub.kicinski@netronome.com>

ethtool -i has a few fixed-size fields which can be used to report
firmware version and expansion ROM version. Unfortunately, modern
hardware has more firmware components. There is usually some
datapath microcode, management controller, PXE drivers, and a
CPLD load. Running ethtool -i on modern controllers reveals the
fact that vendors cram multiple values into firmware version field.

Here are some examples from systems I could lay my hands on quickly:

tg3:  "FFV20.2.17 bc 5720-v1.39"
i40e: "6.01 0x800034a4 1.1747.0"
nfp:  "0.0.3.5 0.25 sriov-2.1.16 nic"

Add a new devlink API to allow retrieving multiple versions, and
provide user-readable name for those versions.

While at it break down the versions into three categories:
 - fixed - this is the board/fixed component version, usually vendors
           report information like the board version in the PCI VPD,
           but it will benefit from naming and common API as well;
 - running - this is the running firmware version;
 - stored - this is firmware in the flash, after firmware update
            this value will reflect the flashed version, while the
            running version may only be updated after reboot.

v3:
 - add per-type helpers instead of using the special argument (Jiri).
RFCv2:
 - remove the nesting in attr DEVLINK_ATTR_INFO_VERSIONS (now
   versions are mixed with other info attrs)l
 - have the driver report versions from the same callback as
   other info.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/devlink.h        | 33 +++++++++++++++++++++
 include/uapi/linux/devlink.h |  5 ++++
 net/core/devlink.c           | 57 ++++++++++++++++++++++++++++++++++++
 3 files changed, 95 insertions(+)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index a6d0a530483d..6dc0ef964392 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -614,6 +614,15 @@ int devlink_info_serial_number_put(struct devlink_info_req *req,
 				   const char *sn);
 int devlink_info_driver_name_put(struct devlink_info_req *req,
 				 const char *name);
+int devlink_info_version_fixed_put(struct devlink_info_req *req,
+				   const char *version_name,
+				   const char *version_value);
+int devlink_info_version_stored_put(struct devlink_info_req *req,
+				    const char *version_name,
+				    const char *version_value);
+int devlink_info_version_running_put(struct devlink_info_req *req,
+				     const char *version_name,
+				     const char *version_value);
 
 #else
 
@@ -923,6 +932,30 @@ devlink_info_serial_number_put(struct devlink_info_req *req, const char *sn)
 {
 	return 0;
 }
+
+static inline int
+devlink_info_version_fixed_put(struct devlink_info_req *req,
+			       const char *version_name,
+			       const char *version_value)
+{
+	return 0;
+}
+
+static inline int
+devlink_info_version_stored_put(struct devlink_info_req *req,
+				const char *version_name,
+				const char *version_value)
+{
+	return 0;
+}
+
+static inline int
+devlink_info_version_running_put(struct devlink_info_req *req,
+				 const char *version_name,
+				 const char *version_value)
+{
+	return 0;
+}
 #endif
 
 #endif /* _NET_DEVLINK_H_ */
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 142710d45093..7fffd879c328 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -294,6 +294,11 @@ enum devlink_attr {
 
 	DEVLINK_ATTR_INFO_DRIVER_NAME,		/* string */
 	DEVLINK_ATTR_INFO_SERIAL_NUMBER,	/* string */
+	DEVLINK_ATTR_INFO_VERSION_FIXED,	/* nested */
+	DEVLINK_ATTR_INFO_VERSION_RUNNING,	/* nested */
+	DEVLINK_ATTR_INFO_VERSION_STORED,	/* nested */
+	DEVLINK_ATTR_INFO_VERSION_NAME,		/* string */
+	DEVLINK_ATTR_INFO_VERSION_VALUE,	/* string */
 
 	/* add new attributes above here, update the policy in devlink.c */
 
diff --git a/net/core/devlink.c b/net/core/devlink.c
index f456f6aa3d40..e31b6d617837 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -3730,6 +3730,63 @@ int devlink_info_serial_number_put(struct devlink_info_req *req, const char *sn)
 }
 EXPORT_SYMBOL_GPL(devlink_info_serial_number_put);
 
+static int devlink_info_version_put(struct devlink_info_req *req, int attr,
+				    const char *version_name,
+				    const char *version_value)
+{
+	struct nlattr *nest;
+	int err;
+
+	nest = nla_nest_start(req->msg, attr);
+	if (!nest)
+		return -EMSGSIZE;
+
+	err = nla_put_string(req->msg, DEVLINK_ATTR_INFO_VERSION_NAME,
+			     version_name);
+	if (err)
+		goto nla_put_failure;
+
+	err = nla_put_string(req->msg, DEVLINK_ATTR_INFO_VERSION_VALUE,
+			     version_value);
+	if (err)
+		goto nla_put_failure;
+
+	nla_nest_end(req->msg, nest);
+
+	return 0;
+
+nla_put_failure:
+	nla_nest_cancel(req->msg, nest);
+	return err;
+}
+
+int devlink_info_version_fixed_put(struct devlink_info_req *req,
+				   const char *version_name,
+				   const char *version_value)
+{
+	return devlink_info_version_put(req, DEVLINK_ATTR_INFO_VERSION_FIXED,
+					version_name, version_value);
+}
+EXPORT_SYMBOL_GPL(devlink_info_version_fixed_put);
+
+int devlink_info_version_stored_put(struct devlink_info_req *req,
+				    const char *version_name,
+				    const char *version_value)
+{
+	return devlink_info_version_put(req, DEVLINK_ATTR_INFO_VERSION_STORED,
+					version_name, version_value);
+}
+EXPORT_SYMBOL_GPL(devlink_info_version_stored_put);
+
+int devlink_info_version_running_put(struct devlink_info_req *req,
+				     const char *version_name,
+				     const char *version_value)
+{
+	return devlink_info_version_put(req, DEVLINK_ATTR_INFO_VERSION_RUNNING,
+					version_name, version_value);
+}
+EXPORT_SYMBOL_GPL(devlink_info_version_running_put);
+
 static int
 devlink_nl_info_fill(struct sk_buff *msg, struct devlink *devlink,
 		     enum devlink_command cmd, u32 portid,
-- 
2.19.2


^ permalink raw reply related

* [PATCH net-next v4 0/8] devlink: add device (driver) information API
From: Jakub Kicinski @ 2019-01-31 18:50 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, jiri, andrew, f.fainelli, mkubecek, eugenem,
	jonathan.lemon, Jakub Kicinski

Hi!

fw_version field in ethtool -i does not suit modern needs with 31
characters being quite limiting on more complex systems.  There is
also no distinction between the running and flashed versions of
the firmware.

Since the driver information pertains to the entire device, rather
than a particular netdev, it seems wise to move it do devlink, at
the same time fixing the aforementioned issues.

The new API allows exposing the device serial number and versions
of the components of the card - both hardware, firmware (running
and flashed).  Driver authors can choose descriptive identifiers
for the version fields.  A few version identifiers which seemed
relevant for most devices have been added to the global devlink
header.

Example:
$ devlink dev info pci/0000:05:00.0
pci/0000:05:00.0:
  driver nfp
  serial_number 16240145
  versions:
    fixed:
      board.id AMDA0099-0001
      board.rev 07
      board.vendor SMA
      board.model carbon
    running:
      fw.mgmt: 010156.010156.010156
      fw.cpld: 0x44
      fw.app: sriov-2.1.16
    stored:
      fw.mgmt: 010158.010158.010158
      fw.cpld: 0x44
      fw.app: sriov-2.1.20

Last patch also includes a compat code for ethtool.  If driver
reports no fw_version via the traditional ethtool API, ethtool
can call into devlink and try to cram as many versions as possible
into the 31 characters.

v4:
 - use IS_REACHABLE instead of IS_ENABLED in last patch.

v3 (Jiri):
 - rename various functions and attributes;
 - break out the version helpers per-type;
 - make the compat code parse a dump instead of special casing
   in each helper;
 - move generic version defines to a separate patch.
 
v2:
 - rebase.

this non-RFC, v3 some would say:
 - add three more versions in the NFP patches;
 - add last patch (ethool compat) - Andrew & Michal.

RFCv2:
 - use one driver op;
 - allow longer serial number;
 - wrap the skb into an opaque request struct;
 - add some common identifier into the devlink header.

Jakub Kicinski (8):
  devlink: add device information API
  devlink: add version reporting to devlink info API
  devlink: add generic info version names
  nfp: devlink: report driver name and serial number
  nfp: devlink: report fixed versions
  nfp: nsp: add support for versions command
  nfp: devlink: report the running and flashed versions
  ethtool: add compat for devlink info

 .../networking/devlink-info-versions.rst      |  38 +++
 Documentation/networking/index.rst            |   1 +
 .../net/ethernet/netronome/nfp/nfp_devlink.c  | 145 +++++++++++
 .../ethernet/netronome/nfp/nfpcore/nfp_nsp.c  |  61 +++++
 .../ethernet/netronome/nfp/nfpcore/nfp_nsp.h  |  20 ++
 include/net/devlink.h                         |  75 ++++++
 include/uapi/linux/devlink.h                  |  10 +
 net/core/devlink.c                            | 232 ++++++++++++++++++
 net/core/ethtool.c                            |   7 +
 9 files changed, 589 insertions(+)
 create mode 100644 Documentation/networking/devlink-info-versions.rst

-- 
2.19.2


^ permalink raw reply

* [PATCH net-next v4 4/8] nfp: devlink: report driver name and serial number
From: Jakub Kicinski @ 2019-01-31 18:50 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, jiri, andrew, f.fainelli, mkubecek, eugenem,
	jonathan.lemon, Jakub Kicinski
In-Reply-To: <20190131185047.27685-1-jakub.kicinski@netronome.com>

Report the basic info through new devlink info API.

RFCv2:
 - add driver name;
 - align serial to core changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 .../net/ethernet/netronome/nfp/nfp_devlink.c  | 24 +++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
index 808647ec3573..2ba3b3891d99 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
@@ -4,6 +4,7 @@
 #include <linux/rtnetlink.h>
 #include <net/devlink.h>
 
+#include "nfpcore/nfp.h"
 #include "nfpcore/nfp_nsp.h"
 #include "nfp_app.h"
 #include "nfp_main.h"
@@ -171,6 +172,28 @@ static int nfp_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode,
 	return ret;
 }
 
+static int
+nfp_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
+		     struct netlink_ext_ack *extack)
+{
+	struct nfp_pf *pf = devlink_priv(devlink);
+	const char *sn;
+	int err;
+
+	err = devlink_info_driver_name_put(req, "nfp");
+	if (err)
+		return err;
+
+	sn = nfp_hwinfo_lookup(pf->hwinfo, "assembly.serial");
+	if (sn) {
+		err = devlink_info_serial_number_put(req, sn);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 const struct devlink_ops nfp_devlink_ops = {
 	.port_split		= nfp_devlink_port_split,
 	.port_unsplit		= nfp_devlink_port_unsplit,
@@ -178,6 +201,7 @@ const struct devlink_ops nfp_devlink_ops = {
 	.sb_pool_set		= nfp_devlink_sb_pool_set,
 	.eswitch_mode_get	= nfp_devlink_eswitch_mode_get,
 	.eswitch_mode_set	= nfp_devlink_eswitch_mode_set,
+	.info_get		= nfp_devlink_info_get,
 };
 
 int nfp_devlink_port_register(struct nfp_app *app, struct nfp_port *port)
-- 
2.19.2


^ permalink raw reply related

* [PATCH net-next v4 3/8] devlink: add generic info version names
From: Jakub Kicinski @ 2019-01-31 18:50 UTC (permalink / raw)
  To: davem
  Cc: netdev, oss-drivers, jiri, andrew, f.fainelli, mkubecek, eugenem,
	jonathan.lemon, Jakub Kicinski
In-Reply-To: <20190131185047.27685-1-jakub.kicinski@netronome.com>

Add defines and docs for generic info versions.

v3:
 - add docs;
 - separate patch (Jiri).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 .../networking/devlink-info-versions.rst      | 38 +++++++++++++++++++
 Documentation/networking/index.rst            |  1 +
 include/net/devlink.h                         | 14 +++++++
 3 files changed, 53 insertions(+)
 create mode 100644 Documentation/networking/devlink-info-versions.rst

diff --git a/Documentation/networking/devlink-info-versions.rst b/Documentation/networking/devlink-info-versions.rst
new file mode 100644
index 000000000000..7d4ecf6b6f34
--- /dev/null
+++ b/Documentation/networking/devlink-info-versions.rst
@@ -0,0 +1,38 @@
+.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+
+=====================
+Devlink info versions
+=====================
+
+board.id
+========
+
+Unique identifier of the board design.
+
+board.rev
+=========
+
+Board design revision.
+
+fw.mgmt
+=======
+
+Control unit firmware version. This firmware is responsible for house
+keeping tasks, PHY control etc. but not the packet-by-packet data path
+operation.
+
+fw.app
+======
+
+Data path microcode controlling high-speed packet processing.
+
+fw.undi
+=======
+
+UNDI software, may include the UEFI driver, firmware or both.
+
+fw.ncsi
+=======
+
+Version of the software responsible for supporting/handling the
+Network Controller Sideband Interface.
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index f1627ca2a0ea..9a32451cd201 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -24,6 +24,7 @@ Linux Networking Documentation
    device_drivers/intel/i40e
    device_drivers/intel/iavf
    device_drivers/intel/ice
+   devlink-info-versions
    kapi
    z8530book
    msg_zerocopy
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 6dc0ef964392..6b417f141fd6 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -428,6 +428,20 @@ enum devlink_param_wol_types {
 	.validate = _validate,						\
 }
 
+/* Part number, identifier of board design */
+#define DEVLINK_INFO_VERSION_GENERIC_BOARD_ID	"board.id"
+/* Revision of board design */
+#define DEVLINK_INFO_VERSION_GENERIC_BOARD_REV	"board.rev"
+
+/* Control processor FW version */
+#define DEVLINK_INFO_VERSION_GENERIC_FW_MGMT	"fw.mgmt"
+/* Data path microcode controlling high-speed packet processing */
+#define DEVLINK_INFO_VERSION_GENERIC_FW_APP	"fw.app"
+/* UNDI software version */
+#define DEVLINK_INFO_VERSION_GENERIC_FW_UNDI	"fw.undi"
+/* NCSI support/handler version */
+#define DEVLINK_INFO_VERSION_GENERIC_FW_NCSI	"fw.ncsi"
+
 struct devlink_region;
 struct devlink_info_req;
 
-- 
2.19.2


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox