Netdev List


* linux-next: manual merge of the net-next tree with the net tree
From: Stephen Rothwell @ 2019-07-25  0:58 UTC
  To: David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List, Wen Yang,
	Sean Nyekjaer


Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  drivers/net/can/flexcan.c

between commit:

  e9f2a856e102 ("can: flexcan: fix an use-after-free in flexcan_setup_stop_mode()")

from the net tree and commit:

  915f9666421c ("can: flexcan: add support for DT property 'wakeup-source'")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non-trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/net/can/flexcan.c
index fcec8bcb53d6,09d8e623dcf6..000000000000
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@@ -1473,9 -1473,10 +1491,12 @@@ static int flexcan_setup_stop_mode(stru
  
  	device_set_wakeup_capable(&pdev->dev, true);
  
+ 	if (of_property_read_bool(np, "wakeup-source"))
+ 		device_set_wakeup_enable(&pdev->dev, true);
+ 
 -	return 0;
 +out_put_node:
 +	of_node_put(gpr_np);
 +	return ret;
  }
  
  static const struct of_device_id flexcan_of_match[] = {
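
The resolution above keeps both sides: the 'wakeup-source' check from
net-next runs before the single exit label introduced by the net fix, so
the node reference is dropped on every path, including success. A minimal
userspace sketch of that shape follows. Everything in it (toy_node,
toy_dev, toy_node_get/put, toy_setup_stop_mode and its parameters) is a
hypothetical stand-in, not the flexcan or OF API:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device-tree node. */
struct toy_node {
	int refcount;
};

/* Hypothetical stand-in for the wakeup state of a device. */
struct toy_dev {
	bool wakeup_capable;
	bool wakeup_enabled;
};

static struct toy_node *toy_node_get(struct toy_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void toy_node_put(struct toy_node *n)
{
	if (n)
		n->refcount--;
}

/*
 * lookup_ok models "the later property/regmap lookups succeeded";
 * has_wakeup_source models the 'wakeup-source' DT property being present.
 */
static int toy_setup_stop_mode(struct toy_dev *dev, struct toy_node *gpr,
			       bool lookup_ok, bool has_wakeup_source)
{
	struct toy_node *gpr_np = toy_node_get(gpr);
	int ret = 0;

	if (!gpr_np)
		return -ENOENT;	/* nothing acquired, nothing to drop */

	if (!lookup_ok) {
		ret = -ENODEV;
		goto out_put_node;
	}

	dev->wakeup_capable = true;
	if (has_wakeup_source)
		dev->wakeup_enabled = true;

out_put_node:
	/* Reached on success and on failure alike. */
	toy_node_put(gpr_np);
	return ret;
}
```

With this shape, adding another step before the label cannot leak the
reference as long as it exits through out_put_node.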



* linux-next: manual merge of the net-next tree with the jc_docs tree
From: Stephen Rothwell @ 2019-07-25  0:54 UTC
  To: David Miller, Networking, Jonathan Corbet
  Cc: Linux Next Mailing List, Linux Kernel Mailing List,
	Mauro Carvalho Chehab, Benjamin Poirier


Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  Documentation/PCI/pci-error-recovery.rst

between commit:

  4d2e26a38fbc ("docs: powerpc: convert docs to ReST and rename to *.rst")

from the jc_docs tree and commit:

  955315b0dc8c ("qlge: Move drivers/net/ethernet/qlogic/qlge/ to drivers/staging/qlge/")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non-trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc Documentation/PCI/pci-error-recovery.rst
index e5d450df06b4,7e30f43a9659..000000000000
--- a/Documentation/PCI/pci-error-recovery.rst
+++ b/Documentation/PCI/pci-error-recovery.rst
@@@ -421,7 -421,3 +421,6 @@@ That is, the recovery API only require
     - drivers/net/ixgbe
     - drivers/net/cxgb3
     - drivers/net/s2io.c
-    - drivers/net/qlge
 +
 +The End
 +-------



* RE: [PATCH net-next 2/2] qed: Add API for flashing the nvm attributes.
From: Sudarsana Reddy Kalluru @ 2019-07-25  0:48 UTC
  To: Saeed Mahameed, davem@davemloft.net
  Cc: Ariel Elior, Michal Kalderon, netdev@vger.kernel.org
In-Reply-To: <24c09b029d00ba73aab58ef09a2e65ac545b3423.camel@mellanox.com>

> -----Original Message-----
> From: Saeed Mahameed <saeedm@mellanox.com>
> Sent: Thursday, July 25, 2019 1:13 AM
> To: Sudarsana Reddy Kalluru <skalluru@marvell.com>;
> davem@davemloft.net
> Cc: Ariel Elior <aelior@marvell.com>; Michal Kalderon
> <mkalderon@marvell.com>; netdev@vger.kernel.org
> Subject: [EXT] Re: [PATCH net-next 2/2] qed: Add API for flashing the nvm
> attributes.
> 
> On Tue, 2019-07-23 at 21:51 -0700, Sudarsana Reddy Kalluru wrote:
> > The patch adds a driver interface for reading the NVM config request and
> > updating the attributes on the NVM config flash partition.
> >
> 
> You didn't use the get_cfg API you added in the previous patch.
Thanks for your review. I will move this API to the next patch series, which we plan to send shortly.

> 
> Also, can you please clarify how the user reads/writes from/to the NVM config?
> I mean, what UAPIs and tools are being used?
The NVM config partition is updated using the ethtool flash update command (i.e., ethtool -f), just like the
other flash partitions of the qed device. Example code path:
  ethtool flash_device --> qede_flash_device() --> qed_nvm_flash() --> qed_nvm_flash_cfg_write()

> 
> > Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> > Signed-off-by: Ariel Elior <aelior@marvell.com>
> > ---
> >  drivers/net/ethernet/qlogic/qed/qed_main.c | 65 ++++++++++++++++++++++++++++++
> >  include/linux/qed/qed_if.h                 |  1 +
> >  2 files changed, 66 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c b/drivers/net/ethernet/qlogic/qed/qed_main.c
> > index 829dd60..54f00d2 100644
> > --- a/drivers/net/ethernet/qlogic/qed/qed_main.c
> > +++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
> > @@ -67,6 +67,8 @@
> >  #define QED_ROCE_QPS			(8192)
> >  #define QED_ROCE_DPIS			(8)
> >  #define QED_RDMA_SRQS                   QED_ROCE_QPS
> > +#define QED_NVM_CFG_SET_FLAGS		0xE
> > +#define QED_NVM_CFG_SET_PF_FLAGS	0x1E
> >
> >  static char version[] =
> >  	"QLogic FastLinQ 4xxxx Core Module qed " DRV_MODULE_VERSION "\n";
> > @@ -2227,6 +2229,66 @@ static int qed_nvm_flash_image_validate(struct qed_dev *cdev,
> >  	return 0;
> >  }
> >
> > +/* Binary file format -
> > + *     /----------------------------------------------------------------------\
> > + * 0B  |                       0x5 [command index]                            |
> > + * 4B  | Entity ID     | Reserved        |  Number of config attributes       |
> > + * 8B  | Config ID                       | Length        | Value              |
> > + *     |                                                                      |
> > + *     \----------------------------------------------------------------------/
> > + * There can be several Cfg_id-Length-Value sets as specified by 'Number of...'.
> > + * Entity ID - A non zero entity value for which the config need to be updated.
> > + */
> > +static int qed_nvm_flash_cfg_write(struct qed_dev *cdev, const u8 **data)
> > +{
> > +	struct qed_hwfn *hwfn = QED_LEADING_HWFN(cdev);
> > +	u8 entity_id, len, buf[32];
> > +	struct qed_ptt *ptt;
> > +	u16 cfg_id, count;
> > +	int rc = 0, i;
> > +	u32 flags;
> > +
> > +	ptt = qed_ptt_acquire(hwfn);
> > +	if (!ptt)
> > +		return -EAGAIN;
> > +
> > +	/* NVM CFG ID attribute header */
> > +	*data += 4;
> > +	entity_id = **data;
> > +	*data += 2;
> > +	count = *((u16 *)*data);
> > +	*data += 2;
> > +
> > +	DP_VERBOSE(cdev, NETIF_MSG_DRV,
> > +		   "Read config ids: entity id %02x num _attrs = %0d\n",
> > +		   entity_id, count);
> > +	/* NVM CFG ID attributes */
> > +	for (i = 0; i < count; i++) {
> > +		cfg_id = *((u16 *)*data);
> > +		*data += 2;
> > +		len = **data;
> > +		(*data)++;
> > +		memcpy(buf, *data, len);
> > +		*data += len;
> > +
> > +		flags = entity_id ? QED_NVM_CFG_SET_PF_FLAGS :
> > +			QED_NVM_CFG_SET_FLAGS;
> > +
> > +		DP_VERBOSE(cdev, NETIF_MSG_DRV,
> > +			   "cfg_id = %d len = %d\n", cfg_id, len);
> > +		rc = qed_mcp_nvm_set_cfg(hwfn, ptt, cfg_id, entity_id, flags,
> > +					 buf, len);
> > +		if (rc) {
> > +			DP_ERR(cdev, "Error %d configuring %d\n", rc, cfg_id);
> > +			break;
> > +		}
> > +	}
> > +
> > +	qed_ptt_release(hwfn, ptt);
> > +
> > +	return rc;
> > +}
> > +
> >  static int qed_nvm_flash(struct qed_dev *cdev, const char *name)
> >  {
> >  	const struct firmware *image;
> > @@ -2268,6 +2330,9 @@ static int qed_nvm_flash(struct qed_dev *cdev, const char *name)
> >  			rc = qed_nvm_flash_image_access(cdev, &data,
> >  							&check_resp);
> >  			break;
> > +		case QED_NVM_FLASH_CMD_NVM_CFG_ID:
> > +			rc = qed_nvm_flash_cfg_write(cdev, &data);
> > +			break;
> >  		default:
> >  			DP_ERR(cdev, "Unknown command %08x\n", cmd_type);
> >  			rc = -EINVAL;
> > diff --git a/include/linux/qed/qed_if.h b/include/linux/qed/qed_if.h
> > index eef02e6..23805ea 100644
> > --- a/include/linux/qed/qed_if.h
> > +++ b/include/linux/qed/qed_if.h
> > @@ -804,6 +804,7 @@ enum qed_nvm_flash_cmd {
> >  	QED_NVM_FLASH_CMD_FILE_DATA = 0x2,
> >  	QED_NVM_FLASH_CMD_FILE_START = 0x3,
> >  	QED_NVM_FLASH_CMD_NVM_CHANGE = 0x4,
> > +	QED_NVM_FLASH_CMD_NVM_CFG_ID = 0x5,
> >  	QED_NVM_FLASH_CMD_NVM_MAX,
> >  };
> >
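
The record layout documented in the patch comment above (a 4-byte command
index, then entity ID, a reserved byte and an attribute count, followed by
Config ID / Length / Value sets) can be walked in userspace roughly as
follows. This is only a sketch under stated assumptions: multi-byte fields
are taken to be little-endian in the file (the quoted code reads them in
host order), and the names nvm_cfg_attr, nvm_cfg_parse and
NVM_CFG_MAX_VALUE are made up for illustration. Unlike the quoted loop,
which copies `len` bytes into a 32-byte buffer without checking `len`, the
sketch rejects oversized or truncated records:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NVM_CFG_CMD		0x5u	/* command index from the patch comment */
#define NVM_CFG_MAX_VALUE	32u	/* mirrors buf[32] in the quoted function */

/* Hypothetical in-memory form of one Config ID / Length / Value set. */
struct nvm_cfg_attr {
	uint16_t cfg_id;
	uint8_t len;
	uint8_t value[NVM_CFG_MAX_VALUE];
};

/* Assumption: fields are stored little-endian in the file. */
static uint16_t rd_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Parse one NVM config command block.
 * Returns the number of attributes stored in out[], or -1 if the block
 * is malformed, truncated, or holds more than max_out attributes.
 */
static int nvm_cfg_parse(const uint8_t *data, size_t size, uint8_t *entity_id,
			 struct nvm_cfg_attr *out, size_t max_out)
{
	size_t off = 8;		/* 4B command + entity + reserved + 2B count */
	uint16_t count, i;

	if (size < off || rd_le32(data) != NVM_CFG_CMD)
		return -1;

	*entity_id = data[4];	/* data[5] is the reserved byte */
	count = rd_le16(data + 6);
	if (count > max_out)
		return -1;

	for (i = 0; i < count; i++) {
		struct nvm_cfg_attr *a = &out[i];

		if (size - off < 3)	/* 2B config ID + 1B length */
			return -1;
		a->cfg_id = rd_le16(data + off);
		a->len = data[off + 2];
		off += 3;

		if (a->len > NVM_CFG_MAX_VALUE || size - off < a->len)
			return -1;
		memcpy(a->value, data + off, a->len);
		off += a->len;
	}

	return count;
}
```

A caller would hand it the bytes of one command block and then apply each
returned attribute in turn.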


* Re: [PATCH 00/12] block/bio, fs: convert put_page() to put_user_page*()
From: Bob Liu @ 2019-07-25  0:41 UTC
  To: john.hubbard, Andrew Morton
  Cc: Alexander Viro, Anna Schumaker, David S . Miller,
	Dominique Martinet, Eric Van Hensbergen, Jason Gunthorpe,
	Jason Wang, Jens Axboe, Latchesar Ionkov, Michael S . Tsirkin,
	Miklos Szeredi, Trond Myklebust, Christoph Hellwig,
	Matthew Wilcox, linux-mm, LKML, ceph-devel, kvm, linux-block,
	linux-cifs, linux-fsdevel, linux-nfs, linux-rdma, netdev,
	samba-technical, v9fs-developer, virtualization, John Hubbard
In-Reply-To: <20190724042518.14363-1-jhubbard@nvidia.com>

On 7/24/19 12:25 PM, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> Hi,
> 
> This is mostly Jerome's work, converting the block/bio and related areas
> to call put_user_page*() instead of put_page(). Because I've changed
> Jerome's patches, in some cases significantly, I'd like to get his
> feedback before we actually leave him listed as the author (he might
> want to disown some or all of these).
> 

Could you add some background to the commit log for people who don't have the context?
Why is this conversion needed? What are the main differences?

Regards, -Bob

> I added a new patch, in order to make this work with Christoph Hellwig's
> recent overhaul to bio_release_pages(): "block: bio_release_pages: use
> flags arg instead of bool".
> 
> I've started the series with a patch that I've posted in another
> series ("mm/gup: add make_dirty arg to put_user_pages_dirty_lock()"[1]),
> because I'm not sure which of these will go in first, and this allows each
> to stand alone.
> 
> Testing: not much beyond build and boot testing has been done yet. And
> I'm not set up to even exercise all of it (especially the IB parts) at
> run time.
> 
> Anyway, changes here are:
> 
> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
>   Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
>   it is time to release the pages. That allows choosing between put_page()
>   and put_user_page*().
> 
> * Pass in one more piece of information to bio_release_pages: a "from_gup"
>   parameter. Similar use as above.
> 
> * Change the block layer, and several file systems, to use
>   put_user_page*().
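
The "from_gup" selection described in the list above can be pictured as a
flag recorded when the pages are acquired and consulted when they are
released. The toy model below only illustrates that dispatch; toy_page,
toy_iter and toy_release_pages() are hypothetical names, and the two
release helpers merely count calls, so nothing here reflects what
put_page() or put_user_page() do internally:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy page: records how it was released, nothing more. */
struct toy_page {
	int put_page_calls;
	int put_user_page_calls;
};

/* Stand-ins for put_page() and put_user_page(); they only count calls. */
static void toy_put_page(struct toy_page *p)
{
	p->put_page_calls++;
}

static void toy_put_user_page(struct toy_page *p)
{
	p->put_user_page_calls++;
}

/* Toy iterator that remembers where its pages came from. */
struct toy_iter {
	struct toy_page **pages;
	size_t nr_pages;
	bool from_gup;	/* set at acquisition time */
};

/* Release every page through the helper that matches its origin. */
static void toy_release_pages(struct toy_iter *it)
{
	size_t i;

	for (i = 0; i < it->nr_pages; i++) {
		if (it->from_gup)
			toy_put_user_page(it->pages[i]);
		else
			toy_put_page(it->pages[i]);
	}
}
```

The point of carrying the flag is that the release site no longer has to
guess how the pages were obtained.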
> 
> [1] https://lore.kernel.org/r/20190724012606.25844-2-jhubbard@nvidia.com
>     And please note the correction email that I posted as a follow-up,
>     if you're looking closely at that patch. :) The fixed version is
>     included here.
> 
> John Hubbard (3):
>   mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
>   block: bio_release_pages: use flags arg instead of bool
>   fs/ceph: fix a build warning: returning a value from void function
> 
> Jérôme Glisse (9):
>   iov_iter: add helper to test if an iter would use GUP v2
>   block: bio_release_pages: convert put_page() to put_user_page*()
>   block_dev: convert put_page() to put_user_page*()
>   fs/nfs: convert put_page() to put_user_page*()
>   vhost-scsi: convert put_page() to put_user_page*()
>   fs/cifs: convert put_page() to put_user_page*()
>   fs/fuse: convert put_page() to put_user_page*()
>   fs/ceph: convert put_page() to put_user_page*()
>   9p/net: convert put_page() to put_user_page*()
> 
>  block/bio.c                                |  81 ++++++++++++---
>  drivers/infiniband/core/umem.c             |   5 +-
>  drivers/infiniband/hw/hfi1/user_pages.c    |   5 +-
>  drivers/infiniband/hw/qib/qib_user_pages.c |   5 +-
>  drivers/infiniband/hw/usnic/usnic_uiom.c   |   5 +-
>  drivers/infiniband/sw/siw/siw_mem.c        |   8 +-
>  drivers/vhost/scsi.c                       |  13 ++-
>  fs/block_dev.c                             |  22 +++-
>  fs/ceph/debugfs.c                          |   2 +-
>  fs/ceph/file.c                             |  62 ++++++++---
>  fs/cifs/cifsglob.h                         |   3 +
>  fs/cifs/file.c                             |  22 +++-
>  fs/cifs/misc.c                             |  19 +++-
>  fs/direct-io.c                             |   2 +-
>  fs/fuse/dev.c                              |  22 +++-
>  fs/fuse/file.c                             |  53 +++++++---
>  fs/nfs/direct.c                            |  10 +-
>  include/linux/bio.h                        |  22 +++-
>  include/linux/mm.h                         |   5 +-
>  include/linux/uio.h                        |  11 ++
>  mm/gup.c                                   | 115 +++++++++------------
>  net/9p/trans_common.c                      |  14 ++-
>  net/9p/trans_common.h                      |   3 +-
>  net/9p/trans_virtio.c                      |  18 +++-
>  24 files changed, 357 insertions(+), 170 deletions(-)
> 



* Re: [PATCH bpf-next 01/10] libbpf: add .BTF.ext offset relocation section loading
From: Andrii Nakryiko @ 2019-07-25  0:37 UTC
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Yonghong Song, Kernel Team
In-Reply-To: <B5E772A5-C0D9-4697-ADE2-2A94C4AD37B5@fb.com>

On Wed, Jul 24, 2019 at 5:00 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Jul 24, 2019, at 12:27 PM, Andrii Nakryiko <andriin@fb.com> wrote:
> >
> > Add support for BPF CO-RE offset relocations. Add section/record
> > iteration macros for .BTF.ext. These macros are useful for iterating over
> > each .BTF.ext record, either for dumping out contents or later for BPF
> > CO-RE relocation handling.
> >
> > To enable other parts of libbpf to work with .BTF.ext contents, moved
> > a bunch of type definitions into libbpf_internal.h.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> > tools/lib/bpf/btf.c             | 64 +++++++++--------------
> > tools/lib/bpf/btf.h             |  4 ++
> > tools/lib/bpf/libbpf_internal.h | 91 +++++++++++++++++++++++++++++++++
> > 3 files changed, 118 insertions(+), 41 deletions(-)
> >

[...]

> > +
> > static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
> > {
> >       const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
> > @@ -1004,6 +979,13 @@ struct btf_ext *btf_ext__new(__u8 *data, __u32 size)
> >       if (err)
> >               goto done;
> >
> > +     /* check if there is offset_reloc_off/offset_reloc_len fields */
> > +     if (btf_ext->hdr->hdr_len < sizeof(struct btf_ext_header))
>
> This check will break when we add more optional sections to btf_ext_header.
> Maybe use offsetof() instead?

I didn't do it, because there are no fields after offset_reloc_len.
But now I think that maybe it would be OK to add a zero-sized marker
field, kind of like marking off the various versions of the btf_ext header?

Alternatively, I can add an offsetofend() macro somewhere in libbpf_internal.h.

Do you have any preference?
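
For reference, the offsetofend() option could look roughly like this. It
is a sketch, not the libbpf change: the macro mirrors the usual kernel
definition, toy_ext_header is an illustrative struct whose field layout is
assumed here, and toy_hdr_has_offset_reloc() is a made-up helper name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the first byte past MEMBER within TYPE. */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Illustrative header; the layout is an assumption for this sketch. */
struct toy_ext_header {
	uint16_t magic;
	uint8_t version;
	uint8_t flags;
	uint32_t hdr_len;

	/* always present */
	uint32_t func_info_off;
	uint32_t func_info_len;
	uint32_t line_info_off;
	uint32_t line_info_len;

	/* optional: only present if hdr_len is large enough */
	uint32_t offset_reloc_off;
	uint32_t offset_reloc_len;
};

/*
 * True if the header reported by the producer is long enough to contain
 * the optional offset_reloc_off/offset_reloc_len pair.
 */
static bool toy_hdr_has_offset_reloc(const struct toy_ext_header *hdr)
{
	return hdr->hdr_len >=
	       offsetofend(struct toy_ext_header, offset_reloc_len);
}
```

Because the bound is tied to the last field of this optional section
rather than to sizeof() of the whole struct, appending further optional
fields later would not change the result of the check.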

>
> > +             goto done;
> > +     err = btf_ext_setup_offset_reloc(btf_ext);
> > +     if (err)
> > +             goto done;
> > +
> > done:

[...]


* Re: [PATCH net] selftests/net: add missing gitignores (ipv6_flowlabel)
From: Willem de Bruijn @ 2019-07-25  0:22 UTC
  To: Jakub Kicinski
  Cc: David Miller, Network Development, oss-drivers, Quentin Monnet
In-Reply-To: <20190725000714.10200-1-jakub.kicinski@netronome.com>

On Wed, Jul 24, 2019 at 8:07 PM Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> ipv6_flowlabel and ipv6_flowlabel_mgr are missing from
> gitignore.  Quentin points out that the original
> commit 3fb321fde22d ("selftests/net: ipv6 flowlabel")
> did add ignore entries, they are just missing the "ipv6_"
> prefix.
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>

Acked-by: Willem de Bruijn <willemb@google.com>

Thanks Jakub


* Re: [PATCH v4 net-next 13/19] ionic: Add initial ethtool support
From: Saeed Mahameed @ 2019-07-25  0:17 UTC
  To: snelson@pensando.io, netdev@vger.kernel.org, davem@davemloft.net
In-Reply-To: <20190722214023.9513-14-snelson@pensando.io>

On Mon, 2019-07-22 at 14:40 -0700, Shannon Nelson wrote:
> Add in the basic ethtool callbacks for device information
> and control.
> 
> Signed-off-by: Shannon Nelson <snelson@pensando.io>
> ---
>  drivers/net/ethernet/pensando/ionic/Makefile  |   2 +-
>  .../net/ethernet/pensando/ionic/ionic_dev.h   |   3 +
>  .../ethernet/pensando/ionic/ionic_ethtool.c   | 495 ++++++++++++++++++
>  .../ethernet/pensando/ionic/ionic_ethtool.h   |   9 +
>  .../net/ethernet/pensando/ionic/ionic_lif.c   |   2 +
>  .../net/ethernet/pensando/ionic/ionic_lif.h   |   8 +
>  6 files changed, 518 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
>  create mode 100644 drivers/net/ethernet/pensando/ionic/ionic_ethtool.h
> 
> diff --git a/drivers/net/ethernet/pensando/ionic/Makefile b/drivers/net/ethernet/pensando/ionic/Makefile
> index 7d9cdc5f02a1..9b19bf57a489 100644
> --- a/drivers/net/ethernet/pensando/ionic/Makefile
> +++ b/drivers/net/ethernet/pensando/ionic/Makefile
> @@ -3,5 +3,5 @@
>  
>  obj-$(CONFIG_IONIC) := ionic.o
>  
> -ionic-y := ionic_main.o ionic_bus_pci.o ionic_dev.o \
> +ionic-y := ionic_main.o ionic_bus_pci.o ionic_dev.o ionic_ethtool.o \
>  	   ionic_lif.o ionic_rx_filter.o ionic_debugfs.o
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_dev.h b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
> index 523927566925..bacc9c557329 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_dev.h
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
> @@ -12,6 +12,9 @@
>  
>  #define IONIC_MIN_MTU			ETH_MIN_MTU
>  #define IONIC_MAX_MTU			9194
> +#define IONIC_MAX_TXRX_DESC		16384
> +#define IONIC_MIN_TXRX_DESC		16
> +#define IONIC_DEF_TXRX_DESC		4096
>  #define IONIC_LIFS_MAX			1024
>  
>  struct ionic_dev_bar {
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
> new file mode 100644
> index 000000000000..f7899be547c3
> --- /dev/null
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
> @@ -0,0 +1,495 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright(c) 2017 - 2019 Pensando Systems, Inc */
> +
> +#include <linux/module.h>
> +#include <linux/netdevice.h>
> +
> +#include "ionic.h"
> +#include "ionic_bus.h"
> +#include "ionic_lif.h"
> +#include "ionic_ethtool.h"
> +
> +static void ionic_get_drvinfo(struct net_device *netdev,
> +			      struct ethtool_drvinfo *drvinfo)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic *ionic = lif->ionic;
> +	struct ionic_dev *idev = &ionic->idev;
> +
> +	strlcpy(drvinfo->driver, DRV_NAME, sizeof(drvinfo->driver));
> +	strlcpy(drvinfo->version, DRV_VERSION, sizeof(drvinfo->version));
> +	strlcpy(drvinfo->fw_version, idev->dev_info.fw_version,
> +		sizeof(drvinfo->fw_version));
> +	strlcpy(drvinfo->bus_info, ionic_bus_info(ionic),
> +		sizeof(drvinfo->bus_info));
> +}
> +
> +#define DEV_CMD_REG_VERSION 1
> +#define DEV_INFO_REG_COUNT  32
> +#define DEV_CMD_REG_COUNT   32
> +static int ionic_get_regs_len(struct net_device *netdev)
> +{
> +	return (DEV_INFO_REG_COUNT + DEV_CMD_REG_COUNT) * sizeof(u32);
> +}
> +
> +static void ionic_get_regs(struct net_device *netdev, struct ethtool_regs *regs,
> +			   void *p)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	unsigned int size;
> +
> +	regs->version = DEV_CMD_REG_VERSION;
> +
> +	size = DEV_INFO_REG_COUNT * sizeof(u32);
> +	memcpy_fromio(p, lif->ionic->idev.dev_info_regs->words, size);
> +
> +	size = DEV_CMD_REG_COUNT * sizeof(u32);
> +	memcpy_fromio(p, lif->ionic->idev.dev_cmd_regs->words, size);
> +}
> +
> +static int ionic_get_link_ksettings(struct net_device *netdev,
> +				    struct ethtool_link_ksettings *ks)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic_dev *idev = &lif->ionic->idev;
> +	int copper_seen = 0;
> +
> +	ethtool_link_ksettings_zero_link_mode(ks, supported);
> +
> +	/* The port_info data is found in a DMA space that the NIC keeps
> +	 * up-to-date, so there's no need to request the data from the
> +	 * NIC, we already have it in our memory space.
> +	 */
> +
> +	switch (le16_to_cpu(idev->port_info->status.xcvr.pid)) {
> +		/* Copper */
> +	case XCVR_PID_QSFP_100G_CR4:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     100000baseCR4_Full);
> +		copper_seen++;
> +		break;
> +	case XCVR_PID_QSFP_40GBASE_CR4:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     40000baseCR4_Full);
> +		copper_seen++;
> +		break;
> +	case XCVR_PID_SFP_25GBASE_CR_S:
> +	case XCVR_PID_SFP_25GBASE_CR_L:
> +	case XCVR_PID_SFP_25GBASE_CR_N:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     25000baseCR_Full);
> +		copper_seen++;
> +		break;
> +	case XCVR_PID_SFP_10GBASE_AOC:
> +	case XCVR_PID_SFP_10GBASE_CU:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     10000baseCR_Full);
> +		copper_seen++;
> +		break;
> +
> +		/* Fibre */
> +	case XCVR_PID_QSFP_100G_SR4:
> +	case XCVR_PID_QSFP_100G_AOC:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     100000baseSR4_Full);
> +		break;
> +	case XCVR_PID_QSFP_100G_LR4:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     100000baseLR4_ER4_Full);
> +		break;
> +	case XCVR_PID_QSFP_100G_ER4:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     100000baseLR4_ER4_Full);
> +		break;
> +	case XCVR_PID_QSFP_40GBASE_SR4:
> +	case XCVR_PID_QSFP_40GBASE_AOC:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     40000baseSR4_Full);
> +		break;
> +	case XCVR_PID_QSFP_40GBASE_LR4:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     40000baseLR4_Full);
> +		break;
> +	case XCVR_PID_SFP_25GBASE_SR:
> +	case XCVR_PID_SFP_25GBASE_AOC:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     25000baseSR_Full);
> +		break;
> +	case XCVR_PID_SFP_10GBASE_SR:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     10000baseSR_Full);
> +		break;
> +	case XCVR_PID_SFP_10GBASE_LR:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     10000baseLR_Full);
> +		break;
> +	case XCVR_PID_SFP_10GBASE_LRM:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     10000baseLRM_Full);
> +		break;
> +	case XCVR_PID_SFP_10GBASE_ER:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     10000baseER_Full);
> +		break;
> +	case XCVR_PID_UNKNOWN:
> +		break;
> +	default:
> +		dev_info(lif->ionic->dev, "unknown xcvr type pid=%d / 0x%x\n",
> +			 idev->port_info->status.xcvr.pid,
> +			 idev->port_info->status.xcvr.pid);
> +		break;
> +	}
> +	ethtool_link_ksettings_add_link_mode(ks, supported, FIBRE);
> +
> +	if (ionic_is_pf(lif->ionic))
> +		ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
> +
> +	bitmap_copy(ks->link_modes.advertising, ks->link_modes.supported,
> +		    __ETHTOOL_LINK_MODE_MASK_NBITS);
> +
> +	ethtool_link_ksettings_add_link_mode(ks, supported, FEC_NONE);
> +	ethtool_link_ksettings_add_link_mode(ks, supported, FEC_RS);
> +	ethtool_link_ksettings_add_link_mode(ks, supported, FEC_BASER);
> +
> +	if (idev->port_info->config.fec_type == PORT_FEC_TYPE_FC)
> +		ethtool_link_ksettings_add_link_mode(ks, advertising, FEC_BASER);
> +	else if (idev->port_info->config.fec_type == PORT_FEC_TYPE_RS)
> +		ethtool_link_ksettings_add_link_mode(ks, advertising, FEC_RS);
> +	else if (idev->port_info->config.fec_type == PORT_FEC_TYPE_NONE)
> +		ethtool_link_ksettings_add_link_mode(ks, advertising, FEC_NONE);
> +
> +	ethtool_link_ksettings_add_link_mode(ks, supported, Pause);
> +	if (idev->port_info->config.pause_type)
> +		ethtool_link_ksettings_add_link_mode(ks, advertising, Pause);
> +
> +	if (idev->port_info->status.xcvr.phy == PHY_TYPE_COPPER ||
> +	    copper_seen) {
> +		ks->base.port = PORT_DA;
> +	} else if (idev->port_info->status.xcvr.phy == PHY_TYPE_FIBER) {
> +		ks->base.port = PORT_FIBRE;
> +	} else {
> +		ks->base.port = PORT_OTHER;
> +	}
> +
> +	ks->base.speed = le32_to_cpu(lif->info->status.link_speed);
> +
> +	if (idev->port_info->config.an_enable)
> +		ks->base.autoneg = AUTONEG_ENABLE;
> +
> +	if (le16_to_cpu(lif->info->status.link_status))
> +		ks->base.duplex = DUPLEX_FULL;
> +	else
> +		ks->base.duplex = DUPLEX_UNKNOWN;
> +
> +	return 0;
> +}
> +
> +static int ionic_set_link_ksettings(struct net_device *netdev,
> +				    const struct ethtool_link_ksettings *ks)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic *ionic = lif->ionic;
> +	struct ionic_dev *idev = &lif->ionic->idev;
> +	u8 fec_type = PORT_FEC_TYPE_NONE;
> +	u32 req_rs, req_fc;
> +	int err = 0;
> +
> +	/* set autoneg */
> +	if (ks->base.autoneg != idev->port_info->config.an_enable) {
> +		mutex_lock(&ionic->dev_cmd_lock);
> +		ionic_dev_cmd_port_autoneg(idev, ks->base.autoneg);
> +		err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
> +		mutex_unlock(&ionic->dev_cmd_lock);
> +		if (err)
> +			return err;
> +	}
> +
> +	/* set speed */
> +	if (ks->base.speed != le32_to_cpu(idev->port_info->config.speed)) {
> +		mutex_lock(&ionic->dev_cmd_lock);
> +		ionic_dev_cmd_port_speed(idev, ks->base.speed);
> +		err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
> +		mutex_unlock(&ionic->dev_cmd_lock);
> +		if (err)
> +			return err;
> +	}
> +
> +	/* set FEC */
> +	req_rs = ethtool_link_ksettings_test_link_mode(ks, advertising, FEC_RS);
> +	req_fc = ethtool_link_ksettings_test_link_mode(ks, advertising, FEC_BASER);
> +	if (req_rs && req_fc) {
> +		netdev_info(netdev, "Only select one FEC mode at a time\n");
> +		return -EINVAL;
> +	} else if (req_fc &&
> +		   idev->port_info->config.fec_type != PORT_FEC_TYPE_FC) {
> +		fec_type = PORT_FEC_TYPE_FC;
> +	} else if (req_rs &&
> +		   idev->port_info->config.fec_type != PORT_FEC_TYPE_RS) {
> +		fec_type = PORT_FEC_TYPE_RS;
> +	} else if (!(req_rs | req_fc) &&
> +		   idev->port_info->config.fec_type != PORT_FEC_TYPE_NONE) {
> +		fec_type = PORT_FEC_TYPE_NONE;
> +	}
> +
> +	if (fec_type != idev->port_info->config.fec_type) {
> +		mutex_lock(&ionic->dev_cmd_lock);
> +		ionic_dev_cmd_port_fec(idev, fec_type);
> +		err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
> +		mutex_unlock(&ionic->dev_cmd_lock);
> +		if (err)
> +			return err;
> +
> +		idev->port_info->config.fec_type = fec_type;
> +	}
> +
> +	return 0;
> +}
> +
> +static void ionic_get_pauseparam(struct net_device *netdev,
> +				 struct ethtool_pauseparam *pause)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic_dev *idev = &lif->ionic->idev;
> +	uint8_t pause_type = idev->port_info->config.pause_type;
> +
> +	pause->autoneg = 0;
> +
> +	if (pause_type) {
> +		pause->rx_pause = pause_type & IONIC_PAUSE_F_RX ? 1 : 0;
> +		pause->tx_pause = pause_type & IONIC_PAUSE_F_TX ? 1 : 0;
> +	}
> +}
> +
> +static int ionic_set_pauseparam(struct net_device *netdev,
> +				struct ethtool_pauseparam *pause)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic *ionic = lif->ionic;
> +	struct ionic_dev *idev = &lif->ionic->idev;
> +	u32 requested_pause;
> +	int err;
> +
> +	if (pause->autoneg == AUTONEG_ENABLE) {
> +		netdev_info(netdev, "Please use 'ethtool -s ...' to change autoneg\n");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	/* change both at the same time */
> +	requested_pause = PORT_PAUSE_TYPE_LINK;
> +	if (pause->rx_pause)
> +		requested_pause |= IONIC_PAUSE_F_RX;
> +	if (pause->tx_pause)
> +		requested_pause |= IONIC_PAUSE_F_TX;
> +
> +	if (requested_pause == idev->port_info->config.pause_type)
> +		return 0;
> +
> +	idev->port_info->config.pause_type = requested_pause;
> +
> +	mutex_lock(&ionic->dev_cmd_lock);
> +	ionic_dev_cmd_port_pause(idev, requested_pause);
> +	err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
> +	mutex_unlock(&ionic->dev_cmd_lock);
> +	if (err)
> +		return err;
> +
> +	return 0;
> +}
> +
> +static int ionic_get_coalesce(struct net_device *netdev,
> +			      struct ethtool_coalesce *coalesce)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +
> +	coalesce->tx_coalesce_usecs = lif->tx_coalesce_usecs;
> +	coalesce->rx_coalesce_usecs = lif->rx_coalesce_usecs;
> +
> +	return 0;
> +}
> +
> +static void ionic_get_ringparam(struct net_device *netdev,
> +				struct ethtool_ringparam *ring)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +
> +	ring->tx_max_pending = IONIC_MAX_TXRX_DESC;
> +	ring->tx_pending = lif->ntxq_descs;
> +	ring->rx_max_pending = IONIC_MAX_TXRX_DESC;
> +	ring->rx_pending = lif->nrxq_descs;
> +}
> +
> +static int ionic_set_ringparam(struct net_device *netdev,
> +			       struct ethtool_ringparam *ring)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	bool running;
> +
> +	if (ring->rx_mini_pending || ring->rx_jumbo_pending) {
> +		netdev_info(netdev, "Changing jumbo or mini descriptors not supported\n");
> +		return -EINVAL;
> +	}
> +
> +	if (!is_power_of_2(ring->tx_pending) ||
> +	    !is_power_of_2(ring->rx_pending)) {
> +		netdev_info(netdev, "Descriptor count must be a power of 2\n");
> +		return -EINVAL;
> +	}
> +
> +	/* if nothing to do return success */
> +	if (ring->tx_pending == lif->ntxq_descs &&
> +	    ring->rx_pending == lif->nrxq_descs)
> +		return 0;
> +
> +	while (test_and_set_bit(LIF_QUEUE_RESET, lif->state))
> +		usleep_range(200, 400);
> +
> +	running = test_bit(LIF_UP, lif->state);
> +	if (running)
> +		ionic_stop(netdev);
> +
> +	lif->ntxq_descs = ring->tx_pending;
> +	lif->nrxq_descs = ring->rx_pending;
> +
> +	if (running)
> +		ionic_open(netdev);
> +	clear_bit(LIF_QUEUE_RESET, lif->state);
> +
> +	return 0;
> +}
> +
> +static void ionic_get_channels(struct net_device *netdev,
> +			       struct ethtool_channels *ch)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +
> +	/* report maximum channels */
> +	ch->max_combined = lif->ionic->ntxqs_per_lif;
> +
> +	/* report current channels */
> +	ch->combined_count = lif->nxqs;
> +}
> +
> +static int ionic_set_channels(struct net_device *netdev,
> +			      struct ethtool_channels *ch)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	bool running;
> +
> +	if (!ch->combined_count || ch->other_count ||
> +	    ch->rx_count || ch->tx_count)
> +		return -EINVAL;
> +
> +	if (ch->combined_count == lif->nxqs)
> +		return 0;
> +
> +	while (test_and_set_bit(LIF_QUEUE_RESET, lif->state))
> +		usleep_range(200, 400);
> +

I see this pattern recurring a lot in the driver. I suggest adding a helper
function (e.g. wait_pending_reset_timeout) that returns a timeout errno
after a reasonable amount of time, especially in user-context flows.
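
Something along these lines, written here as a userspace sketch with C11
atomics rather than the kernel bit helpers. The reset_gate struct,
now_ns() and reset_done() are hypothetical stand-ins for the
LIF_QUEUE_RESET bit handling, and the 200 us nap only mirrors the lower
bound of the usleep_range() in the quoted loop:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Userspace stand-in for the LIF_QUEUE_RESET state bit. */
struct reset_gate {
	atomic_flag busy;
};

/* Monotonic clock in nanoseconds. */
static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Try to take ownership of the reset gate. Sleeps between attempts and
 * gives up with -ETIMEDOUT once timeout_ns has elapsed, instead of
 * spinning forever the way the open-coded loop does.
 * Returns 0 once the gate is owned by the caller.
 */
static int wait_pending_reset_timeout(struct reset_gate *gate,
				      int64_t timeout_ns)
{
	const struct timespec nap = { .tv_sec = 0, .tv_nsec = 200 * 1000 };
	int64_t deadline = now_ns() + timeout_ns;

	while (atomic_flag_test_and_set(&gate->busy)) {
		if (now_ns() >= deadline)
			return -ETIMEDOUT;
		nanosleep(&nap, NULL);
	}

	return 0;
}

/* Counterpart of clear_bit(LIF_QUEUE_RESET, ...). */
static void reset_done(struct reset_gate *gate)
{
	atomic_flag_clear(&gate->busy);
}
```

Callers such as the ringparam and channels setters would then return the
error to userspace instead of blocking indefinitely.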

> +	running = test_bit(LIF_UP, lif->state);
> +	if (running)
> +		ionic_stop(netdev);
> +
> +	lif->nxqs = ch->combined_count;
> +
> +	if (running)
> +		ionic_open(netdev);
> +	clear_bit(LIF_QUEUE_RESET, lif->state);
> +
> +	return 0;
> +}
> +
> +static int ionic_get_module_info(struct net_device *netdev,
> +				 struct ethtool_modinfo *modinfo)
> +
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic_dev *idev = &lif->ionic->idev;
> +	struct xcvr_status *xcvr;
> +
> +	xcvr = &idev->port_info->status.xcvr;
> +
> +	/* report the module data type and length */
> +	switch (xcvr->sprom[0]) {
> +	case 0x03: /* SFP */
> +		modinfo->type = ETH_MODULE_SFF_8079;
> +		modinfo->eeprom_len = ETH_MODULE_SFF_8079_LEN;
> +		break;
> +	case 0x0D: /* QSFP */
> +	case 0x11: /* QSFP28 */
> +		modinfo->type = ETH_MODULE_SFF_8436;
> +		modinfo->eeprom_len = ETH_MODULE_SFF_8436_LEN;
> +		break;
> +	default:
> +		netdev_info(netdev, "unknown xcvr type 0x%02x\n",
> +			    xcvr->sprom[0]);
> +		break;
> +	}
> +
> +	return 0;
> +}
> +
> +static int ionic_get_module_eeprom(struct net_device *netdev,
> +				   struct ethtool_eeprom *ee,
> +				   u8 *data)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic_dev *idev = &lif->ionic->idev;
> +	struct xcvr_status *xcvr;
> +	u32 len;
> +
> +	/* The NIC keeps the module prom up-to-date in the DMA space
> +	 * so we can simply copy the module bytes into the data buffer.
> +	 */
> +	xcvr = &idev->port_info->status.xcvr;
> +	len = min_t(u32, sizeof(xcvr->sprom), ee->len);
> +	memcpy(data, xcvr->sprom, len);
> +
> +	return 0;
> +}
> +
> +static int ionic_nway_reset(struct net_device *netdev)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic *ionic = lif->ionic;
> +	int err = 0;
> +
> +	/* flap the link to force auto-negotiation */
> +
> +	mutex_lock(&ionic->dev_cmd_lock);
> +
> +	ionic_dev_cmd_port_state(&ionic->idev, PORT_ADMIN_STATE_DOWN);
> +	err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
> +
> +	if (!err) {
> +		ionic_dev_cmd_port_state(&ionic->idev,
> PORT_ADMIN_STATE_UP);
> +		err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
> +	}
> +
> +	mutex_unlock(&ionic->dev_cmd_lock);
> +
> +	return err;
> +}
> +
> +static const struct ethtool_ops ionic_ethtool_ops = {
> +	.get_drvinfo		= ionic_get_drvinfo,
> +	.get_regs_len		= ionic_get_regs_len,
> +	.get_regs		= ionic_get_regs,
> +	.get_link		= ethtool_op_get_link,
> +	.get_link_ksettings	= ionic_get_link_ksettings,
> +	.get_coalesce		= ionic_get_coalesce,
> +	.get_ringparam		= ionic_get_ringparam,
> +	.set_ringparam		= ionic_set_ringparam,
> +	.get_channels		= ionic_get_channels,
> +	.set_channels		= ionic_set_channels,
> +	.get_module_info	= ionic_get_module_info,
> +	.get_module_eeprom	= ionic_get_module_eeprom,
> +	.get_pauseparam		= ionic_get_pauseparam,
> +	.set_pauseparam		= ionic_set_pauseparam,
> +	.set_link_ksettings	= ionic_set_link_ksettings,
> +	.nway_reset		= ionic_nway_reset,
> +};
> +
> +void ionic_ethtool_set_ops(struct net_device *netdev)
> +{
> +	netdev->ethtool_ops = &ionic_ethtool_ops;
> +}
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.h
> b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.h
> new file mode 100644
> index 000000000000..38b91b1d70ae
> --- /dev/null
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.h
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright(c) 2017 - 2019 Pensando Systems, Inc */
> +
> +#ifndef _IONIC_ETHTOOL_H_
> +#define _IONIC_ETHTOOL_H_
> +
> +void ionic_ethtool_set_ops(struct net_device *netdev);
> +
> +#endif /* _IONIC_ETHTOOL_H_ */
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> index f52af9cb6264..2bd8ce61c4a0 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> @@ -10,6 +10,7 @@
>  #include "ionic.h"
>  #include "ionic_bus.h"
>  #include "ionic_lif.h"
> +#include "ionic_ethtool.h"
>  #include "ionic_debugfs.h"
>  
>  static void ionic_lif_rx_mode(struct lif *lif, unsigned int
> rx_mode);
> @@ -980,6 +981,7 @@ static struct lif *ionic_lif_alloc(struct ionic
> *ionic, unsigned int index)
>  	lif->netdev = netdev;
>  	ionic->master_lif = lif;
>  	netdev->netdev_ops = &ionic_netdev_ops;
> +	ionic_ethtool_set_ops(netdev);
>  
>  	netdev->watchdog_timeo = 2 * HZ;
>  	netdev->min_mtu = IONIC_MIN_MTU;
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> index 9930b9390c8a..d8589a306aa5 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> @@ -111,6 +111,8 @@ struct lif {
>  	u64 last_eid;
>  	unsigned int neqs;
>  	unsigned int nxqs;
> +	unsigned int ntxq_descs;
> +	unsigned int nrxq_descs;
>  	unsigned int rx_mode;
>  	u64 hw_features;
>  	bool mc_overflow;
> @@ -124,6 +126,8 @@ struct lif {
>  
>  	struct rx_filters rx_filters;
>  	struct ionic_deferred deferred;
> +	u32 tx_coalesce_usecs;
> +	u32 rx_coalesce_usecs;
>  	unsigned long *dbid_inuse;
>  	unsigned int dbid_count;
>  	struct dentry *dentry;
> @@ -165,6 +169,10 @@ int ionic_lif_identify(struct ionic *ionic, u8
> lif_type,
>  		       union lif_identity *lif_ident);
>  int ionic_lifs_size(struct ionic *ionic);
>  
> +int ionic_open(struct net_device *netdev);
> +int ionic_stop(struct net_device *netdev);
> +int ionic_reset_queues(struct lif *lif);
> +
>  static inline void debug_stats_napi_poll(struct qcq *qcq,
>  					 unsigned int work_done)
>  {


* [PATCH net] selftests/net: add missing gitignores (ipv6_flowlabel)
From: Jakub Kicinski @ 2019-07-25  0:07 UTC (permalink / raw)
  To: davem; +Cc: netdev, oss-drivers, willemb, Jakub Kicinski, Quentin Monnet

ipv6_flowlabel and ipv6_flowlabel_mgr are missing from
gitignore.  Quentin points out that the original
commit 3fb321fde22d ("selftests/net: ipv6 flowlabel")
did add ignore entries, they are just missing the "ipv6_"
prefix.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 tools/testing/selftests/net/.gitignore | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore
index 4ce0bc1612f5..c7cced739c34 100644
--- a/tools/testing/selftests/net/.gitignore
+++ b/tools/testing/selftests/net/.gitignore
@@ -17,7 +17,7 @@ tcp_inq
 tls
 txring_overwrite
 ip_defrag
+ipv6_flowlabel
+ipv6_flowlabel_mgr
 so_txtime
-flowlabel
-flowlabel_mgr
 tcp_fastopen_backup_key
-- 
2.21.0



* Re: [PATCH v4 net-next 12/19] ionic: Add async link status check and basic stats
From: Saeed Mahameed @ 2019-07-25  0:04 UTC (permalink / raw)
  To: snelson@pensando.io, netdev@vger.kernel.org, davem@davemloft.net
In-Reply-To: <20190722214023.9513-13-snelson@pensando.io>

On Mon, 2019-07-22 at 14:40 -0700, Shannon Nelson wrote:
> Add code to handle the link status event, and wire up the
> basic netdev hardware stats.
> 
> Signed-off-by: Shannon Nelson <snelson@pensando.io>
> ---
>  .../net/ethernet/pensando/ionic/ionic_lif.c   | 116
> ++++++++++++++++++
>  .../net/ethernet/pensando/ionic/ionic_lif.h   |   1 +
>  2 files changed, 117 insertions(+)
> 
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> index efcda1337f91..f52af9cb6264 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> @@ -15,6 +15,7 @@
>  static void ionic_lif_rx_mode(struct lif *lif, unsigned int
> rx_mode);
>  static int ionic_lif_addr_add(struct lif *lif, const u8 *addr);
>  static int ionic_lif_addr_del(struct lif *lif, const u8 *addr);
> +static void ionic_link_status_check(struct lif *lif);
>  
>  static int ionic_set_nic_features(struct lif *lif, netdev_features_t
> features);
>  static int ionic_notifyq_clean(struct lif *lif, int budget);
> @@ -44,6 +45,9 @@ static void ionic_lif_deferred_work(struct
> work_struct *work)
>  		case DW_TYPE_RX_ADDR_DEL:
>  			ionic_lif_addr_del(lif, w->addr);
>  			break;
> +		case DW_TYPE_LINK_STATUS:
> +			ionic_link_status_check(lif);
> +			break;
>  		default:
>  			break;
>  		}
> @@ -69,6 +73,7 @@ int ionic_open(struct net_device *netdev)
>  
>  	set_bit(LIF_UP, lif->state);
>  
> +	ionic_link_status_check(lif);
>  	if (netif_carrier_ok(netdev))
>  		netif_tx_wake_all_queues(netdev);
>  
> @@ -151,6 +156,39 @@ static int ionic_adminq_napi(struct napi_struct
> *napi, int budget)
>  	return max(n_work, a_work);
>  }
>  
> +static void ionic_link_status_check(struct lif *lif)
> +{
> +	struct net_device *netdev = lif->netdev;
> +	u16 link_status;
> +	bool link_up;
> +
> +	clear_bit(LIF_LINK_CHECK_NEEDED, lif->state);
> +
> +	link_status = le16_to_cpu(lif->info->status.link_status);
> +	link_up = link_status == PORT_OPER_STATUS_UP;
> +
> +	/* filter out the no-change cases */
> +	if (link_up == netif_carrier_ok(netdev))
> +		return;
> +
> +	if (link_up) {
> +		netdev_info(netdev, "Link up - %d Gbps\n",
> +			    le32_to_cpu(lif->info->status.link_speed) /
> 1000);
> +
> +		if (test_bit(LIF_UP, lif->state)) {
> +			netif_tx_wake_all_queues(lif->netdev);
> +			netif_carrier_on(netdev);
> +		}
> +	} else {
> +		netdev_info(netdev, "Link down\n");
> +
> +		/* carrier off first to avoid watchdog timeout */
> +		netif_carrier_off(netdev);
> +		if (test_bit(LIF_UP, lif->state))
> +			netif_tx_stop_all_queues(netdev);
> +	}
> +}
> +
>  static bool ionic_notifyq_service(struct cq *cq, struct cq_info
> *cq_info)
>  {
>  	union notifyq_comp *comp = cq_info->cq_desc;
> @@ -182,6 +220,9 @@ static bool ionic_notifyq_service(struct cq *cq,
> struct cq_info *cq_info)
>  			    "  link_status=%d link_speed=%d\n",
>  			    le16_to_cpu(comp->link_change.link_status),
>  			    le32_to_cpu(comp->link_change.link_speed));
> +
> +		set_bit(LIF_LINK_CHECK_NEEDED, lif->state);
> +
>  		break;
>  	case EVENT_OPCODE_RESET:
>  		netdev_info(netdev, "Notifyq EVENT_OPCODE_RESET
> eid=%lld\n",
> @@ -222,10 +263,81 @@ static int ionic_notifyq_clean(struct lif *lif,
> int budget)
>  	if (work_done == budget)
>  		goto return_to_napi;
>  
> +	/* After outstanding events are processed we can check on
> +	 * the link status and any outstanding interrupt credits.
> +	 *
> +	 * We wait until here to check on the link status in case
> +	 * there was a long list of link events from a flap episode.
> +	 */
> +	if (test_bit(LIF_LINK_CHECK_NEEDED, lif->state)) {
> +		struct ionic_deferred_work *work;
> +
> +		work = kzalloc(sizeof(*work), GFP_ATOMIC);
> +		if (!work) {
> +			netdev_err(lif->netdev, "%s OOM\n", __func__);

Why not have a preallocated, dedicated lif->link_check_work instead of
allocating in atomic context on every link check event?
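
Something along these lines, modeled in userspace. All struct and
function names below are hypothetical. In the driver this would
presumably be a work_struct embedded in struct lif, set up once with
INIT_WORK() and queued with schedule_work(), which already refuses to
queue an item that is still pending.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical work item that lives inside the lif, so no allocation
 * is needed when a link event arrives.
 */
struct link_check_work {
	atomic_bool queued;	/* true while the item is pending */
};

struct lif_model {
	struct link_check_work link_check_work;
	unsigned int checks_run;
};

/* Returns true if the item was newly queued, false if already pending. */
static bool link_check_enqueue(struct lif_model *lif)
{
	return !atomic_exchange(&lif->link_check_work.queued, true);
}

/* Worker side: run one link check if the item is pending. */
static void link_check_run(struct lif_model *lif)
{
	if (atomic_exchange(&lif->link_check_work.queued, false))
		lif->checks_run++;
}
```

A burst of link events then collapses into a single pending check, and
the OOM path goes away.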

> +		} else {
> +			work->type = DW_TYPE_LINK_STATUS;
> +			ionic_lif_deferred_enqueue(&lif->deferred,
> work);
> +		}
> +	}
> +
>  return_to_napi:
>  	return work_done;
>  }
>  
> +static void ionic_get_stats64(struct net_device *netdev,
> +			      struct rtnl_link_stats64 *ns)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct lif_stats *ls;
> +
> +	memset(ns, 0, sizeof(*ns));
> +	ls = &lif->info->stats;
> +
> +	ns->rx_packets = le64_to_cpu(ls->rx_ucast_packets) +
> +			 le64_to_cpu(ls->rx_mcast_packets) +
> +			 le64_to_cpu(ls->rx_bcast_packets);
> +
> +	ns->tx_packets = le64_to_cpu(ls->tx_ucast_packets) +
> +			 le64_to_cpu(ls->tx_mcast_packets) +
> +			 le64_to_cpu(ls->tx_bcast_packets);
> +
> +	ns->rx_bytes = le64_to_cpu(ls->rx_ucast_bytes) +
> +		       le64_to_cpu(ls->rx_mcast_bytes) +
> +		       le64_to_cpu(ls->rx_bcast_bytes);
> +
> +	ns->tx_bytes = le64_to_cpu(ls->tx_ucast_bytes) +
> +		       le64_to_cpu(ls->tx_mcast_bytes) +
> +		       le64_to_cpu(ls->tx_bcast_bytes);
> +
> +	ns->rx_dropped = le64_to_cpu(ls->rx_ucast_drop_packets) +
> +			 le64_to_cpu(ls->rx_mcast_drop_packets) +
> +			 le64_to_cpu(ls->rx_bcast_drop_packets);
> +
> +	ns->tx_dropped = le64_to_cpu(ls->tx_ucast_drop_packets) +
> +			 le64_to_cpu(ls->tx_mcast_drop_packets) +
> +			 le64_to_cpu(ls->tx_bcast_drop_packets);
> +
> +	ns->multicast = le64_to_cpu(ls->rx_mcast_packets);
> +
> +	ns->rx_over_errors = le64_to_cpu(ls->rx_queue_empty);
> +
> +	ns->rx_missed_errors = le64_to_cpu(ls->rx_dma_error) +
> +			       le64_to_cpu(ls->rx_queue_disabled) +
> +			       le64_to_cpu(ls->rx_desc_fetch_error) +
> +			       le64_to_cpu(ls->rx_desc_data_error);
> +
> +	ns->tx_aborted_errors = le64_to_cpu(ls->tx_dma_error) +
> +				le64_to_cpu(ls->tx_queue_disabled) +
> +				le64_to_cpu(ls->tx_desc_fetch_error) +
> +				le64_to_cpu(ls->tx_desc_data_error);
> +
> +	ns->rx_errors = ns->rx_over_errors +
> +			ns->rx_missed_errors;
> +
> +	ns->tx_errors = ns->tx_aborted_errors;
> +}
> +
>  static int ionic_lif_addr_add(struct lif *lif, const u8 *addr)
>  {
>  	struct ionic_admin_ctx ctx = {
> @@ -581,6 +693,7 @@ static int ionic_vlan_rx_kill_vid(struct
> net_device *netdev, __be16 proto,
>  static const struct net_device_ops ionic_netdev_ops = {
>  	.ndo_open               = ionic_open,
>  	.ndo_stop               = ionic_stop,
> +	.ndo_get_stats64	= ionic_get_stats64,
>  	.ndo_set_rx_mode	= ionic_set_rx_mode,
>  	.ndo_set_features	= ionic_set_features,
>  	.ndo_set_mac_address	= ionic_set_mac_address,
> @@ -1418,6 +1531,8 @@ static int ionic_lif_init(struct lif *lif)
>  
>  	set_bit(LIF_INITED, lif->state);
>  
> +	ionic_link_status_check(lif);
> +
>  	return 0;
>  
>  err_out_notifyq_deinit:
> @@ -1461,6 +1576,7 @@ int ionic_lifs_register(struct ionic *ionic)
>  		return err;
>  	}
>  

Are events (NotifyQ) enabled at this stage? If so, you might end up
racing ionic_link_status_check() with itself.

> +	ionic_link_status_check(ionic->master_lif);
>  	ionic->master_lif->registered = true;
>  
>  	return 0;
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> index 20b4fa573f77..9930b9390c8a 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> @@ -86,6 +86,7 @@ struct ionic_deferred {
>  enum lif_state_flags {
>  	LIF_INITED,
>  	LIF_UP,
> +	LIF_LINK_CHECK_NEEDED,
>  	LIF_QUEUE_RESET,
>  
>  	/* leave this as last */


* Re: [PATCH bpf-next 01/10] libbpf: add .BTF.ext offset relocation section loading
From: Song Liu @ 2019-07-25  0:00 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Yonghong Song, andrii.nakryiko@gmail.com, Kernel Team
In-Reply-To: <20190724192742.1419254-2-andriin@fb.com>



> On Jul 24, 2019, at 12:27 PM, Andrii Nakryiko <andriin@fb.com> wrote:
> 
> Add support for BPF CO-RE offset relocations. Add section/record
> iteration macros for .BTF.ext. These macro are useful for iterating over
> each .BTF.ext record, either for dumping out contents or later for BPF
> CO-RE relocation handling.
> 
> To enable other parts of libbpf to work with .BTF.ext contents, moved
> a bunch of type definitions into libbpf_internal.h.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
> tools/lib/bpf/btf.c             | 64 +++++++++--------------
> tools/lib/bpf/btf.h             |  4 ++
> tools/lib/bpf/libbpf_internal.h | 91 +++++++++++++++++++++++++++++++++
> 3 files changed, 118 insertions(+), 41 deletions(-)
> 
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 467224feb43b..4a36bc783848 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -42,47 +42,6 @@ struct btf {
> 	int fd;
> };
> 
> -struct btf_ext_info {
> -	/*
> -	 * info points to the individual info section (e.g. func_info and
> -	 * line_info) from the .BTF.ext. It does not include the __u32 rec_size.
> -	 */
> -	void *info;
> -	__u32 rec_size;
> -	__u32 len;
> -};
> -
> -struct btf_ext {
> -	union {
> -		struct btf_ext_header *hdr;
> -		void *data;
> -	};
> -	struct btf_ext_info func_info;
> -	struct btf_ext_info line_info;
> -	__u32 data_size;
> -};
> -
> -struct btf_ext_info_sec {
> -	__u32	sec_name_off;
> -	__u32	num_info;
> -	/* Followed by num_info * record_size number of bytes */
> -	__u8	data[0];
> -};
> -
> -/* The minimum bpf_func_info checked by the loader */
> -struct bpf_func_info_min {
> -	__u32   insn_off;
> -	__u32   type_id;
> -};
> -
> -/* The minimum bpf_line_info checked by the loader */
> -struct bpf_line_info_min {
> -	__u32	insn_off;
> -	__u32	file_name_off;
> -	__u32	line_off;
> -	__u32	line_col;
> -};
> -
> static inline __u64 ptr_to_u64(const void *ptr)
> {
> 	return (__u64) (unsigned long) ptr;
> @@ -831,6 +790,9 @@ static int btf_ext_setup_info(struct btf_ext *btf_ext,
> 	/* The start of the info sec (including the __u32 record_size). */
> 	void *info;
> 
> +	if (ext_sec->len == 0)
> +		return 0;
> +
> 	if (ext_sec->off & 0x03) {
> 		pr_debug(".BTF.ext %s section is not aligned to 4 bytes\n",
> 		     ext_sec->desc);
> @@ -934,6 +896,19 @@ static int btf_ext_setup_line_info(struct btf_ext *btf_ext)
> 	return btf_ext_setup_info(btf_ext, &param);
> }
> 
> +static int btf_ext_setup_offset_reloc(struct btf_ext *btf_ext)
> +{
> +	struct btf_ext_sec_setup_param param = {
> +		.off = btf_ext->hdr->offset_reloc_off,
> +		.len = btf_ext->hdr->offset_reloc_len,
> +		.min_rec_size = sizeof(struct bpf_offset_reloc),
> +		.ext_info = &btf_ext->offset_reloc_info,
> +		.desc = "offset_reloc",
> +	};
> +
> +	return btf_ext_setup_info(btf_ext, &param);
> +}
> +
> static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
> {
> 	const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
> @@ -1004,6 +979,13 @@ struct btf_ext *btf_ext__new(__u8 *data, __u32 size)
> 	if (err)
> 		goto done;
> 
> +	/* check if there is offset_reloc_off/offset_reloc_len fields */
> +	if (btf_ext->hdr->hdr_len < sizeof(struct btf_ext_header))

This check will break when we add more optional sections to btf_ext_header.
Maybe use offsetof() instead?
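
For illustration, a standalone sketch of a check keyed to the end of
one specific field rather than to the size of the whole struct. The
header layout is a model (the leading fields are assumed from btf.h),
and FIELD_END and hdr_has_offset_reloc are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of struct btf_ext_header with the new optional fields. */
struct btf_ext_header_model {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;
	uint32_t func_info_off;
	uint32_t func_info_len;
	uint32_t line_info_off;
	uint32_t line_info_len;
	/* optional part of the header */
	uint32_t offset_reloc_off;
	uint32_t offset_reloc_len;
};

/* Offset of the first byte past the given member. */
#define FIELD_END(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* True if a header of hdr_len bytes contains both offset_reloc fields. */
static bool hdr_has_offset_reloc(uint32_t hdr_len)
{
	return hdr_len >= FIELD_END(struct btf_ext_header_model,
				    offset_reloc_len);
}
```

With this form the check keeps working if more optional fields are
appended after offset_reloc_len later on.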

> +		goto done;
> +	err = btf_ext_setup_offset_reloc(btf_ext);
> +	if (err)
> +		goto done;
> +
> done:
> 	if (err) {
> 		btf_ext__free(btf_ext);
> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> index 88a52ae56fc6..287361ee1f6b 100644
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -57,6 +57,10 @@ struct btf_ext_header {
> 	__u32	func_info_len;
> 	__u32	line_info_off;
> 	__u32	line_info_len;
> +
> +	/* optional part of .BTF.ext header */
> +	__u32	offset_reloc_off;
> +	__u32	offset_reloc_len;
> };
> 
> LIBBPF_API void btf__free(struct btf *btf);
> diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
> index 2ac29bd36226..087ff512282f 100644
> --- a/tools/lib/bpf/libbpf_internal.h
> +++ b/tools/lib/bpf/libbpf_internal.h
> @@ -46,4 +46,95 @@ do {				\
> int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
> 			 const char *str_sec, size_t str_len);
> 
> +struct btf_ext_info {
> +	/*
> +	 * info points to the individual info section (e.g. func_info and
> +	 * line_info) from the .BTF.ext. It does not include the __u32 rec_size.
> +	 */
> +	void *info;
> +	__u32 rec_size;
> +	__u32 len;
> +};
> +
> +#define for_each_btf_ext_sec(seg, sec)					\
> +	for (sec = (seg)->info;						\
> +	     (void *)sec < (seg)->info + (seg)->len;			\
> +	     sec = (void *)sec + sizeof(struct btf_ext_info_sec) +	\
> +		   (seg)->rec_size * sec->num_info)
> +
> +#define for_each_btf_ext_rec(seg, sec, i, rec)				\
> +	for (i = 0, rec = (void *)&(sec)->data;				\
> +	     i < (sec)->num_info;					\
> +	     i++, rec = (void *)rec + (seg)->rec_size)
> +
> +struct btf_ext {
> +	union {
> +		struct btf_ext_header *hdr;
> +		void *data;
> +	};
> +	struct btf_ext_info func_info;
> +	struct btf_ext_info line_info;
> +	struct btf_ext_info offset_reloc_info;
> +	__u32 data_size;
> +};
> +
> +struct btf_ext_info_sec {
> +	__u32	sec_name_off;
> +	__u32	num_info;
> +	/* Followed by num_info * record_size number of bytes */
> +	__u8	data[0];
> +};
> +
> +/* The minimum bpf_func_info checked by the loader */
> +struct bpf_func_info_min {
> +	__u32   insn_off;
> +	__u32   type_id;
> +};
> +
> +/* The minimum bpf_line_info checked by the loader */
> +struct bpf_line_info_min {
> +	__u32	insn_off;
> +	__u32	file_name_off;
> +	__u32	line_off;
> +	__u32	line_col;
> +};
> +
> +/* The minimum bpf_offset_reloc checked by the loader
> + *
> + * Offset relocation captures the following data:
> + * - insn_off - instruction offset (in bytes) within a BPF program that needs
> + *   its insn->imm field to be relocated with actual offset;
> + * - type_id - BTF type ID of the "root" (containing) entity of a relocatable
> + *   offset;
> + * - access_str_off - offset into corresponding .BTF string section. String
> + *   itself encodes an accessed field using a sequence of field and array
> + *   indicies, separated by colon (:). It's conceptually very close to LLVM's
> + *   getelementptr ([0]) instruction's arguments for identifying offset to 
> + *   a field.
> + *
> + * Example to provide a better feel.
> + *
> + *   struct sample {
> + *       int a;
> + *       struct {
> + *           int b[10];
> + *       };
> + *   };
> + * 
> + *   struct sample *s = ...;
> + *   int x = &s->a;     // encoded as "0:0" (a is field #0)
> + *   int y = &s->b[5];  // encoded as "0:1:5" (b is field #1, arr elem #5)
> + *   int z = &s[10]->b; // encoded as "10:1" (ptr is used as an array)
> + *
> + * type_id for all relocs in this example  will capture BTF type id of
> + * `struct sample`.
> + *
> + *   [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction
> + */
> +struct bpf_offset_reloc {
> +	__u32   insn_off;
> +	__u32   type_id;
> +	__u32   access_str_off;
> +};
> +
> #endif /* __LIBBPF_LIBBPF_INTERNAL_H */
> -- 
> 2.17.1
> 
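
To make the access string encoding from the bpf_offset_reloc comment
above concrete, here is a small standalone sketch that splits a spec
such as "0:1:5" into its indices. parse_access_spec is a hypothetical
helper, not a libbpf function, and turning the indices into a byte
offset would still need the BTF type information.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse a colon-separated access string into numeric indices.
 * Returns the number of indices stored in out[], or -1 if the string
 * is empty, malformed, or holds more than max indices.
 */
static int parse_access_spec(const char *spec, unsigned int *out, size_t max)
{
	size_t n = 0;

	while (*spec) {
		char *end;
		unsigned long v;

		/* every component must start with a digit */
		if (*spec < '0' || *spec > '9' || n == max)
			return -1;

		v = strtoul(spec, &end, 10);
		if (v > UINT_MAX)
			return -1;
		out[n++] = (unsigned int)v;

		if (*end == ':') {
			if (end[1] == '\0')
				return -1;	/* trailing colon */
			spec = end + 1;
		} else if (*end == '\0') {
			spec = end;
		} else {
			return -1;		/* unexpected character */
		}
	}

	return n ? (int)n : -1;
}
```

For "0:1:5" this yields the three indices 0, 1 and 5, matching the
&s->b[5] case in the comment.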



* Re: [PATCH bpf-next 5/7] sefltests/bpf: support FLOW_DISSECTOR_F_PARSE_1ST_FRAG
From: Stanislav Fomichev @ 2019-07-24 23:52 UTC (permalink / raw)
  To: Song Liu
  Cc: Stanislav Fomichev, Networking, bpf, David S . Miller,
	Alexei Starovoitov, Daniel Borkmann, Willem de Bruijn,
	Petar Penkov
In-Reply-To: <CAPhsuW6Z2Bx66ZDOV-9jW+hsxKbZJxY-YFgP0rL_4QipAuptQA@mail.gmail.com>

On 07/24, Song Liu wrote:
> On Wed, Jul 24, 2019 at 10:11 AM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > bpf_flow.c: exit early unless FLOW_DISSECTOR_F_PARSE_1ST_FRAG is passed
> > in flags. Also, set ip_proto earlier, this makes sure we have correct
> > value with fragmented packets.
> >
> > Add selftest cases to test ipv4/ipv6 fragments and skip eth_get_headlen
> > tests that don't have FLOW_DISSECTOR_F_PARSE_1ST_FRAG flag.
> >
> > eth_get_headlen calls flow dissector with
> > FLOW_DISSECTOR_F_PARSE_1ST_FRAG flag so we can't run tests that
> > have different set of input flags against it.
> >
> > Cc: Willem de Bruijn <willemb@google.com>
> > Cc: Petar Penkov <ppenkov@google.com>
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  .../selftests/bpf/prog_tests/flow_dissector.c | 129 ++++++++++++++++++
> >  tools/testing/selftests/bpf/progs/bpf_flow.c  |  28 +++-
> >  2 files changed, 151 insertions(+), 6 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> > index c938283ac232..966cb3b06870 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> > @@ -5,6 +5,10 @@
> >  #include <linux/if_tun.h>
> >  #include <sys/uio.h>
> >
> > +#ifndef IP_MF
> > +#define IP_MF 0x2000
> > +#endif
> > +
> >  #define CHECK_FLOW_KEYS(desc, got, expected)                           \
> >         CHECK_ATTR(memcmp(&got, &expected, sizeof(got)) != 0,           \
> >               desc,                                                     \
> > @@ -49,6 +53,18 @@ struct ipv6_pkt {
> >         struct tcphdr tcp;
> >  } __packed;
> >
> > +struct ipv6_frag_pkt {
> > +       struct ethhdr eth;
> > +       struct ipv6hdr iph;
> > +       struct frag_hdr {
> > +               __u8 nexthdr;
> > +               __u8 reserved;
> > +               __be16 frag_off;
> > +               __be32 identification;
> > +       } ipf;
> > +       struct tcphdr tcp;
> > +} __packed;
> > +
> >  struct dvlan_ipv6_pkt {
> >         struct ethhdr eth;
> >         __u16 vlan_tci;
> > @@ -65,9 +81,11 @@ struct test {
> >                 struct ipv4_pkt ipv4;
> >                 struct svlan_ipv4_pkt svlan_ipv4;
> >                 struct ipv6_pkt ipv6;
> > +               struct ipv6_frag_pkt ipv6_frag;
> >                 struct dvlan_ipv6_pkt dvlan_ipv6;
> >         } pkt;
> >         struct bpf_flow_keys keys;
> > +       __u32 flags;
> >  };
> >
> >  #define VLAN_HLEN      4
> > @@ -143,6 +161,102 @@ struct test tests[] = {
> >                         .n_proto = __bpf_constant_htons(ETH_P_IPV6),
> >                 },
> >         },
> > +       {
> > +               .name = "ipv4-frag",
> > +               .pkt.ipv4 = {
> > +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IP),
> > +                       .iph.ihl = 5,
> > +                       .iph.protocol = IPPROTO_TCP,
> > +                       .iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> > +                       .iph.frag_off = __bpf_constant_htons(IP_MF),
> > +                       .tcp.doff = 5,
> > +                       .tcp.source = 80,
> > +                       .tcp.dest = 8080,
> > +               },
> > +               .keys = {
> > +                       .flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> > +                       .nhoff = ETH_HLEN,
> > +                       .thoff = ETH_HLEN + sizeof(struct iphdr),
> > +                       .addr_proto = ETH_P_IP,
> > +                       .ip_proto = IPPROTO_TCP,
> > +                       .n_proto = __bpf_constant_htons(ETH_P_IP),
> > +                       .is_frag = true,
> > +                       .is_first_frag = true,
> > +                       .sport = 80,
> > +                       .dport = 8080,
> > +               },
> > +               .flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> > +       },
> > +       {
> > +               .name = "ipv4-no-frag",
> > +               .pkt.ipv4 = {
> > +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IP),
> > +                       .iph.ihl = 5,
> > +                       .iph.protocol = IPPROTO_TCP,
> > +                       .iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> > +                       .iph.frag_off = __bpf_constant_htons(IP_MF),
> > +                       .tcp.doff = 5,
> > +                       .tcp.source = 80,
> > +                       .tcp.dest = 8080,
> > +               },
> > +               .keys = {
> > +                       .nhoff = ETH_HLEN,
> > +                       .thoff = ETH_HLEN + sizeof(struct iphdr),
> > +                       .addr_proto = ETH_P_IP,
> > +                       .ip_proto = IPPROTO_TCP,
> > +                       .n_proto = __bpf_constant_htons(ETH_P_IP),
> > +                       .is_frag = true,
> > +                       .is_first_frag = true,
> > +               },
> > +       },
> > +       {
> > +               .name = "ipv6-frag",
> > +               .pkt.ipv6_frag = {
> > +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
> > +                       .iph.nexthdr = IPPROTO_FRAGMENT,
> > +                       .iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
> > +                       .ipf.nexthdr = IPPROTO_TCP,
> > +                       .tcp.doff = 5,
> > +                       .tcp.source = 80,
> > +                       .tcp.dest = 8080,
> > +               },
> > +               .keys = {
> > +                       .flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> > +                       .nhoff = ETH_HLEN,
> > +                       .thoff = ETH_HLEN + sizeof(struct ipv6hdr) +
> > +                               sizeof(struct frag_hdr),
> > +                       .addr_proto = ETH_P_IPV6,
> > +                       .ip_proto = IPPROTO_TCP,
> > +                       .n_proto = __bpf_constant_htons(ETH_P_IPV6),
> > +                       .is_frag = true,
> > +                       .is_first_frag = true,
> > +                       .sport = 80,
> > +                       .dport = 8080,
> > +               },
> > +               .flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> > +       },
> > +       {
> > +               .name = "ipv6-no-frag",
> > +               .pkt.ipv6_frag = {
> > +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
> > +                       .iph.nexthdr = IPPROTO_FRAGMENT,
> > +                       .iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
> > +                       .ipf.nexthdr = IPPROTO_TCP,
> > +                       .tcp.doff = 5,
> > +                       .tcp.source = 80,
> > +                       .tcp.dest = 8080,
> > +               },
> > +               .keys = {
> > +                       .nhoff = ETH_HLEN,
> > +                       .thoff = ETH_HLEN + sizeof(struct ipv6hdr) +
> > +                               sizeof(struct frag_hdr),
> > +                       .addr_proto = ETH_P_IPV6,
> > +                       .ip_proto = IPPROTO_TCP,
> > +                       .n_proto = __bpf_constant_htons(ETH_P_IPV6),
> > +                       .is_frag = true,
> > +                       .is_first_frag = true,
> > +               },
> > +       },
> >  };
> >
> >  static int create_tap(const char *ifname)
> > @@ -225,6 +339,13 @@ void test_flow_dissector(void)
> >                         .data_size_in = sizeof(tests[i].pkt),
> >                         .data_out = &flow_keys,
> >                 };
> > +               static struct bpf_flow_keys ctx = {};
> > +
> > +               if (tests[i].flags) {
> > +                       tattr.ctx_in = &ctx;
> > +                       tattr.ctx_size_in = sizeof(ctx);
> > +                       ctx.flags = tests[i].flags;
> > +               }
> >
> >                 err = bpf_prog_test_run_xattr(&tattr);
> >                 CHECK_ATTR(tattr.data_size_out != sizeof(flow_keys) ||
> > @@ -255,6 +376,14 @@ void test_flow_dissector(void)
> >                 struct bpf_prog_test_run_attr tattr = {};
> >                 __u32 key = 0;
> >
> > +               /* Don't run tests that are not marked as
> > +                * FLOW_DISSECTOR_F_PARSE_1ST_FRAG; eth_get_headlen
> > +                * sets this flag.
> > +                */
> > +
> > +               if (tests[i].flags != FLOW_DISSECTOR_F_PARSE_1ST_FRAG)
> > +                       continue;
> 
> Maybe test flags & FLOW_DISSECTOR_F_PARSE_1ST_FRAG == 0 instead?
> It is not necessary now, but might be useful in the future.
I'm not sure about this one. We want flags here to match flags
from eth_get_headlen:

	const unsigned int flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG;
	...
	if (!skb_flow_dissect_flow_keys_basic(..., flags))

Otherwise the test might break unexpectedly. So I'd rather manually
adjust a test here if eth_get_headlen flags change.

Maybe I should clarify the comment to signify that dependency? Because
currently it might be read as if we only care about
FLOW_DISSECTOR_F_PARSE_1ST_FRAG, but we really care about all flags
in eth_get_headlen; it just happens that it only has one right now.
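
A small standalone sketch of the difference between the two checks.
The flag names and values are local stand-ins (F_OTHER is entirely
hypothetical), and eth_get_headlen_flags models the flags constant
quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

#define F_PARSE_1ST_FRAG	(1u << 0)	/* stand-in value */
#define F_OTHER			(1u << 1)	/* hypothetical second flag */

/* Models the flags that eth_get_headlen passes to the dissector. */
static const uint32_t eth_get_headlen_flags = F_PARSE_1ST_FRAG;

/* Exact match: run the test only if its flags equal the callee's. */
static bool run_exact(uint32_t test_flags)
{
	return test_flags == eth_get_headlen_flags;
}

/* Mask test: run if the bit is set, whatever else is set. Note the
 * parentheses: in C, == binds tighter than &.
 */
static bool run_mask(uint32_t test_flags)
{
	return (test_flags & F_PARSE_1ST_FRAG) != 0;
}
```

A test carrying F_PARSE_1ST_FRAG plus another flag would pass the mask
check but be dissected with different flags by eth_get_headlen, which
is why the exact comparison seems safer here.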


* Re: [PATCH v4 net-next 10/19] ionic: Add management of rx filters
From: Saeed Mahameed @ 2019-07-24 23:52 UTC (permalink / raw)
  To: snelson@pensando.io, netdev@vger.kernel.org, davem@davemloft.net
In-Reply-To: <20190722214023.9513-11-snelson@pensando.io>

On Mon, 2019-07-22 at 14:40 -0700, Shannon Nelson wrote:
> Set up the infrastructure for managing Rx filters.  We can't ask the
> hardware for what filters it has, so we keep a local list of filters
> that we've pushed into the HW.
> 
> Signed-off-by: Shannon Nelson <snelson@pensando.io>
> ---
>  drivers/net/ethernet/pensando/ionic/Makefile  |   4 +-
>  .../net/ethernet/pensando/ionic/ionic_lif.c   |   6 +
>  .../net/ethernet/pensando/ionic/ionic_lif.h   |   2 +
>  .../ethernet/pensando/ionic/ionic_rx_filter.c | 143 ++++++++++++++++++
>  .../ethernet/pensando/ionic/ionic_rx_filter.h |  35 +++++
>  5 files changed, 188 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/net/ethernet/pensando/ionic/ionic_rx_filter.c
>  create mode 100644 drivers/net/ethernet/pensando/ionic/ionic_rx_filter.h
> 
> 

[...]

> +#define RXQ_INDEX_ANY		(0xFFFF)
> +struct rx_filter {
> +	u32 flow_id;
> +	u32 filter_id;
> +	u16 rxq_index;
> +	struct rx_filter_add_cmd cmd;
> +	struct hlist_node by_hash;
> +	struct hlist_node by_id;
> +};
> +
> +#define RX_FILTER_HASH_BITS	10
> +#define RX_FILTER_HLISTS	BIT(RX_FILTER_HASH_BITS)
> +#define RX_FILTER_HLISTS_MASK	(RX_FILTER_HLISTS - 1)
> +struct rx_filters {
> +	spinlock_t lock;				/* filter list lock */
> +	struct hlist_head by_hash[RX_FILTER_HLISTS];	/* by skb hash */
> +	struct hlist_head by_id[RX_FILTER_HLISTS];	/* by filter_id */
> +};
> +
> 

Following Dave's comment on this: the struct and macro/define names here
are too generic. I strongly recommend adding a unique prefix for this
driver.
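
As a rough illustration of what a prefix would look like (a userspace sketch
only: the ionic_/IONIC_ names and the ionic_rx_filter_bucket() helper are
hypothetical, kernel types are replaced with stdint ones, and the list members
are left out):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical renaming: every driver-local name carries the driver prefix. */
#define IONIC_RXQ_INDEX_ANY		(0xFFFF)
#define IONIC_RX_FILTER_HASH_BITS	10
#define IONIC_RX_FILTER_HLISTS		(1u << IONIC_RX_FILTER_HASH_BITS)
#define IONIC_RX_FILTER_HLISTS_MASK	(IONIC_RX_FILTER_HLISTS - 1)

struct ionic_rx_filter {
	uint32_t flow_id;
	uint32_t filter_id;
	uint16_t rxq_index;
};

/* Assumed helper: map a filter_id to its bucket in a by-id table. */
static inline unsigned int ionic_rx_filter_bucket(uint32_t filter_id)
{
	unsigned int bucket = filter_id & IONIC_RX_FILTER_HLISTS_MASK;

	assert(bucket < IONIC_RX_FILTER_HLISTS);
	return bucket;
}
```

The point is only that names like rx_filter or RXQ_INDEX_ANY can collide with
other code, while the prefixed forms cannot.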

* Re: [PATCH net-next] netfilter: nf_table_offload: Fix zero prio of flow_cls_common_offload
From: Marcelo Ricardo Leitner @ 2019-07-24 23:51 UTC (permalink / raw)
  To: wenxu; +Cc: pablo, davem, netfilter-devel, netdev
In-Reply-To: <1562832210-25981-1-git-send-email-wenxu@ucloud.cn>

On Thu, Jul 11, 2019 at 04:03:30PM +0800, wenxu@ucloud.cn wrote:
> From: wenxu <wenxu@ucloud.cn>
> 
> The flow_cls_common_offload prio should be not zero
> 
> It leads the invalid table prio in hw.
> 
> # nft add table netdev firewall
> # nft add chain netdev firewall acl { type filter hook ingress device mlx_pf0vf0 priority - 300 \; }
> # nft add rule netdev firewall acl ip daddr 1.1.1.7 drop
> Error: Could not process rule: Invalid argument
> 
> kernel log
> mlx5_core 0000:81:00.0: E-Switch: Failed to create FDB Table err -22 (table prio: 65535, level: 0, size: 4194304)
> 
> Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support")
> Signed-off-by: wenxu <wenxu@ucloud.cn>
> ---
>  net/netfilter/nf_tables_offload.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
> index 2c33028..01d8133 100644
> --- a/net/netfilter/nf_tables_offload.c
> +++ b/net/netfilter/nf_tables_offload.c
> @@ -7,6 +7,8 @@
>  #include <net/netfilter/nf_tables_offload.h>
>  #include <net/pkt_cls.h>
>  
> +#define FLOW_OFFLOAD_DEFAUT_PRIO 1U
> +
>  static struct nft_flow_rule *nft_flow_rule_alloc(int num_actions)
>  {
>  	struct nft_flow_rule *flow;
> @@ -107,6 +109,7 @@ static void nft_flow_offload_common_init(struct flow_cls_common_offload *common,
>  					struct netlink_ext_ack *extack)
>  {
>  	common->protocol = proto;
> +	common->prio = TC_H_MAKE(FLOW_OFFLOAD_DEFAUT_PRIO << 16, 0);

Note that the tc semantics for this are to auto-generate a priority in
such cases, instead of using a default.

@tc_new_tfilter():
        if (prio == 0) {
                /* If no priority is provided by the user,
                 * we allocate one.
                 */
                if (n->nlmsg_flags & NLM_F_CREATE) {
                        prio = TC_H_MAKE(0x80000000U, 0U);
                        prio_allocate = true;
...
                if (prio_allocate)
                        prio = tcf_auto_prio(tcf_chain_tp_prev(chain,
                                                               &chain_info));
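
A simplified userspace model of that auto-allocation, for comparison with the
fixed default in the patch. The TC_H_* macros follow the layout in
linux/pkt_sched.h; struct model_tp and model_auto_prio() are hypothetical
stand-ins written from my reading of the tc code, so treat the details as an
assumption rather than the kernel's exact logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit layout as the TC_H_* helpers in linux/pkt_sched.h. */
#define TC_H_MAJ_MASK	0xFFFF0000U
#define TC_H_MIN_MASK	0x0000FFFFU
#define TC_H_MAJ(h)	((h) & TC_H_MAJ_MASK)
#define TC_H_MAKE(maj, min) (((maj) & TC_H_MAJ_MASK) | ((min) & TC_H_MIN_MASK))

/* Hypothetical stand-in for the one filter field this sketch needs. */
struct model_tp {
	uint32_t prio;
};

/*
 * With no existing filter, start from a high major number; otherwise pick
 * the major number just below the previous filter's priority.
 */
static inline uint32_t model_auto_prio(const struct model_tp *prev)
{
	uint32_t first = TC_H_MAKE(0xC0000000U, 0U);

	if (prev) {
		assert(prev->prio != 0);	/* 0 means "no priority given" */
		first = prev->prio - 1;
	}

	return TC_H_MAJ(first);
}
```

In this model each new filter gets a distinct priority below the previous one,
whereas TC_H_MAKE(1U << 16, 0) gives every offloaded rule the same value.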

>  	common->extack = extack;
>  }
>  
> -- 
> 1.8.3.1
> 

* Re: [PATCH v4 net-next 09/19] ionic: Add the basic NDO callbacks for netdev support
From: Saeed Mahameed @ 2019-07-24 23:45 UTC (permalink / raw)
  To: snelson@pensando.io, netdev@vger.kernel.org, davem@davemloft.net
In-Reply-To: <20190722214023.9513-10-snelson@pensando.io>

On Mon, 2019-07-22 at 14:40 -0700, Shannon Nelson wrote:
> Set up the initial NDO structure and callbacks for netdev
> to use, and register the netdev.  This will allow us to do
> a few basic operations on the device, but no traffic yet.
> 
> Signed-off-by: Shannon Nelson <snelson@pensando.io>
> ---
>  drivers/net/ethernet/pensando/ionic/ionic.h   |   1 +
>  .../ethernet/pensando/ionic/ionic_bus_pci.c   |   9 +
>  .../net/ethernet/pensando/ionic/ionic_dev.h   |   2 +
>  .../net/ethernet/pensando/ionic/ionic_lif.c   | 348 ++++++++++++++++++
>  .../net/ethernet/pensando/ionic/ionic_lif.h   |   5 +
>  5 files changed, 365 insertions(+)
> 
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic.h b/drivers/net/ethernet/pensando/ionic/ionic.h
> index 87ab13aee89e..d7eee79b2a10 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic.h
> +++ b/drivers/net/ethernet/pensando/ionic/ionic.h
> @@ -34,6 +34,7 @@ struct ionic {
>  	unsigned int num_bars;
>  	struct identity ident;
>  	struct list_head lifs;
> +	struct lif *master_lif;
>  	unsigned int nnqs_per_lif;
>  	unsigned int neqs_per_lif;
>  	unsigned int ntxqs_per_lif;
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_bus_pci.c b/drivers/net/ethernet/pensando/ionic/ionic_bus_pci.c
> index 59d1ae7ce532..98c12b770c7f 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_bus_pci.c
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_bus_pci.c
> @@ -206,8 +206,16 @@ static int ionic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  		goto err_out_free_lifs;
>  	}
>  
> +	err = ionic_lifs_register(ionic);
> +	if (err) {
> +		dev_err(dev, "Cannot register LIFs: %d, aborting\n", err);
> +		goto err_out_deinit_lifs;
> +	}
> +
>  	return 0;
>  
> +err_out_deinit_lifs:
> +	ionic_lifs_deinit(ionic);
>  err_out_free_lifs:
>  	ionic_lifs_free(ionic);
>  err_out_free_irqs:
> @@ -239,6 +247,7 @@ static void ionic_remove(struct pci_dev *pdev)
>  	struct ionic *ionic = pci_get_drvdata(pdev);
>  
>  	if (ionic) {
> +		ionic_lifs_unregister(ionic);
>  		ionic_lifs_deinit(ionic);
>  		ionic_lifs_free(ionic);
>  		ionic_bus_free_irq_vectors(ionic);
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_dev.h b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
> index 8bd1501dd639..523927566925 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_dev.h
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
> @@ -10,6 +10,8 @@
>  #include "ionic_if.h"
>  #include "ionic_regs.h"
>  
> +#define IONIC_MIN_MTU			ETH_MIN_MTU
> +#define IONIC_MAX_MTU			9194
>  #define IONIC_LIFS_MAX			1024
>  
>  struct ionic_dev_bar {
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> index 01f9665611d4..005b1d908fa1 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> @@ -12,8 +12,74 @@
>  #include "ionic_lif.h"
>  #include "ionic_debugfs.h"
>  
> +static int ionic_set_nic_features(struct lif *lif, netdev_features_t features);
>  static int ionic_notifyq_clean(struct lif *lif, int budget);
>  
> +int ionic_open(struct net_device *netdev)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +
> +	netif_carrier_off(netdev);
> +
> +	set_bit(LIF_UP, lif->state);
> +
> +	if (netif_carrier_ok(netdev))

Always false? You called netif_carrier_off() just above.

> +		netif_tx_wake_all_queues(netdev);
> +
> +	return 0;
> +}
> +
> +static int ionic_lif_stop(struct lif *lif)
> +{
> +	struct net_device *ndev = lif->netdev;
> +	int err = 0;
> +
> +	if (!test_bit(LIF_UP, lif->state)) {
> +		dev_dbg(lif->ionic->dev, "%s: %s state=DOWN\n",
> +			__func__, lif->name);
> +		return 0;
> +	}
> +	dev_dbg(lif->ionic->dev, "%s: %s state=UP\n", __func__, lif->name);
> +	clear_bit(LIF_UP, lif->state);
> +
> +	/* carrier off before disabling queues to avoid watchdog timeout */
> +	netif_carrier_off(ndev);
> +	netif_tx_stop_all_queues(ndev);
> +	netif_tx_disable(ndev);
> +	synchronize_rcu();

Why synchronize_rcu()?

> +
> +	return err;
> +}
> +
> +int ionic_stop(struct net_device *netdev)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +
> +	return ionic_lif_stop(lif);
> +}
> +
> +int ionic_reset_queues(struct lif *lif)
> +{
> +	bool running;
> +	int err = 0;
> +
> +	/* Put off the next watchdog timeout */
> +	netif_trans_update(lif->netdev);

This doesn't seem right to me. It also won't help you if the while loop
below takes too long, and netif_trans_update() is marked as being for
legacy drivers only.
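
One possible direction, sketched in userspace C11 and not something proposed
elsewhere in this thread: bound the wait on the reset bit so the caller can
bail out instead of spinning for an unknown time. struct model_lif,
model_reset_trylock() and model_reset_unlock() are hypothetical names; the
atomic_flag stands in for the LIF_QUEUE_RESET bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the LIF_QUEUE_RESET bit in lif->state. */
struct model_lif {
	atomic_flag queue_reset;
};

/*
 * Try to take the reset bit, giving up after max_tries attempts instead of
 * looping forever. Returns true when the caller now owns the bit.
 */
static inline bool model_reset_trylock(struct model_lif *lif,
				       unsigned int max_tries)
{
	assert(max_tries > 0);

	while (max_tries--) {
		if (!atomic_flag_test_and_set(&lif->queue_reset))
			return true;
		/* a driver would sleep here, e.g. usleep_range(100, 200) */
	}

	return false;
}

/* Release the reset bit so the next caller can take it. */
static inline void model_reset_unlock(struct model_lif *lif)
{
	atomic_flag_clear(&lif->queue_reset);
}
```

A caller that gets false back can return an error such as -EBUSY rather than
relying on the watchdog being pushed out far enough.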

> +
> +	while (test_and_set_bit(LIF_QUEUE_RESET, lif->state))
> +		usleep_range(100, 200);
> +
> +	running = netif_running(lif->netdev);
> +	if (running)
> +		err = ionic_stop(lif->netdev);
> +	if (!err && running)
> +		ionic_open(lif->netdev);
> +
> +	clear_bit(LIF_QUEUE_RESET, lif->state);
> +
> +	return err;
> +}
> +
>  static bool ionic_adminq_service(struct cq *cq, struct cq_info *cq_info)
>  {
>  	struct admin_comp *comp = cq_info->cq_desc;
> @@ -114,6 +180,81 @@ static int ionic_notifyq_clean(struct lif *lif, int budget)
>  	return work_done;
>  }
>  
> +static int ionic_set_features(struct net_device *netdev,
> +			      netdev_features_t features)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	int err;
> +
> +	netdev_dbg(netdev, "%s: lif->features=0x%08llx new_features=0x%08llx\n",
> +		   __func__, (u64)lif->netdev->features, (u64)features);
> +
> +	err = ionic_set_nic_features(lif, features);
> +
> +	return err;
> +}
> +
> +static int ionic_set_mac_address(struct net_device *netdev, void *sa)
> +{
> +	netdev_info(netdev, "%s: stubbed\n", __func__);
> +	return 0;
> +}
> +
> +static int ionic_change_mtu(struct net_device *netdev, int new_mtu)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic_admin_ctx ctx = {
> +		.work = COMPLETION_INITIALIZER_ONSTACK(ctx.work),
> +		.cmd.lif_setattr = {
> +			.opcode = CMD_OPCODE_LIF_SETATTR,
> +			.index = cpu_to_le16(lif->index),
> +			.attr = IONIC_LIF_ATTR_MTU,
> +			.mtu = cpu_to_le32(new_mtu),
> +		},
> +	};
> +	int err;
> +
> +	err = ionic_adminq_post_wait(lif, &ctx);
> +	if (err)
> +		return err;
> +
> +	netdev->mtu = new_mtu;
> +	err = ionic_reset_queues(lif);
> +
> +	return err;
> +}
> +
> +static void ionic_tx_timeout(struct net_device *netdev)
> +{
> +	netdev_info(netdev, "%s: stubbed\n", __func__);
> +}
> +
> +static int ionic_vlan_rx_add_vid(struct net_device *netdev, __be16 proto,
> +				 u16 vid)
> +{
> +	netdev_info(netdev, "%s: stubbed\n", __func__);
> +	return 0;
> +}
> +
> +static int ionic_vlan_rx_kill_vid(struct net_device *netdev, __be16 proto,
> +				  u16 vid)
> +{
> +	netdev_info(netdev, "%s: stubbed\n", __func__);
> +	return 0;
> +}
> +
> +static const struct net_device_ops ionic_netdev_ops = {
> +	.ndo_open               = ionic_open,
> +	.ndo_stop               = ionic_stop,
> +	.ndo_set_features	= ionic_set_features,
> +	.ndo_set_mac_address	= ionic_set_mac_address,
> +	.ndo_validate_addr	= eth_validate_addr,
> +	.ndo_tx_timeout         = ionic_tx_timeout,
> +	.ndo_change_mtu         = ionic_change_mtu,
> +	.ndo_vlan_rx_add_vid    = ionic_vlan_rx_add_vid,
> +	.ndo_vlan_rx_kill_vid   = ionic_vlan_rx_kill_vid,
> +};
> +
>  static irqreturn_t ionic_isr(int irq, void *data)
>  {
>  	struct napi_struct *napi = data;
> @@ -388,6 +529,12 @@ static struct lif *ionic_lif_alloc(struct ionic *ionic, unsigned int index)
>  
>  	lif = netdev_priv(netdev);
>  	lif->netdev = netdev;
> +	ionic->master_lif = lif;
> +	netdev->netdev_ops = &ionic_netdev_ops;
> +
> +	netdev->watchdog_timeo = 2 * HZ;
> +	netdev->min_mtu = IONIC_MIN_MTU;
> +	netdev->max_mtu = IONIC_MAX_MTU;
>  
>  	lif->neqs = ionic->neqs_per_lif;
>  	lif->nxqs = ionic->ntxqs_per_lif;
> @@ -655,6 +802,177 @@ static int ionic_lif_notifyq_init(struct lif *lif)
>  	return 0;
>  }
>  
> +static __le64 ionic_netdev_features_to_nic(netdev_features_t features)
> +{
> +	u64 wanted = 0;
> +
> +	if (features & NETIF_F_HW_VLAN_CTAG_TX)
> +		wanted |= ETH_HW_VLAN_TX_TAG;
> +	if (features & NETIF_F_HW_VLAN_CTAG_RX)
> +		wanted |= ETH_HW_VLAN_RX_STRIP;
> +	if (features & NETIF_F_HW_VLAN_CTAG_FILTER)
> +		wanted |= ETH_HW_VLAN_RX_FILTER;
> +	if (features & NETIF_F_RXHASH)
> +		wanted |= ETH_HW_RX_HASH;
> +	if (features & NETIF_F_RXCSUM)
> +		wanted |= ETH_HW_RX_CSUM;
> +	if (features & NETIF_F_SG)
> +		wanted |= ETH_HW_TX_SG;
> +	if (features & NETIF_F_HW_CSUM)
> +		wanted |= ETH_HW_TX_CSUM;
> +	if (features & NETIF_F_TSO)
> +		wanted |= ETH_HW_TSO;
> +	if (features & NETIF_F_TSO6)
> +		wanted |= ETH_HW_TSO_IPV6;
> +	if (features & NETIF_F_TSO_ECN)
> +		wanted |= ETH_HW_TSO_ECN;
> +	if (features & NETIF_F_GSO_GRE)
> +		wanted |= ETH_HW_TSO_GRE;
> +	if (features & NETIF_F_GSO_GRE_CSUM)
> +		wanted |= ETH_HW_TSO_GRE_CSUM;
> +	if (features & NETIF_F_GSO_IPXIP4)
> +		wanted |= ETH_HW_TSO_IPXIP4;
> +	if (features & NETIF_F_GSO_IPXIP6)
> +		wanted |= ETH_HW_TSO_IPXIP6;
> +	if (features & NETIF_F_GSO_UDP_TUNNEL)
> +		wanted |= ETH_HW_TSO_UDP;
> +	if (features & NETIF_F_GSO_UDP_TUNNEL_CSUM)
> +		wanted |= ETH_HW_TSO_UDP_CSUM;
> +
> +	return cpu_to_le64(wanted);
> +}
> +
> +static int ionic_set_nic_features(struct lif *lif, netdev_features_t features)
> +{
> +	struct device *dev = lif->ionic->dev;
> +	struct ionic_admin_ctx ctx = {
> +		.work = COMPLETION_INITIALIZER_ONSTACK(ctx.work),
> +		.cmd.lif_setattr = {
> +			.opcode = CMD_OPCODE_LIF_SETATTR,
> +			.index = cpu_to_le16(lif->index),
> +			.attr = IONIC_LIF_ATTR_FEATURES,
> +		},
> +	};
> +	u64 vlan_flags = ETH_HW_VLAN_TX_TAG |
> +			 ETH_HW_VLAN_RX_STRIP |
> +			 ETH_HW_VLAN_RX_FILTER;
> +	int err;
> +
> +	ctx.cmd.lif_setattr.features = ionic_netdev_features_to_nic(features);
> +	err = ionic_adminq_post_wait(lif, &ctx);
> +	if (err)
> +		return err;
> +
> +	lif->hw_features = le64_to_cpu(ctx.cmd.lif_setattr.features &
> +				       ctx.comp.lif_setattr.features);
> +
> +	if ((vlan_flags & features) &&
> +	    !(vlan_flags & le64_to_cpu(ctx.comp.lif_setattr.features)))
> +		dev_info_once(lif->ionic->dev, "NIC is not supporting vlan offload, likely in SmartNIC mode\n");
> +
> +	if (lif->hw_features & ETH_HW_VLAN_TX_TAG)
> +		dev_dbg(dev, "feature ETH_HW_VLAN_TX_TAG\n");
> +	if (lif->hw_features & ETH_HW_VLAN_RX_STRIP)
> +		dev_dbg(dev, "feature ETH_HW_VLAN_RX_STRIP\n");
> +	if (lif->hw_features & ETH_HW_VLAN_RX_FILTER)
> +		dev_dbg(dev, "feature ETH_HW_VLAN_RX_FILTER\n");
> +	if (lif->hw_features & ETH_HW_RX_HASH)
> +		dev_dbg(dev, "feature ETH_HW_RX_HASH\n");
> +	if (lif->hw_features & ETH_HW_TX_SG)
> +		dev_dbg(dev, "feature ETH_HW_TX_SG\n");
> +	if (lif->hw_features & ETH_HW_TX_CSUM)
> +		dev_dbg(dev, "feature ETH_HW_TX_CSUM\n");
> +	if (lif->hw_features & ETH_HW_RX_CSUM)
> +		dev_dbg(dev, "feature ETH_HW_RX_CSUM\n");
> +	if (lif->hw_features & ETH_HW_TSO)
> +		dev_dbg(dev, "feature ETH_HW_TSO\n");
> +	if (lif->hw_features & ETH_HW_TSO_IPV6)
> +		dev_dbg(dev, "feature ETH_HW_TSO_IPV6\n");
> +	if (lif->hw_features & ETH_HW_TSO_ECN)
> +		dev_dbg(dev, "feature ETH_HW_TSO_ECN\n");
> +	if (lif->hw_features & ETH_HW_TSO_GRE)
> +		dev_dbg(dev, "feature ETH_HW_TSO_GRE\n");
> +	if (lif->hw_features & ETH_HW_TSO_GRE_CSUM)
> +		dev_dbg(dev, "feature ETH_HW_TSO_GRE_CSUM\n");
> +	if (lif->hw_features & ETH_HW_TSO_IPXIP4)
> +		dev_dbg(dev, "feature ETH_HW_TSO_IPXIP4\n");
> +	if (lif->hw_features & ETH_HW_TSO_IPXIP6)
> +		dev_dbg(dev, "feature ETH_HW_TSO_IPXIP6\n");
> +	if (lif->hw_features & ETH_HW_TSO_UDP)
> +		dev_dbg(dev, "feature ETH_HW_TSO_UDP\n");
> +	if (lif->hw_features & ETH_HW_TSO_UDP_CSUM)
> +		dev_dbg(dev, "feature ETH_HW_TSO_UDP_CSUM\n");
> +
> +	return 0;
> +}
> +
> +static int ionic_init_nic_features(struct lif *lif)
> +{
> +	struct net_device *netdev = lif->netdev;
> +	netdev_features_t features;
> +	int err;
> +
> +	/* set up what we expect to support by default */
> +	features = NETIF_F_HW_VLAN_CTAG_TX |
> +		   NETIF_F_HW_VLAN_CTAG_RX |
> +		   NETIF_F_HW_VLAN_CTAG_FILTER |
> +		   NETIF_F_RXHASH |
> +		   NETIF_F_SG |
> +		   NETIF_F_HW_CSUM |
> +		   NETIF_F_RXCSUM |
> +		   NETIF_F_TSO |
> +		   NETIF_F_TSO6 |
> +		   NETIF_F_TSO_ECN;
> +
> +	err = ionic_set_nic_features(lif, features);
> +	if (err)
> +		return err;
> +
> +	/* tell the netdev what we actually can support */
> +	netdev->features |= NETIF_F_HIGHDMA;
> +
> +	if (lif->hw_features & ETH_HW_VLAN_TX_TAG)
> +		netdev->hw_features |= NETIF_F_HW_VLAN_CTAG_TX;
> +	if (lif->hw_features & ETH_HW_VLAN_RX_STRIP)
> +		netdev->hw_features |= NETIF_F_HW_VLAN_CTAG_RX;
> +	if (lif->hw_features & ETH_HW_VLAN_RX_FILTER)
> +		netdev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER;
> +	if (lif->hw_features & ETH_HW_RX_HASH)
> +		netdev->hw_features |= NETIF_F_RXHASH;
> +	if (lif->hw_features & ETH_HW_TX_SG)
> +		netdev->hw_features |= NETIF_F_SG;
> +
> +	if (lif->hw_features & ETH_HW_TX_CSUM)
> +		netdev->hw_enc_features |= NETIF_F_HW_CSUM;
> +	if (lif->hw_features & ETH_HW_RX_CSUM)
> +		netdev->hw_enc_features |= NETIF_F_RXCSUM;
> +	if (lif->hw_features & ETH_HW_TSO)
> +		netdev->hw_enc_features |= NETIF_F_TSO;
> +	if (lif->hw_features & ETH_HW_TSO_IPV6)
> +		netdev->hw_enc_features |= NETIF_F_TSO6;
> +	if (lif->hw_features & ETH_HW_TSO_ECN)
> +		netdev->hw_enc_features |= NETIF_F_TSO_ECN;
> +	if (lif->hw_features & ETH_HW_TSO_GRE)
> +		netdev->hw_enc_features |= NETIF_F_GSO_GRE;
> +	if (lif->hw_features & ETH_HW_TSO_GRE_CSUM)
> +		netdev->hw_enc_features |= NETIF_F_GSO_GRE_CSUM;
> +	if (lif->hw_features & ETH_HW_TSO_IPXIP4)
> +		netdev->hw_enc_features |= NETIF_F_GSO_IPXIP4;
> +	if (lif->hw_features & ETH_HW_TSO_IPXIP6)
> +		netdev->hw_enc_features |= NETIF_F_GSO_IPXIP6;
> +	if (lif->hw_features & ETH_HW_TSO_UDP)
> +		netdev->hw_enc_features |= NETIF_F_GSO_UDP_TUNNEL;
> +	if (lif->hw_features & ETH_HW_TSO_UDP_CSUM)
> +		netdev->hw_enc_features |= NETIF_F_GSO_UDP_TUNNEL_CSUM;
> +
> +	netdev->hw_features |= netdev->hw_enc_features;
> +	netdev->features |= netdev->hw_features;
> +
> +	netdev->priv_flags |= IFF_UNICAST_FLT;
> +
> +	return 0;
> +}
> +
>  static int ionic_lif_init(struct lif *lif)
>  {
>  	struct ionic_dev *idev = &lif->ionic->idev;
> @@ -711,6 +1029,10 @@ static int ionic_lif_init(struct lif *lif)
>  			goto err_out_notifyq_deinit;
>  	}
>  
> +	err = ionic_init_nic_features(lif);
> +	if (err)
> +		goto err_out_notifyq_deinit;
> +
>  	set_bit(LIF_INITED, lif->state);
>  
>  	return 0;
> @@ -745,6 +1067,32 @@ int ionic_lifs_init(struct ionic *ionic)
>  	return 0;
>  }
>  
> +int ionic_lifs_register(struct ionic *ionic)
> +{
> +	int err;
> +
> +	/* only register LIF0 for now */
> +	err = register_netdev(ionic->master_lif->netdev);
> +	if (err) {
> +		dev_err(ionic->dev, "Cannot register net device, aborting\n");
> +		return err;
> +	}
> +
> +	ionic->master_lif->registered = true;
> +
> +	return 0;
> +}
> +
> +void ionic_lifs_unregister(struct ionic *ionic)
> +{
> +	/* There is only one lif ever registered in the
> +	 * current model, so don't bother searching the
> +	 * ionic->lif for candidates to unregister
> +	 */
> +	if (ionic->master_lif->netdev->reg_state == NETREG_REGISTERED)
> +		unregister_netdev(ionic->master_lif->netdev);
> +}
> +
>  int ionic_lif_identify(struct ionic *ionic, u8 lif_type,
>  		       union lif_identity *lid)
>  {
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.h b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> index 80eec0778f40..ef3f7340a277 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> @@ -61,6 +61,8 @@ struct qcq {
>  
>  enum lif_state_flags {
>  	LIF_INITED,
> +	LIF_UP,
> +	LIF_QUEUE_RESET,
>  
>  	/* leave this as last */
>  	LIF_STATE_SIZE
> @@ -84,6 +86,7 @@ struct lif {
>  	u64 last_eid;
>  	unsigned int neqs;
>  	unsigned int nxqs;
> +	u64 hw_features;
>  
>  	struct lif_info *info;
>  	dma_addr_t info_pa;
> @@ -124,6 +127,8 @@ int ionic_lifs_alloc(struct ionic *ionic);
>  void ionic_lifs_free(struct ionic *ionic);
>  void ionic_lifs_deinit(struct ionic *ionic);
>  int ionic_lifs_init(struct ionic *ionic);
> +int ionic_lifs_register(struct ionic *ionic);
> +void ionic_lifs_unregister(struct ionic *ionic);
>  int ionic_lif_identify(struct ionic *ionic, u8 lif_type,
>  		       union lif_identity *lif_ident);
>  int ionic_lifs_size(struct ionic *ionic);

* Re: [PATCH bpf-next 7/7] selftests/bpf: support FLOW_DISSECTOR_F_STOP_AT_ENCAP
From: Song Liu @ 2019-07-24 23:29 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S . Miller, Alexei Starovoitov,
	Daniel Borkmann, Willem de Bruijn, Petar Penkov
In-Reply-To: <20190724170018.96659-8-sdf@google.com>

On Wed, Jul 24, 2019 at 10:11 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> Exit as soon as we found that packet is encapped when
> FLOW_DISSECTOR_F_STOP_AT_ENCAP is passed.
> Add appropriate selftest cases.
>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Petar Penkov <ppenkov@google.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  .../selftests/bpf/prog_tests/flow_dissector.c | 60 +++++++++++++++++++
>  tools/testing/selftests/bpf/progs/bpf_flow.c  |  8 +++
>  2 files changed, 68 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> index 1ea921c4cdc0..e382264fbc40 100644
> --- a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> +++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> @@ -41,6 +41,13 @@ struct ipv4_pkt {
>         struct tcphdr tcp;
>  } __packed;
>
> +struct ipip_pkt {
> +       struct ethhdr eth;
> +       struct iphdr iph;
> +       struct iphdr iph_inner;
> +       struct tcphdr tcp;
> +} __packed;
> +
>  struct svlan_ipv4_pkt {
>         struct ethhdr eth;
>         __u16 vlan_tci;
> @@ -82,6 +89,7 @@ struct test {
>         union {
>                 struct ipv4_pkt ipv4;
>                 struct svlan_ipv4_pkt svlan_ipv4;
> +               struct ipip_pkt ipip;
>                 struct ipv6_pkt ipv6;
>                 struct ipv6_frag_pkt ipv6_frag;
>                 struct dvlan_ipv6_pkt dvlan_ipv6;
> @@ -303,6 +311,58 @@ struct test tests[] = {
>                 },
>                 .flags = FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL,
>         },
> +       {
> +               .name = "ipip-encap",
> +               .pkt.ipip = {
> +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IP),
> +                       .iph.ihl = 5,
> +                       .iph.protocol = IPPROTO_IPIP,
> +                       .iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> +                       .iph_inner.ihl = 5,
> +                       .iph_inner.protocol = IPPROTO_TCP,
> +                       .iph_inner.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> +                       .tcp.doff = 5,
> +                       .tcp.source = 80,
> +                       .tcp.dest = 8080,
> +               },
> +               .keys = {
> +                       .nhoff = 0,
> +                       .nhoff = ETH_HLEN,
> +                       .thoff = ETH_HLEN + sizeof(struct iphdr) +
> +                               sizeof(struct iphdr),
> +                       .addr_proto = ETH_P_IP,
> +                       .ip_proto = IPPROTO_TCP,
> +                       .n_proto = __bpf_constant_htons(ETH_P_IP),
> +                       .is_encap = true,
> +                       .sport = 80,
> +                       .dport = 8080,
> +               },
> +       },
> +       {
> +               .name = "ipip-no-encap",
> +               .pkt.ipip = {
> +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IP),
> +                       .iph.ihl = 5,
> +                       .iph.protocol = IPPROTO_IPIP,
> +                       .iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> +                       .iph_inner.ihl = 5,
> +                       .iph_inner.protocol = IPPROTO_TCP,
> +                       .iph_inner.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> +                       .tcp.doff = 5,
> +                       .tcp.source = 80,
> +                       .tcp.dest = 8080,
> +               },
> +               .keys = {
> +                       .flags = FLOW_DISSECTOR_F_STOP_AT_ENCAP,
> +                       .nhoff = ETH_HLEN,
> +                       .thoff = ETH_HLEN + sizeof(struct iphdr),
> +                       .addr_proto = ETH_P_IP,
> +                       .ip_proto = IPPROTO_IPIP,
> +                       .n_proto = __bpf_constant_htons(ETH_P_IP),
> +                       .is_encap = true,
> +               },
> +               .flags = FLOW_DISSECTOR_F_STOP_AT_ENCAP,
> +       },
>  };
>
>  static int create_tap(const char *ifname)
> diff --git a/tools/testing/selftests/bpf/progs/bpf_flow.c b/tools/testing/selftests/bpf/progs/bpf_flow.c
> index 7d73b7bfe609..b6236cdf8564 100644
> --- a/tools/testing/selftests/bpf/progs/bpf_flow.c
> +++ b/tools/testing/selftests/bpf/progs/bpf_flow.c
> @@ -167,9 +167,15 @@ static __always_inline int parse_ip_proto(struct __sk_buff *skb, __u8 proto)
>                 return export_flow_keys(keys, BPF_OK);
>         case IPPROTO_IPIP:
>                 keys->is_encap = true;
> +               if (keys->flags & FLOW_DISSECTOR_F_STOP_AT_ENCAP)
> +                       return export_flow_keys(keys, BPF_OK);
> +
>                 return parse_eth_proto(skb, bpf_htons(ETH_P_IP));
>         case IPPROTO_IPV6:
>                 keys->is_encap = true;
> +               if (keys->flags & FLOW_DISSECTOR_F_STOP_AT_ENCAP)
> +                       return export_flow_keys(keys, BPF_OK);
> +
>                 return parse_eth_proto(skb, bpf_htons(ETH_P_IPV6));
>         case IPPROTO_GRE:
>                 gre = bpf_flow_dissect_get_header(skb, sizeof(*gre), &_gre);
> @@ -189,6 +195,8 @@ static __always_inline int parse_ip_proto(struct __sk_buff *skb, __u8 proto)
>                         keys->thoff += 4; /* Step over sequence number */
>
>                 keys->is_encap = true;
> +               if (keys->flags & FLOW_DISSECTOR_F_STOP_AT_ENCAP)
> +                       return export_flow_keys(keys, BPF_OK);
>
>                 if (gre->proto == bpf_htons(ETH_P_TEB)) {
>                         eth = bpf_flow_dissect_get_header(skb, sizeof(*eth),
> --
> 2.22.0.657.g960e92d24f-goog
>

* Re: [PATCH bpf-next 6/7] bpf/flow_dissector: support ipv6 flow_label and FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL
From: Song Liu @ 2019-07-24 23:28 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S . Miller, Alexei Starovoitov,
	Daniel Borkmann, Willem de Bruijn, Petar Penkov
In-Reply-To: <20190724170018.96659-7-sdf@google.com>

On Wed, Jul 24, 2019 at 10:11 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> Add support for exporting ipv6 flow label via bpf_flow_keys.
> Export flow label from bpf_flow.c and also return early when
> FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL is passed.
>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Petar Penkov <ppenkov@google.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>

Acked-by: Song Liu <songliubraving@fb.com>

* Re: [PATCH 00/12] block/bio, fs: convert put_page() to put_user_page*()
From: John Hubbard @ 2019-07-24 23:23 UTC (permalink / raw)
  To: Christoph Hellwig, john.hubbard
  Cc: Andrew Morton, Alexander Viro, Anna Schumaker, David S . Miller,
	Dominique Martinet, Eric Van Hensbergen, Jason Gunthorpe,
	Jason Wang, Jens Axboe, Latchesar Ionkov, Michael S . Tsirkin,
	Miklos Szeredi, Trond Myklebust, Christoph Hellwig,
	Matthew Wilcox, linux-mm, LKML, ceph-devel, kvm, linux-block,
	linux-cifs, linux-fsdevel, linux-nfs, linux-rdma, netdev,
	samba-technical, v9fs-developer, virtualization
In-Reply-To: <20190724061750.GA19397@infradead.org>

On 7/23/19 11:17 PM, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 09:25:06PM -0700, john.hubbard@gmail.com wrote:
>> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
>>   Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
>>   it is time to release the pages. That allows choosing between put_page()
>>   and put_user_page*().
>>
>> * Pass in one more piece of information to bio_release_pages: a "from_gup"
>>   parameter. Similar use as above.
>>
>> * Change the block layer, and several file systems, to use
>>   put_user_page*().
> 
> I think we can do this in a simple and better way.  We have 5 ITER_*
> types.  Of those ITER_DISCARD as the name suggests never uses pages, so
> we can skip handling it.  ITER_PIPE is rejected in the direct I/O path,
> which leaves us with three.
> 
> Out of those ITER_BVEC needs a user page reference, so we want to call

               ^ ITER_IOVEC, I hope. Otherwise I'm hopelessly lost. :)

> put_user_page* on it.  ITER_BVEC always already has a page reference,
> which means in the block direct I/O path we already don't take
> a page reference.  We should extend that handling to all other calls
> of iov_iter_get_pages / iov_iter_get_pages_alloc.  I think we should
> just reject ITER_KVEC for direct I/O as well as we have no users and
> it is rather pointless.  Alternatively if we see a use for it the
> callers should always have a live page reference anyway (or might
> be on kmalloc memory), so we really should not take a reference either.
> 
> In other words:  the only time we should ever have to put a page in
> this patch is when they are user pages.  We'll need to clean up
> various bits of code for that, but that can be done gradually before
> even getting to the actual put_user_pages conversion.
> 

Sounds great. I'm part way into it and it doesn't look too bad. The main
question is where to scatter various checks and assertions, to keep
the kvecs out of direct I/O. Or at least keep the gups away from
direct I/O.
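
To check my own understanding of the proposal, here it is as a small userspace
model. Everything named model_* or MODEL_* is hypothetical; the mapping is my
reading of the description above (with ITER_IOVEC as the user-page case), not
existing kernel code.

```c
#include <assert.h>

/* Hypothetical enum mirroring the five ITER_* types named above. */
enum model_iter_type {
	MODEL_ITER_IOVEC,
	MODEL_ITER_KVEC,
	MODEL_ITER_BVEC,
	MODEL_ITER_PIPE,
	MODEL_ITER_DISCARD,
};

enum model_release_action {
	MODEL_PUT_USER_PAGE,	/* pages came from gup: put_user_page*() */
	MODEL_NO_PUT,		/* caller already holds the reference */
	MODEL_NO_PAGES,		/* iterator never carries pages */
	MODEL_REJECT,		/* not accepted on the direct I/O path */
};

/* Decide how direct I/O should treat pages for a given iterator type. */
static inline enum model_release_action
model_classify_iter(enum model_iter_type type)
{
	switch (type) {
	case MODEL_ITER_IOVEC:
		return MODEL_PUT_USER_PAGE;
	case MODEL_ITER_BVEC:
		return MODEL_NO_PUT;
	case MODEL_ITER_DISCARD:
		return MODEL_NO_PAGES;
	case MODEL_ITER_PIPE:	/* already rejected today */
	case MODEL_ITER_KVEC:	/* would be rejected under the proposal */
		return MODEL_REJECT;
	}

	assert(0 && "unknown iterator type");
	return MODEL_REJECT;
}
```

If that table is right, the only release call that has to change is the
ITER_IOVEC one.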


thanks,
-- 
John Hubbard
NVIDIA

* Re: [PATCH v4 net-next 08/19] ionic: Add notifyq support
From: Saeed Mahameed @ 2019-07-24 23:21 UTC (permalink / raw)
  To: snelson@pensando.io, netdev@vger.kernel.org, davem@davemloft.net
In-Reply-To: <20190722214023.9513-9-snelson@pensando.io>

On Mon, 2019-07-22 at 14:40 -0700, Shannon Nelson wrote:
> The AdminQ is fine for sending messages and requests to the NIC,
> but we also need to have events published from the NIC to the
> driver.  The NotifyQ handles this for us, using the same interrupt
> as AdminQ.
> 
> Signed-off-by: Shannon Nelson <snelson@pensando.io>
> ---
>  .../ethernet/pensando/ionic/ionic_debugfs.c   |  16 ++
>  .../net/ethernet/pensando/ionic/ionic_lif.c   | 181 +++++++++++++++++-
>  .../net/ethernet/pensando/ionic/ionic_lif.h   |   4 +
>  3 files changed, 200 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_debugfs.c b/drivers/net/ethernet/pensando/ionic/ionic_debugfs.c
> index 9af15c69b2a6..1d05b23de303 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_debugfs.c
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_debugfs.c
> @@ -126,6 +126,7 @@ int ionic_debugfs_add_qcq(struct lif *lif, struct qcq *qcq)
>  	struct debugfs_blob_wrapper *desc_blob;
>  	struct device *dev = lif->ionic->dev;
>  	struct intr *intr = &qcq->intr;
> +	struct dentry *stats_dentry;
>  	struct queue *q = &qcq->q;
>  	struct cq *cq = &qcq->cq;
>  
> @@ -219,6 +220,21 @@ int ionic_debugfs_add_qcq(struct lif *lif, struct qcq *qcq)
>  					intr_ctrl_regset);
>  	}
>  
> +	if (qcq->flags & QCQ_F_NOTIFYQ) {
> +		stats_dentry = debugfs_create_dir("notifyblock", qcq_dentry);
> +		if (IS_ERR_OR_NULL(stats_dentry))
> +			return PTR_ERR(stats_dentry);
> +
> +		debugfs_create_u64("eid", 0400, stats_dentry,
> +				   (u64 *)&lif->info->status.eid);
> +		debugfs_create_u16("link_status", 0400, stats_dentry,
> +				   (u16 *)&lif->info->status.link_status);
> +		debugfs_create_u32("link_speed", 0400, stats_dentry,
> +				   (u32 *)&lif->info->status.link_speed);
> +		debugfs_create_u16("link_down_count", 0400, stats_dentry,
> +				   (u16 *)&lif->info->status.link_down_count);
> +	}
> +

You never write to these lif->info->status.xyz fields, and link state
and speed are (or should be) available in "ethtool <ifname>", so this
looks redundant to me. You can also use ethtool -S to report the
link-down count.

>  	return 0;
>  }
>  
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> index 19c046502a26..01f9665611d4 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> @@ -12,6 +12,8 @@
>  #include "ionic_lif.h"
>  #include "ionic_debugfs.h"
>  
> +static int ionic_notifyq_clean(struct lif *lif, int budget);
> +
>  static bool ionic_adminq_service(struct cq *cq, struct cq_info
> *cq_info)
>  {
>  	struct admin_comp *comp = cq_info->cq_desc;
> @@ -26,7 +28,90 @@ static bool ionic_adminq_service(struct cq *cq,
> struct cq_info *cq_info)
>  
>  static int ionic_adminq_napi(struct napi_struct *napi, int budget)
>  {
> -	return ionic_napi(napi, budget, ionic_adminq_service, NULL,
> NULL);
> +	struct lif *lif = napi_to_cq(napi)->lif;
> +	int n_work = 0;
> +	int a_work = 0;
> +
> +	if (likely(lif->notifyqcq && lif->notifyqcq->flags &
> QCQ_F_INITED))
> +		n_work = ionic_notifyq_clean(lif, budget);
> +	a_work = ionic_napi(napi, budget, ionic_adminq_service, NULL,
> NULL);
> +
> +	return max(n_work, a_work);
> +}
> +
> +static bool ionic_notifyq_service(struct cq *cq, struct cq_info
> *cq_info)
> +{
> +	union notifyq_comp *comp = cq_info->cq_desc;
> +	struct net_device *netdev;
> +	struct queue *q;
> +	struct lif *lif;
> +	u64 eid;
> +
> +	q = cq->bound_q;
> +	lif = q->info[0].cb_arg;
> +	netdev = lif->netdev;
> +	eid = le64_to_cpu(comp->event.eid);
> +
> +	/* Have we run out of new completions to process? */
> +	if (eid <= lif->last_eid)
> +		return false;
> +
> +	lif->last_eid = eid;
> +
> +	dev_dbg(lif->ionic->dev, "notifyq event:\n");
> +	dynamic_hex_dump("event ", DUMP_PREFIX_OFFSET, 16, 1,
> +			 comp, sizeof(*comp), true);
> +
> +	switch (le16_to_cpu(comp->event.ecode)) {
> +	case EVENT_OPCODE_LINK_CHANGE:
> +		netdev_info(netdev, "Notifyq EVENT_OPCODE_LINK_CHANGE
> eid=%lld\n",
> +			    eid);
> +		netdev_info(netdev,
> +			    "  link_status=%d link_speed=%d\n",
> +			    le16_to_cpu(comp->link_change.link_status),
> +			    le32_to_cpu(comp->link_change.link_speed));
> +		break;
> +	case EVENT_OPCODE_RESET:
> +		netdev_info(netdev, "Notifyq EVENT_OPCODE_RESET
> eid=%lld\n",
> +			    eid);
> +		netdev_info(netdev, "  reset_code=%d state=%d\n",
> +			    comp->reset.reset_code,
> +			    comp->reset.state);
> +		break;
> +	case EVENT_OPCODE_LOG:
> +		netdev_info(netdev, "Notifyq EVENT_OPCODE_LOG
> eid=%lld\n", eid);
> +		print_hex_dump(KERN_INFO, "notifyq ",
> DUMP_PREFIX_OFFSET, 16, 1,
> +			       comp->log.data, sizeof(comp->log.data),
> true);

So your device can dump a log buffer into the kernel log.
I am not sure how acceptable this is; maybe the trace buffer is more
appropriate for this.

> +		break;
> +	default:
> +		netdev_warn(netdev, "Notifyq unknown event ecode=%d
> eid=%lld\n",
> +			    comp->event.ecode, eid);
> +		break;
> +	}
> +
> +	return true;
> +}
> +
> +static int ionic_notifyq_clean(struct lif *lif, int budget)
> +{
> +	struct ionic_dev *idev = &lif->ionic->idev;
> +	struct cq *cq = &lif->notifyqcq->cq;
> +	u32 work_done;
> +
> +	work_done = ionic_cq_service(cq, budget, ionic_notifyq_service,
> +				     NULL, NULL);
> +	if (work_done)
> +		ionic_intr_credits(idev->intr_ctrl, cq->bound_intr-
> >index,
> +				   work_done,
> IONIC_INTR_CRED_RESET_COALESCE);
> +
> +	/* If we ran out of budget, there are more events
> +	 * to process and napi will reschedule us soon
> +	 */
> +	if (work_done == budget)
> +		goto return_to_napi;
> +
> +return_to_napi:
> +	return work_done;
>  }
>  
>  static irqreturn_t ionic_isr(int irq, void *data)
> @@ -62,6 +147,17 @@ static void ionic_intr_free(struct lif *lif, int
> index)
>  		clear_bit(index, lif->ionic->intrs);
>  }
>  
> +static void ionic_link_qcq_interrupts(struct qcq *src_qcq, struct
> qcq *n_qcq)
> +{
> +	if (WARN_ON(n_qcq->flags & QCQ_F_INTR)) {
> +		ionic_intr_free(n_qcq->cq.lif, n_qcq->intr.index);
> +		n_qcq->flags &= ~QCQ_F_INTR;
> +	}
> +
> +	n_qcq->intr.vector = src_qcq->intr.vector;
> +	n_qcq->intr.index = src_qcq->intr.index;
> +}
> +
>  static int ionic_qcq_alloc(struct lif *lif, unsigned int type,
>  			   unsigned int index,
>  			   const char *name, unsigned int flags,
> @@ -236,11 +332,36 @@ static int ionic_qcqs_alloc(struct lif *lif)
>  	if (err)
>  		return err;
>  
> +	if (lif->ionic->nnqs_per_lif) {
> +		flags = QCQ_F_NOTIFYQ;
> +		err = ionic_qcq_alloc(lif, IONIC_QTYPE_NOTIFYQ, 0,
> "notifyq",
> +				      flags, IONIC_NOTIFYQ_LENGTH,
> +				      sizeof(struct notifyq_cmd),
> +				      sizeof(union notifyq_comp),
> +				      0, lif->kern_pid, &lif-
> >notifyqcq);
> +		if (err)
> +			goto err_out_free_adminqcq;
> +
> +		/* Let the notifyq ride on the adminq interrupt */
> +		ionic_link_qcq_interrupts(lif->adminqcq, lif-
> >notifyqcq);
> +	}
> +
>  	return 0;
> +
> +err_out_free_adminqcq:
> +	ionic_qcq_free(lif, lif->adminqcq);
> +	lif->adminqcq = NULL;
> +
> +	return err;
>  }
>  
>  static void ionic_qcqs_free(struct lif *lif)
>  {
> +	if (lif->notifyqcq) {
> +		ionic_qcq_free(lif, lif->notifyqcq);
> +		lif->notifyqcq = NULL;
> +	}
> +
>  	if (lif->adminqcq) {
>  		ionic_qcq_free(lif, lif->adminqcq);
>  		lif->adminqcq = NULL;
> @@ -400,6 +521,7 @@ static void ionic_lif_deinit(struct lif *lif)
>  	clear_bit(LIF_INITED, lif->state);
>  
>  	napi_disable(&lif->adminqcq->napi);
> +	ionic_lif_qcq_deinit(lif, lif->notifyqcq);
>  	ionic_lif_qcq_deinit(lif, lif->adminqcq);
>  
>  	ionic_lif_reset(lif);
> @@ -484,6 +606,55 @@ static int ionic_lif_adminq_init(struct lif
> *lif)
>  	return 0;
>  }
>  
> +static int ionic_lif_notifyq_init(struct lif *lif)
> +{
> +	struct device *dev = lif->ionic->dev;
> +	struct qcq *qcq = lif->notifyqcq;
> +	struct queue *q = &qcq->q;
> +	int err;
> +
> +	struct ionic_admin_ctx ctx = {
> +		.work = COMPLETION_INITIALIZER_ONSTACK(ctx.work),
> +		.cmd.q_init = {
> +			.opcode = CMD_OPCODE_Q_INIT,
> +			.lif_index = cpu_to_le16(lif->index),
> +			.type = q->type,
> +			.index = cpu_to_le32(q->index),
> +			.flags = cpu_to_le16(IONIC_QINIT_F_IRQ |
> +					     IONIC_QINIT_F_ENA),
> +			.intr_index = cpu_to_le16(lif->adminqcq-
> >intr.index),
> +			.pid = cpu_to_le16(q->pid),
> +			.ring_size = ilog2(q->num_descs),
> +			.ring_base = cpu_to_le64(q->base_pa),
> +		}
> +	};
> +
> +	dev_dbg(dev, "notifyq_init.pid %d\n", ctx.cmd.q_init.pid);
> +	dev_dbg(dev, "notifyq_init.index %d\n", ctx.cmd.q_init.index);
> +	dev_dbg(dev, "notifyq_init.ring_base 0x%llx\n",
> ctx.cmd.q_init.ring_base);
> +	dev_dbg(dev, "notifyq_init.ring_size %d\n",
> ctx.cmd.q_init.ring_size);
> +
> +	err = ionic_adminq_post_wait(lif, &ctx);
> +	if (err)
> +		return err;
> +
> +	q->hw_type = ctx.comp.q_init.hw_type;
> +	q->hw_index = le32_to_cpu(ctx.comp.q_init.hw_index);
> +	q->dbval = IONIC_DBELL_QID(q->hw_index);
> +
> +	dev_dbg(dev, "notifyq->hw_type %d\n", q->hw_type);
> +	dev_dbg(dev, "notifyq->hw_index %d\n", q->hw_index);
> +
> +	/* preset the callback info */
> +	q->info[0].cb_arg = lif;
> +
> +	qcq->flags |= QCQ_F_INITED;
> +
> +	ionic_debugfs_add_qcq(lif, qcq);
> +
> +	return 0;
> +}
> +
>  static int ionic_lif_init(struct lif *lif)
>  {
>  	struct ionic_dev *idev = &lif->ionic->idev;
> @@ -534,10 +705,18 @@ static int ionic_lif_init(struct lif *lif)
>  	if (err)
>  		goto err_out_adminq_deinit;
>  
> +	if (lif->ionic->nnqs_per_lif) {
> +		err = ionic_lif_notifyq_init(lif);
> +		if (err)
> +			goto err_out_notifyq_deinit;
> +	}
> +
>  	set_bit(LIF_INITED, lif->state);
>  
>  	return 0;
>  
> +err_out_notifyq_deinit:
> +	ionic_lif_qcq_deinit(lif, lif->notifyqcq);
>  err_out_adminq_deinit:
>  	ionic_lif_qcq_deinit(lif, lif->adminqcq);
>  	ionic_lif_reset(lif);
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> index 28ab92b43a64..80eec0778f40 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> @@ -7,6 +7,7 @@
>  #include <linux/pci.h>
>  
>  #define IONIC_ADMINQ_LENGTH	16	/* must be a power of two */
> +#define IONIC_NOTIFYQ_LENGTH	64	/* must be a power of two */
>  
>  #define GET_NAPI_CNTR_IDX(work_done)	(work_done)
>  #define MAX_NUM_NAPI_CNTR	(NAPI_POLL_WEIGHT + 1)
> @@ -26,6 +27,7 @@ struct rx_stats {
>  #define QCQ_F_INITED		BIT(0)
>  #define QCQ_F_SG		BIT(1)
>  #define QCQ_F_INTR		BIT(2)
> +#define QCQ_F_NOTIFYQ		BIT(5)
>  
>  struct napi_stats {
>  	u64 poll_count;
> @@ -78,6 +80,8 @@ struct lif {
>  	u64 __iomem *kern_dbpage;
>  	spinlock_t adminq_lock;		/* lock for AdminQ operations
> */
>  	struct qcq *adminqcq;
> +	struct qcq *notifyqcq;
> +	u64 last_eid;
>  	unsigned int neqs;
>  	unsigned int nxqs;
>  


* Re: [PATCH bpf-next 5/7] sefltests/bpf: support FLOW_DISSECTOR_F_PARSE_1ST_FRAG
From: Song Liu @ 2019-07-24 23:21 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S . Miller, Alexei Starovoitov,
	Daniel Borkmann, Willem de Bruijn, Petar Penkov
In-Reply-To: <20190724170018.96659-6-sdf@google.com>

On Wed, Jul 24, 2019 at 10:11 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> bpf_flow.c: exit early unless FLOW_DISSECTOR_F_PARSE_1ST_FRAG is passed
> in flags. Also, set ip_proto earlier, this makes sure we have correct
> value with fragmented packets.
>
> Add selftest cases to test ipv4/ipv6 fragments and skip eth_get_headlen
> tests that don't have FLOW_DISSECTOR_F_PARSE_1ST_FRAG flag.
>
> eth_get_headlen calls flow dissector with
> FLOW_DISSECTOR_F_PARSE_1ST_FRAG flag so we can't run tests that
> have different set of input flags against it.
>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Petar Penkov <ppenkov@google.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  .../selftests/bpf/prog_tests/flow_dissector.c | 129 ++++++++++++++++++
>  tools/testing/selftests/bpf/progs/bpf_flow.c  |  28 +++-
>  2 files changed, 151 insertions(+), 6 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> index c938283ac232..966cb3b06870 100644
> --- a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> +++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> @@ -5,6 +5,10 @@
>  #include <linux/if_tun.h>
>  #include <sys/uio.h>
>
> +#ifndef IP_MF
> +#define IP_MF 0x2000
> +#endif
> +
>  #define CHECK_FLOW_KEYS(desc, got, expected)                           \
>         CHECK_ATTR(memcmp(&got, &expected, sizeof(got)) != 0,           \
>               desc,                                                     \
> @@ -49,6 +53,18 @@ struct ipv6_pkt {
>         struct tcphdr tcp;
>  } __packed;
>
> +struct ipv6_frag_pkt {
> +       struct ethhdr eth;
> +       struct ipv6hdr iph;
> +       struct frag_hdr {
> +               __u8 nexthdr;
> +               __u8 reserved;
> +               __be16 frag_off;
> +               __be32 identification;
> +       } ipf;
> +       struct tcphdr tcp;
> +} __packed;
> +
>  struct dvlan_ipv6_pkt {
>         struct ethhdr eth;
>         __u16 vlan_tci;
> @@ -65,9 +81,11 @@ struct test {
>                 struct ipv4_pkt ipv4;
>                 struct svlan_ipv4_pkt svlan_ipv4;
>                 struct ipv6_pkt ipv6;
> +               struct ipv6_frag_pkt ipv6_frag;
>                 struct dvlan_ipv6_pkt dvlan_ipv6;
>         } pkt;
>         struct bpf_flow_keys keys;
> +       __u32 flags;
>  };
>
>  #define VLAN_HLEN      4
> @@ -143,6 +161,102 @@ struct test tests[] = {
>                         .n_proto = __bpf_constant_htons(ETH_P_IPV6),
>                 },
>         },
> +       {
> +               .name = "ipv4-frag",
> +               .pkt.ipv4 = {
> +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IP),
> +                       .iph.ihl = 5,
> +                       .iph.protocol = IPPROTO_TCP,
> +                       .iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> +                       .iph.frag_off = __bpf_constant_htons(IP_MF),
> +                       .tcp.doff = 5,
> +                       .tcp.source = 80,
> +                       .tcp.dest = 8080,
> +               },
> +               .keys = {
> +                       .flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> +                       .nhoff = ETH_HLEN,
> +                       .thoff = ETH_HLEN + sizeof(struct iphdr),
> +                       .addr_proto = ETH_P_IP,
> +                       .ip_proto = IPPROTO_TCP,
> +                       .n_proto = __bpf_constant_htons(ETH_P_IP),
> +                       .is_frag = true,
> +                       .is_first_frag = true,
> +                       .sport = 80,
> +                       .dport = 8080,
> +               },
> +               .flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> +       },
> +       {
> +               .name = "ipv4-no-frag",
> +               .pkt.ipv4 = {
> +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IP),
> +                       .iph.ihl = 5,
> +                       .iph.protocol = IPPROTO_TCP,
> +                       .iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> +                       .iph.frag_off = __bpf_constant_htons(IP_MF),
> +                       .tcp.doff = 5,
> +                       .tcp.source = 80,
> +                       .tcp.dest = 8080,
> +               },
> +               .keys = {
> +                       .nhoff = ETH_HLEN,
> +                       .thoff = ETH_HLEN + sizeof(struct iphdr),
> +                       .addr_proto = ETH_P_IP,
> +                       .ip_proto = IPPROTO_TCP,
> +                       .n_proto = __bpf_constant_htons(ETH_P_IP),
> +                       .is_frag = true,
> +                       .is_first_frag = true,
> +               },
> +       },
> +       {
> +               .name = "ipv6-frag",
> +               .pkt.ipv6_frag = {
> +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
> +                       .iph.nexthdr = IPPROTO_FRAGMENT,
> +                       .iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
> +                       .ipf.nexthdr = IPPROTO_TCP,
> +                       .tcp.doff = 5,
> +                       .tcp.source = 80,
> +                       .tcp.dest = 8080,
> +               },
> +               .keys = {
> +                       .flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> +                       .nhoff = ETH_HLEN,
> +                       .thoff = ETH_HLEN + sizeof(struct ipv6hdr) +
> +                               sizeof(struct frag_hdr),
> +                       .addr_proto = ETH_P_IPV6,
> +                       .ip_proto = IPPROTO_TCP,
> +                       .n_proto = __bpf_constant_htons(ETH_P_IPV6),
> +                       .is_frag = true,
> +                       .is_first_frag = true,
> +                       .sport = 80,
> +                       .dport = 8080,
> +               },
> +               .flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> +       },
> +       {
> +               .name = "ipv6-no-frag",
> +               .pkt.ipv6_frag = {
> +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
> +                       .iph.nexthdr = IPPROTO_FRAGMENT,
> +                       .iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
> +                       .ipf.nexthdr = IPPROTO_TCP,
> +                       .tcp.doff = 5,
> +                       .tcp.source = 80,
> +                       .tcp.dest = 8080,
> +               },
> +               .keys = {
> +                       .nhoff = ETH_HLEN,
> +                       .thoff = ETH_HLEN + sizeof(struct ipv6hdr) +
> +                               sizeof(struct frag_hdr),
> +                       .addr_proto = ETH_P_IPV6,
> +                       .ip_proto = IPPROTO_TCP,
> +                       .n_proto = __bpf_constant_htons(ETH_P_IPV6),
> +                       .is_frag = true,
> +                       .is_first_frag = true,
> +               },
> +       },
>  };
>
>  static int create_tap(const char *ifname)
> @@ -225,6 +339,13 @@ void test_flow_dissector(void)
>                         .data_size_in = sizeof(tests[i].pkt),
>                         .data_out = &flow_keys,
>                 };
> +               static struct bpf_flow_keys ctx = {};
> +
> +               if (tests[i].flags) {
> +                       tattr.ctx_in = &ctx;
> +                       tattr.ctx_size_in = sizeof(ctx);
> +                       ctx.flags = tests[i].flags;
> +               }
>
>                 err = bpf_prog_test_run_xattr(&tattr);
>                 CHECK_ATTR(tattr.data_size_out != sizeof(flow_keys) ||
> @@ -255,6 +376,14 @@ void test_flow_dissector(void)
>                 struct bpf_prog_test_run_attr tattr = {};
>                 __u32 key = 0;
>
> +               /* Don't run tests that are not marked as
> +                * FLOW_DISSECTOR_F_PARSE_1ST_FRAG; eth_get_headlen
> +                * sets this flag.
> +                */
> +
> +               if (tests[i].flags != FLOW_DISSECTOR_F_PARSE_1ST_FRAG)
> +                       continue;

Maybe test (flags & FLOW_DISSECTOR_F_PARSE_1ST_FRAG) == 0 instead?
It is not necessary now, but might be useful in the future.
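
A minimal sketch of the difference, as plain user-space C rather than
selftest code. The flag values are the ones exported in patch 4/7; the
three helper names are made up for illustration. Note that the
parentheses matter: == binds tighter than &, so without them the
expression parses as flags & (FLAG == 0) and never skips anything.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as exported to tools/include/uapi/linux/bpf.h in patch 4/7. */
#define FLOW_DISSECTOR_F_PARSE_1ST_FRAG		(1U << 0)
#define FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL	(1U << 1)

/* Hypothetical helper: the check as written in the patch. Skips every
 * test whose flags are not exactly PARSE_1ST_FRAG.
 */
static bool skip_exact(uint32_t flags)
{
	return flags != FLOW_DISSECTOR_F_PARSE_1ST_FRAG;
}

/* Hypothetical helper: the bit test. Skips only when the
 * PARSE_1ST_FRAG bit is clear, whatever other bits are set.
 */
static bool skip_bit_clear(uint32_t flags)
{
	return (flags & FLOW_DISSECTOR_F_PARSE_1ST_FRAG) == 0;
}

/* Hypothetical helper: what "flags & FLAG == 0" means without the
 * parentheses. FLAG == 0 is false (0), so this is flags & 0.
 */
static bool skip_unparenthesized(uint32_t flags)
{
	return flags & (FLOW_DISSECTOR_F_PARSE_1ST_FRAG == 0);
}
```

With only PARSE_1ST_FRAG in use today, skip_exact() and
skip_bit_clear() agree. They diverge once a test sets a second flag,
e.g. PARSE_1ST_FRAG | STOP_AT_FLOW_LABEL: the exact match would skip
it, the bit test would still run it.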

Thanks,
Song


* Re: [PATCH bpf-next 4/7] tools/bpf: sync bpf_flow_keys flags
From: Song Liu @ 2019-07-24 23:17 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S . Miller, Alexei Starovoitov,
	Daniel Borkmann, Willem de Bruijn, Petar Penkov
In-Reply-To: <20190724170018.96659-5-sdf@google.com>

On Wed, Jul 24, 2019 at 10:11 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> Export bpf_flow_keys flags to tools/libbpf/selftests.
>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Petar Penkov <ppenkov@google.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  tools/include/uapi/linux/bpf.h | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 4e455018da65..a0e1c891b56f 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -3504,6 +3504,10 @@ enum bpf_task_fd_type {
>         BPF_FD_TYPE_URETPROBE,          /* filename + offset */
>  };
>
> +#define FLOW_DISSECTOR_F_PARSE_1ST_FRAG                (1U << 0)
> +#define FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL    (1U << 1)
> +#define FLOW_DISSECTOR_F_STOP_AT_ENCAP         (1U << 2)
> +
>  struct bpf_flow_keys {
>         __u16   nhoff;
>         __u16   thoff;
> @@ -3525,6 +3529,7 @@ struct bpf_flow_keys {
>                         __u32   ipv6_dst[4];    /* in6_addr; network order */
>                 };
>         };
> +       __u32   flags;
>  };
>
>  struct bpf_func_info {
> --
> 2.22.0.657.g960e92d24f-goog
>


* Re: BUG: spinlock recursion in release_sock
From: syzbot @ 2019-07-24 23:14 UTC (permalink / raw)
  To: arvid.brodin, aviadye, borisp, daniel, davejwatson, davem,
	jakub.kicinski, john.fastabend, john.hurley, linux-kernel, netdev,
	simon.horman, syzkaller-bugs, willemb, xiyou.wangcong
In-Reply-To: <000000000000464b54058e722b54@google.com>

syzbot has bisected this bug to:

commit 8822e270d697010e6a4fd42a319dbefc33db91e1
Author: John Hurley <john.hurley@netronome.com>
Date:   Sun Jul 7 14:01:54 2019 +0000

     net: core: move push MPLS functionality from OvS to core helper

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13ca5a5c600000
start commit:   9e6dfe80 Add linux-next specific files for 20190724
git tree:       linux-next
final crash:    https://syzkaller.appspot.com/x/report.txt?x=102a5a5c600000
console output: https://syzkaller.appspot.com/x/log.txt?x=17ca5a5c600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=6cbb8fc2cf2842d7
dashboard link: https://syzkaller.appspot.com/bug?extid=e67cf584b5e6b35a8ffa
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13680594600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15b34144600000

Reported-by: syzbot+e67cf584b5e6b35a8ffa@syzkaller.appspotmail.com
Fixes: 8822e270d697 ("net: core: move push MPLS functionality from OvS to core helper")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


* Re: [PATCH bpf-next 3/6] bpf: keep bpf.h in sync with tools/
From: Andrii Nakryiko @ 2019-07-24 23:10 UTC (permalink / raw)
  To: Brian Vazquez
  Cc: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller, Stanislav Fomichev, Willem de Bruijn,
	Petar Penkov, open list, Networking, bpf
In-Reply-To: <20190724165803.87470-4-brianvv@google.com>

On Wed, Jul 24, 2019 at 10:10 AM Brian Vazquez <brianvv@google.com> wrote:
>
> Adds bpf_attr.dump structure to libbpf.
>
> Suggested-by: Stanislav Fomichev <sdf@google.com>
> Signed-off-by: Brian Vazquez <brianvv@google.com>
> ---
>  tools/include/uapi/linux/bpf.h | 9 +++++++++
>  tools/lib/bpf/libbpf.map       | 2 ++
>  2 files changed, 11 insertions(+)
>
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 4e455018da65f..e127f16e4e932 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -106,6 +106,7 @@ enum bpf_cmd {
>         BPF_TASK_FD_QUERY,
>         BPF_MAP_LOOKUP_AND_DELETE_ELEM,
>         BPF_MAP_FREEZE,
> +       BPF_MAP_DUMP,
>  };
>
>  enum bpf_map_type {
> @@ -388,6 +389,14 @@ union bpf_attr {
>                 __u64           flags;
>         };
>
> +       struct { /* struct used by BPF_MAP_DUMP command */
> +               __aligned_u64   prev_key;
> +               __aligned_u64   buf;
> +               __aligned_u64   buf_len; /* input/output: len of buf */
> +               __u64           flags;
> +               __u32           map_fd;
> +       } dump;
> +
>         struct { /* anonymous struct used by BPF_PROG_LOAD command */
>                 __u32           prog_type;      /* one of enum bpf_prog_type */
>                 __u32           insn_cnt;
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index f9d316e873d8d..cac3723d5c45c 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -183,4 +183,6 @@ LIBBPF_0.0.4 {

LIBBPF_0.0.4 is closed, this needs to go into LIBBPF_0.0.5.

>                 perf_buffer__new;
>                 perf_buffer__new_raw;
>                 perf_buffer__poll;
> +               bpf_map_dump;
> +               bpf_map_dump_flags;

As a general rule, please keep these lists of functions in alphabetical order.

>  } LIBBPF_0.0.3;
> --
> 2.22.0.657.g960e92d24f-goog
>


* Re: [PATCH bpf-next 2/6] bpf: add BPF_MAP_DUMP command to dump more than one entry per call
From: Song Liu @ 2019-07-24 23:04 UTC (permalink / raw)
  To: Brian Vazquez
  Cc: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller, Stanislav Fomichev, Willem de Bruijn,
	Petar Penkov, open list, Networking, bpf
In-Reply-To: <CABCgpaXz4hO=iGoswdqYBECWE5eu2AdUgms=hyfKnqz7E+ZgNg@mail.gmail.com>

On Wed, Jul 24, 2019 at 3:44 PM Brian Vazquez <brianvv.kernel@gmail.com> wrote:
>
> On Wed, Jul 24, 2019 at 2:40 PM Song Liu <liu.song.a23@gmail.com> wrote:
> >
> > On Wed, Jul 24, 2019 at 10:10 AM Brian Vazquez <brianvv@google.com> wrote:
> > >
> > > This introduces a new command to retrieve multiple number of entries
> > > from a bpf map, wrapping the existing bpf methods:
> > > map_get_next_key and map_lookup_elem
> > >
> > > To start dumping the map from the beginning you must specify NULL as
> > > the prev_key.
> > >
> > > The new API returns 0 when it successfully copied all the elements
> > > requested or it copied less because there weren't more elements to
> > > retrieved (i.e err == -ENOENT). In last scenario err will be masked to 0.
> > >
> > > On a successful call buf and buf_len will contain correct data and in
> > > case prev_key was provided (not for the first walk, since prev_key is
> > > NULL) it will contain the last_key copied into the prev_key which will
> > > simplify next call.
> > >
> > > Only when it can't find a single element it will return -ENOENT meaning
> > > that the map has been entirely walked. When an error is return buf,
> > > buf_len and prev_key shouldn't be read nor used.
> > >
> > > Because maps can be called from userspace and kernel code, this function
> > > can have a scenario where the next_key was found but by the time we
> > > try to retrieve the value the element is not there, in this case the
> > > function continues and tries to get a new next_key value, skipping the
> > > deleted key. If at some point the function find itself trap in a loop,
> > > it will return -EINTR.
> > >
> > > The function will try to fit as much as possible in the buf provided and
> > > will return -EINVAL if buf_len is smaller than elem_size.
> > >
> > > QUEUE and STACK maps are not supported.
> > >
> > > Note that map_dump doesn't guarantee that reading the entire table is
> > > consistent since this function is always racing with kernel and user code
> > > but the same behaviour is found when the entire table is walked using
> > > the current interfaces: map_get_next_key + map_lookup_elem.
> > > It is also important to note that with  a locked map, the lock is grabbed
> > > for 1 entry at the time, meaning that the returned buf might or might not
> > > be consistent.
> > >
> > > Suggested-by: Stanislav Fomichev <sdf@google.com>
> > > Signed-off-by: Brian Vazquez <brianvv@google.com>
> > > ---
> > >  include/uapi/linux/bpf.h |   9 +++
> > >  kernel/bpf/syscall.c     | 117 +++++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 126 insertions(+)
> > >
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index fa1c753dcdbc7..66dab5385170d 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -106,6 +106,7 @@ enum bpf_cmd {
> > >         BPF_TASK_FD_QUERY,
> > >         BPF_MAP_LOOKUP_AND_DELETE_ELEM,
> > >         BPF_MAP_FREEZE,
> > > +       BPF_MAP_DUMP,
> > >  };
> > >
> > >  enum bpf_map_type {
> > > @@ -388,6 +389,14 @@ union bpf_attr {
> > >                 __u64           flags;
> > >         };
> > >
> > > +       struct { /* struct used by BPF_MAP_DUMP command */
> > > +               __aligned_u64   prev_key;
> > > +               __aligned_u64   buf;
> > > +               __aligned_u64   buf_len; /* input/output: len of buf */
> > > +               __u64           flags;
> >
> > Please add explanation of flags here.
>
> got it!
>
> > Also, we need to update the
> > comments of BPF_F_LOCK for BPF_MAP_DUMP.
>
> What do you mean? I didn't get this part.

I meant that the current comment says BPF_F_LOCK is for
BPF_MAP_UPDATE_ELEM, but it is also used by BPF_MAP_LOOKUP_ELEM and
BPF_MAP_DUMP, so the current comment is not accurate either.

Maybe fix it while you are at it?
>
> >
> > > +               __u32           map_fd;
> > > +       } dump;
> > > +
> > >         struct { /* anonymous struct used by BPF_PROG_LOAD command */
> > >                 __u32           prog_type;      /* one of enum bpf_prog_type */
> > >                 __u32           insn_cnt;
> > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > index 86cdc2f7bb56e..0c35505aa219f 100644
> > > --- a/kernel/bpf/syscall.c
> > > +++ b/kernel/bpf/syscall.c
> > > @@ -1097,6 +1097,120 @@ static int map_get_next_key(union bpf_attr *attr)
> > >         return err;
> > >  }
> > >
> > > +/* last field in 'union bpf_attr' used by this command */
> > > +#define BPF_MAP_DUMP_LAST_FIELD dump.map_fd
> > > +
> > > +static int map_dump(union bpf_attr *attr)
> > > +{
> > > +       void __user *ukey = u64_to_user_ptr(attr->dump.prev_key);
> > > +       void __user *ubuf = u64_to_user_ptr(attr->dump.buf);
> > > +       u32 __user *ubuf_len = u64_to_user_ptr(attr->dump.buf_len);
> > > +       int ufd = attr->dump.map_fd;
> > > +       struct bpf_map *map;
> > > +       void *buf, *prev_key, *key, *value;
> > > +       u32 value_size, elem_size, buf_len, cp_len;
> > > +       struct fd f;
> > > +       int err;
> > > +       bool first_key = false;
> > > +
> > > +       if (CHECK_ATTR(BPF_MAP_DUMP))
> > > +               return -EINVAL;
> > > +
> > > +       if (attr->dump.flags & ~BPF_F_LOCK)
> > > +               return -EINVAL;
> > > +
> > > +       f = fdget(ufd);
> > > +       map = __bpf_map_get(f);
> > > +       if (IS_ERR(map))
> > > +               return PTR_ERR(map);
> > > +       if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ)) {
> > > +               err = -EPERM;
> > > +               goto err_put;
> > > +       }
> > > +
> > > +       if ((attr->dump.flags & BPF_F_LOCK) &&
> > > +           !map_value_has_spin_lock(map)) {
> > > +               err = -EINVAL;
> > > +               goto err_put;
> > > +       }
> >
> > We can share these lines with map_lookup_elem(). Maybe
> > add another helper function?
>
> Which are the lines you are referring to? the dump.flags? It makes
> sense so that way when a new flag is added you only need to modify
> them in one spot.

I think I misread it. attr->dump.flags is not the same as attr->flags.

So never mind.

>
> > > +
> > > +       if (map->map_type == BPF_MAP_TYPE_QUEUE ||
> > > +           map->map_type == BPF_MAP_TYPE_STACK) {
> > > +               err = -ENOTSUPP;
> > > +               goto err_put;
> > > +       }
> > > +
> > > +       value_size = bpf_map_value_size(map);
> > > +
> > > +       err = get_user(buf_len, ubuf_len);
> > > +       if (err)
> > > +               goto err_put;
> > > +
> > > +       elem_size = map->key_size + value_size;
> > > +       if (buf_len < elem_size) {
> > > +               err = -EINVAL;
> > > +               goto err_put;
> > > +       }
> > > +
> > > +       if (ukey) {
> > > +               prev_key = __bpf_copy_key(ukey, map->key_size);
> > > +               if (IS_ERR(prev_key)) {
> > > +                       err = PTR_ERR(prev_key);
> > > +                       goto err_put;
> > > +               }
> > > +       } else {
> > > +               prev_key = NULL;
> > > +               first_key = true;
> > > +       }
> > > +
> > > +       err = -ENOMEM;
> > > +       buf = kmalloc(elem_size, GFP_USER | __GFP_NOWARN);
> > > +       if (!buf)
> > > +               goto err_put;
> > > +
> > > +       key = buf;
> > > +       value = key + map->key_size;
> > > +       for (cp_len = 0; cp_len + elem_size <= buf_len;) {
> > > +               if (signal_pending(current)) {
> > > +                       err = -EINTR;
> > > +                       break;
> > > +               }
> > > +
> > > +               rcu_read_lock();
> > > +               err = map->ops->map_get_next_key(map, prev_key, key);
> >
> > If prev_key is deleted before map_get_next_key(), we get the first key
> > again. This is pretty weird.
>
> Yes, I know. But note that the current scenario happens even for the
> old interface (imagine you are walking a map from userspace and you
> tried get_next_key the prev_key was removed, you will start again from
> the beginning without noticing it).
> I tried to sent a patch in the past but I was missing some context:
> before NULL was used to get the very first_key the interface relied in
> a random (non existent) key to retrieve the first_key in the map, and
> I was told what we still have to support that scenario.

BPF_MAP_DUMP is slightly different, as you may return the first key
multiple times in the same call. Also, BPF_MAP_DUMP is new, so we
don't have to support legacy scenarios.

Since BPF_MAP_DUMP keeps a list of elements, it is possible to try
to look up previous keys. Would something in this direction work?
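
To make the idea concrete, here is a rough user-space sketch, not
kernel code. Everything in it (toy_map, toy_find, toy_get_next_key,
toy_dump_step) is hypothetical; the toy map only mimics the behaviour
discussed above, where an unknown prev_key restarts the walk from the
first key. It also assumes a stable iteration order and no concurrent
updates between the existence check and the next-key call, which the
real maps do not guarantee.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 8

/* Hypothetical stand-in for a map: keys are iterated in slot order. */
struct toy_map {
	int keys[TOY_SLOTS];
	bool present[TOY_SLOTS];
};

/* Return the slot holding key, or -1 if the key is not in the map. */
static int toy_find(const struct toy_map *m, int key)
{
	for (int i = 0; i < TOY_SLOTS; i++)
		if (m->present[i] && m->keys[i] == key)
			return i;
	return -1;
}

/* Mimics get_next_key: a NULL or unknown prev yields the first key. */
static int toy_get_next_key(const struct toy_map *m, const int *prev,
			    int *next)
{
	int start = prev ? toy_find(m, *prev) + 1 : 0;

	for (int i = start; i < TOY_SLOTS; i++) {
		if (m->present[i]) {
			*next = m->keys[i];
			return 0;
		}
	}
	return -ENOENT;
}

/* Append the next key to dumped[]. If the most recently dumped key has
 * been deleted, walk back through the keys already dumped until one is
 * still in the map, and continue from there instead of restarting.
 */
static int toy_dump_step(const struct toy_map *m, int *dumped,
			 size_t cap, size_t *n)
{
	const int *prev = NULL;
	int next, err;

	if (*n >= cap)
		return -ENOSPC;

	for (size_t i = *n; i > 0; i--) {
		if (toy_find(m, dumped[i - 1]) >= 0) {
			prev = &dumped[i - 1];
			break;
		}
	}

	err = toy_get_next_key(m, prev, &next);
	if (err)
		return err;

	dumped[(*n)++] = next;
	return 0;
}
```

If every key dumped so far has been deleted, the sketch still restarts
from the first key; in this toy that cannot return a duplicate, but a
real implementation would have to decide what to do in that case.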

Thanks,
Song

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH v2] e1000e: Make speed detection on hotplugging cable more reliable
From: Brown, Aaron F @ 2019-07-24 22:52 UTC (permalink / raw)
  To: Kirsher, Jeffrey T, Kai-Heng Feng
  Cc: netdev@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <20190715122555.11922-1-kai.heng.feng@canonical.com>

On Mon, 2019-07-15 at 20:25 +0800, Kai-Heng Feng wrote:
> After hotplugging a 1Gbps ethernet cable with a 1Gbps link partner, the
> MII_BMSR may report 10Mbps, rendering the network rather slow.
> 
> The issue has much lower fail rate after commit 59653e6497d1 ("e1000e:
> Make watchdog use delayed work"), which essentially introduces some
> delay before running the watchdog task.
> 
> But there's still a chance that the hotplugging event and the queued
> watchdog task get run at the same time, in which case the original issue
> can be observed once again.
> 
> So let's use mod_delayed_work() to add a deterministic 1 second delay
> before running watchdog task, after an interrupt.
> 
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)

Tested-by: Aaron Brown <aaron.f.brown@intel.com>

^ permalink raw reply

* Re: [Patch net] netrom: hold sock when setting skb->destructor
From: David Miller @ 2019-07-24 22:49 UTC (permalink / raw)
  To: xiyou.wangcong
  Cc: netdev, syzbot+622bdabb128acc33427d, syzbot+6eaef7158b19e3fec3a0,
	syzbot+9399c158fcc09b21d0d2, syzbot+a34e5f3d0300163f0c87, ralf
In-Reply-To: <20190723034122.23166-1-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Mon, 22 Jul 2019 20:41:22 -0700

> sock_efree() releases the sock refcnt; if we don't hold this refcnt
> when setting skb->destructor to it, the refcnt would not be balanced.
> This leads to several bug reports from syzbot.
> 
> I have checked the other users of sock_efree(); all of them hold the
> sock refcnt.
> 
> Fixes: c8c8218ec5af ("netrom: fix a memory leak in nr_rx_frame()")
> Reported-and-tested-by: <syzbot+622bdabb128acc33427d@syzkaller.appspotmail.com>
> Reported-and-tested-by: <syzbot+6eaef7158b19e3fec3a0@syzkaller.appspotmail.com>
> Reported-and-tested-by: <syzbot+9399c158fcc09b21d0d2@syzkaller.appspotmail.com>
> Reported-and-tested-by: <syzbot+a34e5f3d0300163f0c87@syzkaller.appspotmail.com>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied and queued up for -stable.

^ permalink raw reply
