Netdev List
 help / color / mirror / Atom feed
* Re: [RFC PATCH] kcm: hold rx mux lock when updating the receive queue.
From: Paolo Abeni @ 2018-06-05 16:06 UTC (permalink / raw)
  To: Tom Herbert, David Miller
  Cc: Linux Kernel Network Developers, Tom Herbert, ktkhai
In-Reply-To: <CALx6S353uk_W8b4ic1NYNBS--z41PT6brkwzPvZZj6J2-yEieg@mail.gmail.com>

Hi,

On Tue, 2018-06-05 at 08:35 -0700, Tom Herbert wrote:
> On Tue, Jun 5, 2018 at 7:53 AM, David Miller <davem@davemloft.net> wrote:
> > From: Paolo Abeni <pabeni@redhat.com>
> > Date: Tue,  5 Jun 2018 12:32:33 +0200
> >
> >> @@ -1157,7 +1158,9 @@ static int kcm_recvmsg(struct socket *sock, struct msghdr *msg,
> >>                       /* Finished with message */
> >>                       msg->msg_flags |= MSG_EOR;
> >>                       KCM_STATS_INCR(kcm->stats.rx_msgs);
> >> +                     spin_lock_bh(&kcm->mux->rx_lock);
> >>                       skb_unlink(skb, &sk->sk_receive_queue);
> >> +                     spin_unlock_bh(&kcm->mux->rx_lock);
> >
> > Hmmm, maybe I don't understand the corruption.
> >
> > But, skb_unlink() takes the sk->sk_receive_queue.lock which should
> > prevent SKB list corruption.
> 
> It looks like there is a case where the list is being manipulated
> without the queue lock. That is in requeue_rx_msgs where
> __skb_dequeue is being called instead of skb_dequeue which is in
> requeue_rx_msgs. requeue_rx_msgs holds the mux rx_lock which would
> explain why the suggested patch avoids the issue.

Yep, I belive this is the correct explanation. Sorry for the noise with
the previous patch, I underlooked the skb_queue lock already in place.

> Paolo, thanks for looking into this! Can you try replacing
> __skb_dequeue in requeue_rx_msgs with skb_dequeue to see if that is
> the fix.

Sure, I'll retrigger the test, and report the result here (or directly
a new patch, should the test be succesful)

Thanks,

Paolo

^ permalink raw reply

* Re: [PATCH v2 net-next 3/3] mlxsw: Add extack messages for port_{un,}split failures
From: Jiri Pirko @ 2018-06-05 15:15 UTC (permalink / raw)
  To: dsahern; +Cc: netdev, idosch, jiri, jakub.kicinski, David Ahern
In-Reply-To: <20180605151411.20310-4-dsahern@kernel.org>

Tue, Jun 05, 2018 at 05:14:11PM CEST, dsahern@kernel.org wrote:
>From: David Ahern <dsahern@gmail.com>
>
>Return messages in extack for port split/unsplit errors. e.g.,
>    $ devlink port split swp1s1 count 4
>    Error: mlxsw_spectrum: Port cannot be split further.
>    devlink answers: Invalid argument
>
>    $ devlink port unsplit swp4
>    Error: mlxsw_spectrum: Port was not split.
>    devlink answers: Invalid argument
>
>Signed-off-by: David Ahern <dsahern@gmail.com>
>Reviewed-by: Ido Schimmel <idosch@mellanox.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>

^ permalink raw reply

* [PATCH v2 net-next 2/3] netdevsim: Add extack error message for devlink reload
From: dsahern @ 2018-06-05 15:14 UTC (permalink / raw)
  To: netdev; +Cc: idosch, jiri, jakub.kicinski, David Ahern
In-Reply-To: <20180605151411.20310-1-dsahern@kernel.org>

From: David Ahern <dsahern@gmail.com>

devlink reset command can fail if a FIB resource limit is set to a value
lower than the current occupancy. Return a proper message indicating the
reason for the failure.

$ devlink resource sh netdevsim/netdevsim0
netdevsim/netdevsim0:
  name IPv4 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
    resources:
      name fib size unlimited occ 43 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
      name fib-rules size unlimited occ 4 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
  name IPv6 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
    resources:
      name fib size unlimited occ 54 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
      name fib-rules size unlimited occ 3 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none

$ devlink resource set netdevsim/netdevsim0 path /IPv4/fib size 40

$ devlink dev  reload netdevsim/netdevsim0
Error: netdevsim: New size is less than current occupancy.
devlink answers: Invalid argument

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/netdevsim/devlink.c   | 4 ++--
 drivers/net/netdevsim/fib.c       | 9 ++++++---
 drivers/net/netdevsim/netdevsim.h | 3 ++-
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/net/netdevsim/devlink.c b/drivers/net/netdevsim/devlink.c
index e8366cf372ff..ba663e5af168 100644
--- a/drivers/net/netdevsim/devlink.c
+++ b/drivers/net/netdevsim/devlink.c
@@ -163,7 +163,7 @@ static int nsim_devlink_reload(struct devlink *devlink,
 
 		err = devlink_resource_size_get(devlink, res_ids[i], &val);
 		if (!err) {
-			err = nsim_fib_set_max(net, res_ids[i], val);
+			err = nsim_fib_set_max(net, res_ids[i], val, extack);
 			if (err)
 				return err;
 		}
@@ -181,7 +181,7 @@ static void nsim_devlink_net_reset(struct net *net)
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(res_ids); ++i) {
-		if (nsim_fib_set_max(net, res_ids[i], (u64)-1)) {
+		if (nsim_fib_set_max(net, res_ids[i], (u64)-1, NULL)) {
 			pr_err("Failed to reset limit for resource %u\n",
 			       res_ids[i]);
 		}
diff --git a/drivers/net/netdevsim/fib.c b/drivers/net/netdevsim/fib.c
index 9bfe9e151e13..f61d094746c0 100644
--- a/drivers/net/netdevsim/fib.c
+++ b/drivers/net/netdevsim/fib.c
@@ -64,7 +64,8 @@ u64 nsim_fib_get_val(struct net *net, enum nsim_resource_id res_id, bool max)
 	return max ? entry->max : entry->num;
 }
 
-int nsim_fib_set_max(struct net *net, enum nsim_resource_id res_id, u64 val)
+int nsim_fib_set_max(struct net *net, enum nsim_resource_id res_id, u64 val,
+		     struct netlink_ext_ack *extack)
 {
 	struct nsim_fib_data *fib_data = net_generic(net, nsim_fib_net_id);
 	struct nsim_fib_entry *entry;
@@ -90,10 +91,12 @@ int nsim_fib_set_max(struct net *net, enum nsim_resource_id res_id, u64 val)
 	/* not allowing a new max to be less than curren occupancy
 	 * --> no means of evicting entries
 	 */
-	if (val < entry->num)
+	if (val < entry->num) {
+		NL_SET_ERR_MSG_MOD(extack, "New size is less than current occupancy");
 		err = -EINVAL;
-	else
+	} else {
 		entry->max = val;
+	}
 
 	return err;
 }
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index 3a8581af3b85..8ca50b72c328 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -126,7 +126,8 @@ void nsim_devlink_exit(void);
 int nsim_fib_init(void);
 void nsim_fib_exit(void);
 u64 nsim_fib_get_val(struct net *net, enum nsim_resource_id res_id, bool max);
-int nsim_fib_set_max(struct net *net, enum nsim_resource_id res_id, u64 val);
+int nsim_fib_set_max(struct net *net, enum nsim_resource_id res_id, u64 val,
+		     struct netlink_ext_ack *extack);
 #else
 static inline int nsim_devlink_setup(struct netdevsim *ns)
 {
-- 
2.11.0

^ permalink raw reply related

* [PATCH v2 net-next 3/3] mlxsw: Add extack messages for port_{un,}split failures
From: dsahern @ 2018-06-05 15:14 UTC (permalink / raw)
  To: netdev; +Cc: idosch, jiri, jakub.kicinski, David Ahern
In-Reply-To: <20180605151411.20310-1-dsahern@kernel.org>

From: David Ahern <dsahern@gmail.com>

Return messages in extack for port split/unsplit errors. e.g.,
    $ devlink port split swp1s1 count 4
    Error: mlxsw_spectrum: Port cannot be split further.
    devlink answers: Invalid argument

    $ devlink port unsplit swp4
    Error: mlxsw_spectrum: Port was not split.
    devlink answers: Invalid argument

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/core.c     | 14 ++++++++++----
 drivers/net/ethernet/mellanox/mlxsw/core.h     |  5 +++--
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 15 ++++++++++++---
 3 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index 7ed38d80bc08..f9c724752a32 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -775,11 +775,14 @@ static int mlxsw_devlink_port_split(struct devlink *devlink,
 {
 	struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
 
-	if (port_index >= mlxsw_core->max_ports)
+	if (port_index >= mlxsw_core->max_ports) {
+		NL_SET_ERR_MSG_MOD(extack, "Port index exceeds maximum number of ports");
 		return -EINVAL;
+	}
 	if (!mlxsw_core->driver->port_split)
 		return -EOPNOTSUPP;
-	return mlxsw_core->driver->port_split(mlxsw_core, port_index, count);
+	return mlxsw_core->driver->port_split(mlxsw_core, port_index, count,
+					      extack);
 }
 
 static int mlxsw_devlink_port_unsplit(struct devlink *devlink,
@@ -788,11 +791,14 @@ static int mlxsw_devlink_port_unsplit(struct devlink *devlink,
 {
 	struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
 
-	if (port_index >= mlxsw_core->max_ports)
+	if (port_index >= mlxsw_core->max_ports) {
+		NL_SET_ERR_MSG_MOD(extack, "Port index exceeds maximum number of ports");
 		return -EINVAL;
+	}
 	if (!mlxsw_core->driver->port_unsplit)
 		return -EOPNOTSUPP;
-	return mlxsw_core->driver->port_unsplit(mlxsw_core, port_index);
+	return mlxsw_core->driver->port_unsplit(mlxsw_core, port_index,
+						extack);
 }
 
 static int
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
index 4a8d4c7f89d9..552cfa29c2f7 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
@@ -274,8 +274,9 @@ struct mlxsw_driver {
 	int (*port_type_set)(struct mlxsw_core *mlxsw_core, u8 local_port,
 			     enum devlink_port_type new_type);
 	int (*port_split)(struct mlxsw_core *mlxsw_core, u8 local_port,
-			  unsigned int count);
-	int (*port_unsplit)(struct mlxsw_core *mlxsw_core, u8 local_port);
+			  unsigned int count, struct netlink_ext_ack *extack);
+	int (*port_unsplit)(struct mlxsw_core *mlxsw_core, u8 local_port,
+			    struct netlink_ext_ack *extack);
 	int (*sb_pool_get)(struct mlxsw_core *mlxsw_core,
 			   unsigned int sb_index, u16 pool_index,
 			   struct devlink_sb_pool_info *pool_info);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index fc39f22e5c70..968b88af2ef5 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -3092,7 +3092,8 @@ static void mlxsw_sp_port_unsplit_create(struct mlxsw_sp *mlxsw_sp,
 }
 
 static int mlxsw_sp_port_split(struct mlxsw_core *mlxsw_core, u8 local_port,
-			       unsigned int count)
+			       unsigned int count,
+			       struct netlink_ext_ack *extack)
 {
 	struct mlxsw_sp *mlxsw_sp = mlxsw_core_driver_priv(mlxsw_core);
 	struct mlxsw_sp_port *mlxsw_sp_port;
@@ -3104,6 +3105,7 @@ static int mlxsw_sp_port_split(struct mlxsw_core *mlxsw_core, u8 local_port,
 	if (!mlxsw_sp_port) {
 		dev_err(mlxsw_sp->bus_info->dev, "Port number \"%d\" does not exist\n",
 			local_port);
+		NL_SET_ERR_MSG_MOD(extack, "Port number does not exist");
 		return -EINVAL;
 	}
 
@@ -3112,11 +3114,13 @@ static int mlxsw_sp_port_split(struct mlxsw_core *mlxsw_core, u8 local_port,
 
 	if (count != 2 && count != 4) {
 		netdev_err(mlxsw_sp_port->dev, "Port can only be split into 2 or 4 ports\n");
+		NL_SET_ERR_MSG_MOD(extack, "Port can only be split into 2 or 4 ports");
 		return -EINVAL;
 	}
 
 	if (cur_width != MLXSW_PORT_MODULE_MAX_WIDTH) {
 		netdev_err(mlxsw_sp_port->dev, "Port cannot be split further\n");
+		NL_SET_ERR_MSG_MOD(extack, "Port cannot be split further");
 		return -EINVAL;
 	}
 
@@ -3125,6 +3129,7 @@ static int mlxsw_sp_port_split(struct mlxsw_core *mlxsw_core, u8 local_port,
 		base_port = local_port;
 		if (mlxsw_sp->ports[base_port + 1]) {
 			netdev_err(mlxsw_sp_port->dev, "Invalid split configuration\n");
+			NL_SET_ERR_MSG_MOD(extack, "Invalid split configuration");
 			return -EINVAL;
 		}
 	} else {
@@ -3132,6 +3137,7 @@ static int mlxsw_sp_port_split(struct mlxsw_core *mlxsw_core, u8 local_port,
 		if (mlxsw_sp->ports[base_port + 1] ||
 		    mlxsw_sp->ports[base_port + 3]) {
 			netdev_err(mlxsw_sp_port->dev, "Invalid split configuration\n");
+			NL_SET_ERR_MSG_MOD(extack, "Invalid split configuration");
 			return -EINVAL;
 		}
 	}
@@ -3153,7 +3159,8 @@ static int mlxsw_sp_port_split(struct mlxsw_core *mlxsw_core, u8 local_port,
 	return err;
 }
 
-static int mlxsw_sp_port_unsplit(struct mlxsw_core *mlxsw_core, u8 local_port)
+static int mlxsw_sp_port_unsplit(struct mlxsw_core *mlxsw_core, u8 local_port,
+				 struct netlink_ext_ack *extack)
 {
 	struct mlxsw_sp *mlxsw_sp = mlxsw_core_driver_priv(mlxsw_core);
 	struct mlxsw_sp_port *mlxsw_sp_port;
@@ -3165,11 +3172,13 @@ static int mlxsw_sp_port_unsplit(struct mlxsw_core *mlxsw_core, u8 local_port)
 	if (!mlxsw_sp_port) {
 		dev_err(mlxsw_sp->bus_info->dev, "Port number \"%d\" does not exist\n",
 			local_port);
+		NL_SET_ERR_MSG_MOD(extack, "Port number does not exist");
 		return -EINVAL;
 	}
 
 	if (!mlxsw_sp_port->split) {
-		netdev_err(mlxsw_sp_port->dev, "Port wasn't split\n");
+		netdev_err(mlxsw_sp_port->dev, "Port was not split\n");
+		NL_SET_ERR_MSG_MOD(extack, "Port was not split");
 		return -EINVAL;
 	}
 
-- 
2.11.0

^ permalink raw reply related

* [PATCH v2 net-next 1/3] devlink: Add extack to reload and port_{un,}split operations
From: dsahern @ 2018-06-05 15:14 UTC (permalink / raw)
  To: netdev; +Cc: idosch, jiri, jakub.kicinski, David Ahern
In-Reply-To: <20180605151411.20310-1-dsahern@kernel.org>

From: David Ahern <dsahern@gmail.com>

Add extack argument to reload, port_split and port_unsplit operations.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/core.c       |  9 ++++++---
 drivers/net/ethernet/netronome/nfp/nfp_devlink.c |  5 +++--
 drivers/net/netdevsim/devlink.c                  |  3 ++-
 include/net/devlink.h                            |  7 ++++---
 net/core/devlink.c                               | 18 ++++++++++--------
 5 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index 8a766fe28fa0..7ed38d80bc08 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -770,7 +770,8 @@ static void mlxsw_core_driver_put(const char *kind)
 
 static int mlxsw_devlink_port_split(struct devlink *devlink,
 				    unsigned int port_index,
-				    unsigned int count)
+				    unsigned int count,
+				    struct netlink_ext_ack *extack)
 {
 	struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
 
@@ -782,7 +783,8 @@ static int mlxsw_devlink_port_split(struct devlink *devlink,
 }
 
 static int mlxsw_devlink_port_unsplit(struct devlink *devlink,
-				      unsigned int port_index)
+				      unsigned int port_index,
+				      struct netlink_ext_ack *extack)
 {
 	struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
 
@@ -963,7 +965,8 @@ mlxsw_devlink_sb_occ_tc_port_bind_get(struct devlink_port *devlink_port,
 						     pool_type, p_cur, p_max);
 }
 
-static int mlxsw_devlink_core_bus_device_reload(struct devlink *devlink)
+static int mlxsw_devlink_core_bus_device_reload(struct devlink *devlink,
+						struct netlink_ext_ack *extack)
 {
 	struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
 	int err;
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
index 71c2edd83031..db463e20a876 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
@@ -92,7 +92,7 @@ nfp_devlink_set_lanes(struct nfp_pf *pf, unsigned int idx, unsigned int lanes)
 
 static int
 nfp_devlink_port_split(struct devlink *devlink, unsigned int port_index,
-		       unsigned int count)
+		       unsigned int count, struct netlink_ext_ack *extack)
 {
 	struct nfp_pf *pf = devlink_priv(devlink);
 	struct nfp_eth_table_port eth_port;
@@ -123,7 +123,8 @@ nfp_devlink_port_split(struct devlink *devlink, unsigned int port_index,
 }
 
 static int
-nfp_devlink_port_unsplit(struct devlink *devlink, unsigned int port_index)
+nfp_devlink_port_unsplit(struct devlink *devlink, unsigned int port_index,
+			 struct netlink_ext_ack *extack)
 {
 	struct nfp_pf *pf = devlink_priv(devlink);
 	struct nfp_eth_table_port eth_port;
diff --git a/drivers/net/netdevsim/devlink.c b/drivers/net/netdevsim/devlink.c
index bef7db5d129a..e8366cf372ff 100644
--- a/drivers/net/netdevsim/devlink.c
+++ b/drivers/net/netdevsim/devlink.c
@@ -147,7 +147,8 @@ static int devlink_resources_register(struct devlink *devlink)
 	return err;
 }
 
-static int nsim_devlink_reload(struct devlink *devlink)
+static int nsim_devlink_reload(struct devlink *devlink,
+			       struct netlink_ext_ack *extack)
 {
 	enum nsim_resource_id res_ids[] = {
 		NSIM_RESOURCE_IPV4_FIB, NSIM_RESOURCE_IPV4_FIB_RULES,
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 9686a1aa4ec9..e336ea9c73df 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -296,12 +296,13 @@ struct devlink_resource {
 #define DEVLINK_RESOURCE_ID_PARENT_TOP 0
 
 struct devlink_ops {
-	int (*reload)(struct devlink *devlink);
+	int (*reload)(struct devlink *devlink, struct netlink_ext_ack *extack);
 	int (*port_type_set)(struct devlink_port *devlink_port,
 			     enum devlink_port_type port_type);
 	int (*port_split)(struct devlink *devlink, unsigned int port_index,
-			  unsigned int count);
-	int (*port_unsplit)(struct devlink *devlink, unsigned int port_index);
+			  unsigned int count, struct netlink_ext_ack *extack);
+	int (*port_unsplit)(struct devlink *devlink, unsigned int port_index,
+			    struct netlink_ext_ack *extack);
 	int (*sb_pool_get)(struct devlink *devlink, unsigned int sb_index,
 			   u16 pool_index,
 			   struct devlink_sb_pool_info *pool_info);
diff --git a/net/core/devlink.c b/net/core/devlink.c
index f75ee022e6b2..22099705cc41 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -702,12 +702,13 @@ static int devlink_nl_cmd_port_set_doit(struct sk_buff *skb,
 	return 0;
 }
 
-static int devlink_port_split(struct devlink *devlink,
-			      u32 port_index, u32 count)
+static int devlink_port_split(struct devlink *devlink, u32 port_index,
+			      u32 count, struct netlink_ext_ack *extack)
 
 {
 	if (devlink->ops && devlink->ops->port_split)
-		return devlink->ops->port_split(devlink, port_index, count);
+		return devlink->ops->port_split(devlink, port_index, count,
+						extack);
 	return -EOPNOTSUPP;
 }
 
@@ -724,14 +725,15 @@ static int devlink_nl_cmd_port_split_doit(struct sk_buff *skb,
 
 	port_index = nla_get_u32(info->attrs[DEVLINK_ATTR_PORT_INDEX]);
 	count = nla_get_u32(info->attrs[DEVLINK_ATTR_PORT_SPLIT_COUNT]);
-	return devlink_port_split(devlink, port_index, count);
+	return devlink_port_split(devlink, port_index, count, info->extack);
 }
 
-static int devlink_port_unsplit(struct devlink *devlink, u32 port_index)
+static int devlink_port_unsplit(struct devlink *devlink, u32 port_index,
+				struct netlink_ext_ack *extack)
 
 {
 	if (devlink->ops && devlink->ops->port_unsplit)
-		return devlink->ops->port_unsplit(devlink, port_index);
+		return devlink->ops->port_unsplit(devlink, port_index, extack);
 	return -EOPNOTSUPP;
 }
 
@@ -745,7 +747,7 @@ static int devlink_nl_cmd_port_unsplit_doit(struct sk_buff *skb,
 		return -EINVAL;
 
 	port_index = nla_get_u32(info->attrs[DEVLINK_ATTR_PORT_INDEX]);
-	return devlink_port_unsplit(devlink, port_index);
+	return devlink_port_unsplit(devlink, port_index, info->extack);
 }
 
 static int devlink_nl_sb_fill(struct sk_buff *msg, struct devlink *devlink,
@@ -2599,7 +2601,7 @@ static int devlink_nl_cmd_reload(struct sk_buff *skb, struct genl_info *info)
 		NL_SET_ERR_MSG_MOD(info->extack, "resources size validation failed");
 		return err;
 	}
-	return devlink->ops->reload(devlink);
+	return devlink->ops->reload(devlink, info->extack);
 }
 
 static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
-- 
2.11.0

^ permalink raw reply related

* [PATCH v2 net-next 0/3] devlink: Add extack messages for reload and port split/unsplit
From: dsahern @ 2018-06-05 15:14 UTC (permalink / raw)
  To: netdev; +Cc: idosch, jiri, jakub.kicinski, David Ahern

From: David Ahern <dsahern@gmail.com>

Patch 1 adds extack arg to reload, port_split and port_unsplit devlink
operations.

Patch 2 adds extack messages for reload operation in netdevsim.

Patch 3 adds extack messages to port split/unsplit in mlxsw driver.

v2
- make the extack messages align with existing dev_err

David Ahern (3):
  devlink: Add extack to reload and port_{un,}split operations
  netdevsim: Add extack error message for devlink reload
  mlxsw: Add extack messages for port_{un,}split failures

 drivers/net/ethernet/mellanox/mlxsw/core.c       | 23 ++++++++++++++++-------
 drivers/net/ethernet/mellanox/mlxsw/core.h       |  5 +++--
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c   | 15 ++++++++++++---
 drivers/net/ethernet/netronome/nfp/nfp_devlink.c |  5 +++--
 drivers/net/netdevsim/devlink.c                  |  7 ++++---
 drivers/net/netdevsim/fib.c                      |  9 ++++++---
 drivers/net/netdevsim/netdevsim.h                |  3 ++-
 include/net/devlink.h                            |  7 ++++---
 net/core/devlink.c                               | 18 ++++++++++--------
 9 files changed, 60 insertions(+), 32 deletions(-)

-- 
2.11.0

^ permalink raw reply

* Re: [PATCH] netfilter: provide udp*_lib_lookup for nf_tproxy
From: Pablo Neira Ayuso @ 2018-06-05 15:12 UTC (permalink / raw)
  To: David Miller
  Cc: arnd, kuznet, yoshfuji, ecklm94, pabeni, willemb, edumazet,
	dsahern, kafai, netdev, linux-kernel
In-Reply-To: <20180605.105453.339908802413146875.davem@davemloft.net>

On Tue, Jun 05, 2018 at 10:54:53AM -0400, David Miller wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> Date: Tue,  5 Jun 2018 13:40:34 +0200
> 
> > It is now possible to enable the libified nf_tproxy modules without
> > also enabling NETFILTER_XT_TARGET_TPROXY, which throws off the
> > ifdef logic in the udp core code:
> > 
> > net/ipv6/netfilter/nf_tproxy_ipv6.o: In function `nf_tproxy_get_sock_v6':
> > nf_tproxy_ipv6.c:(.text+0x1a8): undefined reference to `udp6_lib_lookup'
> > net/ipv4/netfilter/nf_tproxy_ipv4.o: In function `nf_tproxy_get_sock_v4':
> > nf_tproxy_ipv4.c:(.text+0x3d0): undefined reference to `udp4_lib_lookup'
> > 
> > We can actually simplify the conditions now to provide the two functions
> > exactly when they are needed.
> > 
> > Fixes: 45ca4e0cf273 ("netfilter: Libify xt_TPROXY")
> > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> 
> Pablo, I'm going to apply this directly to fix the link failure.
> 
> Thanks Arnd.

Thanks David.

^ permalink raw reply

* Re: [PATCH] net: hns3: remove unused hclgevf_cfg_func_mta_filter
From: David Miller @ 2018-06-05 15:10 UTC (permalink / raw)
  To: arnd
  Cc: yisen.zhuang, salil.mehta, lipeng321, liangfuyun1, linyunsheng,
	shenjian15, wangxi11, netdev, linux-kernel
In-Reply-To: <20180605113904.1193671-1-arnd@arndb.de>

From: Arnd Bergmann <arnd@arndb.de>
Date: Tue,  5 Jun 2018 13:38:21 +0200

> The last patch apparently added a complete replacement for this
> function, but left the old one in place, which now causes a
> harmless warning:
> 
> drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c:731:12: 'hclgevf_cfg_func_mta_filter' defined but not used
> 
> I assume it can be removed.
> 
> Fixes: 3a678b5806e6 ("net: hns3: Optimize the VF's process of updating multicast MAC")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Applied, thanks Arnd.

^ permalink raw reply

* Re: [PATCH net-next 1/2] ipv4: replace ip_hdr() with skb->data for optimization
From: David Miller @ 2018-06-05 15:08 UTC (permalink / raw)
  To: laoar.shao; +Cc: pabeni, edumazet, netdev, linux-kernel
In-Reply-To: <CALOAHbDLJjy-1s7Tyzwuc4WDz5qtqmoZRcKptj-7efo1E-22Vg@mail.gmail.com>

From: Yafang Shao <laoar.shao@gmail.com>
Date: Tue, 5 Jun 2018 20:29:05 +0800

> On Tue, Jun 5, 2018 at 8:20 PM, Paolo Abeni <pabeni@redhat.com> wrote:
>> On Tue, 2018-06-05 at 08:04 -0400, Yafang Shao wrote:
>>> In ip receive path, when ip header hasn't been pulled yet, ip_hdr() and
>>> skb->data are pointing to the same byte.
>>>
>>> In ip output path, when ip header is just pushed, ip_hdr() and skb->data
>>> are pointing to the same byte.
>>>
>>> As ip_hdr() is more expensive than using skb->data, so replace ip_hdr()
>>> with skb->data in these situations for optimization.
>>
>> IMHO this makes the code less readable and more error prone. Which kind
>> of performance improvement do you measure here?
>>
> 
> Correct the cc list.
> 
> Hi Paolo,
> 
> There's a "+" opertaion in ip_hdr(),  using skb->data and avoid this operation.

Paolo is asking what performance improvement did you "measure".

I don't think this can possibly show up in a benchmark at all, and I
agree the code becomes less readable, so I am not applying this,
sorry.

^ permalink raw reply

* net-next is CLOSED
From: David Miller @ 2018-06-05 15:07 UTC (permalink / raw)
  To: netdev; +Cc: linux-wireless, netfilter-devel


I only expect a bpf-next pull request from Alexei and Daniel at
this point.

Thank you.

^ permalink raw reply

* Re: [PATCH net-next 3/3] mlxsw: Add extack messages for port_{un,}split failures?
From: Jiri Pirko @ 2018-06-05 15:05 UTC (permalink / raw)
  To: David Ahern; +Cc: Ido Schimmel, dsahern, netdev, idosch, jiri, jakub.kicinski
In-Reply-To: <41305b81-34ae-7ccf-a309-e66c4ed9bcbb@gmail.com>

Tue, Jun 05, 2018 at 04:58:44PM CEST, dsahern@gmail.com wrote:
>On 6/5/18 1:18 AM, Jiri Pirko wrote:
>> Tue, Jun 05, 2018 at 10:05:28AM CEST, idosch@idosch.org wrote:
>>> On Tue, Jun 05, 2018 at 09:52:30AM +0200, Jiri Pirko wrote:
>>>> Tue, Jun 05, 2018 at 12:15:03AM CEST, dsahern@kernel.org wrote:
>>>>> 	if (!mlxsw_sp_port->split) {
>>>>> 		netdev_err(mlxsw_sp_port->dev, "Port wasn't split\n");
>>>>> +		NL_SET_ERR_MSG_MOD(extack, "Port was not split");
>>>>
>>>> I wonder if we need the dmesg for these as well. Plus it is not the same
>>>> (wasn't/was not) which is maybe confusing. Any objection against the
>>>> original dmesg messages removal?
>>>
>>> We had this discussion about three months ago and decided to keep the
>>> existing messages:
>>> https://marc.info/?l=linux-netdev&m=151982813309466&w=2
>> 
>> I forgot. Thanks for reminding me. So could we at least have the
>> messages 100% same? Thanks.
>> 
>
>ok if I convert the current message to 'was not' and avoid the
>contraction in messages?

Sure.

^ permalink raw reply

* Re: [PATCH net-next 3/3] mlxsw: Add extack messages for port_{un,}split failures?
From: David Miller @ 2018-06-05 15:05 UTC (permalink / raw)
  To: dsahern; +Cc: jiri, idosch, dsahern, netdev, idosch, jiri, jakub.kicinski
In-Reply-To: <41305b81-34ae-7ccf-a309-e66c4ed9bcbb@gmail.com>

From: David Ahern <dsahern@gmail.com>
Date: Tue, 5 Jun 2018 07:58:44 -0700

> On 6/5/18 1:18 AM, Jiri Pirko wrote:
>> Tue, Jun 05, 2018 at 10:05:28AM CEST, idosch@idosch.org wrote:
>>> On Tue, Jun 05, 2018 at 09:52:30AM +0200, Jiri Pirko wrote:
>>>> Tue, Jun 05, 2018 at 12:15:03AM CEST, dsahern@kernel.org wrote:
>>>>> 	if (!mlxsw_sp_port->split) {
>>>>> 		netdev_err(mlxsw_sp_port->dev, "Port wasn't split\n");
>>>>> +		NL_SET_ERR_MSG_MOD(extack, "Port was not split");
>>>>
>>>> I wonder if we need the dmesg for these as well. Plus it is not the same
>>>> (wasn't/was not) which is maybe confusing. Any objection against the
>>>> original dmesg messages removal?
>>>
>>> We had this discussion about three months ago and decided to keep the
>>> existing messages:
>>> https://marc.info/?l=linux-netdev&m=151982813309466&w=2
>> 
>> I forgot. Thanks for reminding me. So could we at least have the
>> messages 100% same? Thanks.
>> 
> 
> ok if I convert the current message to 'was not' and avoid the
> contraction in messages?

Sure.

^ permalink raw reply

* Re: [PATCH net-next 3/3] mlxsw: Add extack messages for port_{un,}split failures?
From: David Ahern @ 2018-06-05 14:58 UTC (permalink / raw)
  To: Jiri Pirko, Ido Schimmel; +Cc: dsahern, netdev, idosch, jiri, jakub.kicinski
In-Reply-To: <20180605081836.GD2164@nanopsycho>

On 6/5/18 1:18 AM, Jiri Pirko wrote:
> Tue, Jun 05, 2018 at 10:05:28AM CEST, idosch@idosch.org wrote:
>> On Tue, Jun 05, 2018 at 09:52:30AM +0200, Jiri Pirko wrote:
>>> Tue, Jun 05, 2018 at 12:15:03AM CEST, dsahern@kernel.org wrote:
>>>> 	if (!mlxsw_sp_port->split) {
>>>> 		netdev_err(mlxsw_sp_port->dev, "Port wasn't split\n");
>>>> +		NL_SET_ERR_MSG_MOD(extack, "Port was not split");
>>>
>>> I wonder if we need the dmesg for these as well. Plus it is not the same
>>> (wasn't/was not) which is maybe confusing. Any objection against the
>>> original dmesg messages removal?
>>
>> We had this discussion about three months ago and decided to keep the
>> existing messages:
>> https://marc.info/?l=linux-netdev&m=151982813309466&w=2
> 
> I forgot. Thanks for reminding me. So could we at least have the
> messages 100% same? Thanks.
> 

ok if I convert the current message to 'was not' and avoid the
contraction in messages?

^ permalink raw reply

* Re: [PATCH] netfilter: provide udp*_lib_lookup for nf_tproxy
From: David Miller @ 2018-06-05 14:54 UTC (permalink / raw)
  To: arnd
  Cc: pablo, kuznet, yoshfuji, ecklm94, pabeni, willemb, edumazet,
	dsahern, kafai, netdev, linux-kernel
In-Reply-To: <20180605114056.1239571-1-arnd@arndb.de>

From: Arnd Bergmann <arnd@arndb.de>
Date: Tue,  5 Jun 2018 13:40:34 +0200

> It is now possible to enable the libified nf_tproxy modules without
> also enabling NETFILTER_XT_TARGET_TPROXY, which throws off the
> ifdef logic in the udp core code:
> 
> net/ipv6/netfilter/nf_tproxy_ipv6.o: In function `nf_tproxy_get_sock_v6':
> nf_tproxy_ipv6.c:(.text+0x1a8): undefined reference to `udp6_lib_lookup'
> net/ipv4/netfilter/nf_tproxy_ipv4.o: In function `nf_tproxy_get_sock_v4':
> nf_tproxy_ipv4.c:(.text+0x3d0): undefined reference to `udp4_lib_lookup'
> 
> We can actually simplify the conditions now to provide the two functions
> exactly when they are needed.
> 
> Fixes: 45ca4e0cf273 ("netfilter: Libify xt_TPROXY")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Pablo, I'm going to apply this directly to fix the link failure.

Thanks Arnd.

^ permalink raw reply

* Re: Qualcomm rmnet driver and qmi_wwan
From: Dan Williams @ 2018-06-05 14:54 UTC (permalink / raw)
  To: Daniele Palmas, Subash Abhinov Kasiviswanathan; +Cc: netdev
In-Reply-To: <CAGRyCJH06H_MLgZX4vd21F0SvjD5DhGwW4h4su_6AOKPCiD_8Q@mail.gmail.com>

On Tue, 2018-06-05 at 11:38 +0200, Daniele Palmas wrote:
> Hi,
> 
> 2018-02-21 20:47 GMT+01:00 Subash Abhinov Kasiviswanathan
> <subashab@codeaurora.org>:
> > On 2018-02-21 04:38, Daniele Palmas wrote:
> > > 
> > > Hello,
> > > 
> > > in rmnet kernel documentation I read:
> > > 
> > > "This driver can be used to register onto any physical network
> > > device in
> > > IP mode. Physical transports include USB, HSIC, PCIe and IP
> > > accelerator."
> > > 
> > > Does this mean that it can be used in association with the
> > > qmi_wwan
> > > driver?
> > > 
> > > If yes, can someone give me an hint on the steps to follow?
> > > 
> > > If not, does anyone know if it is possible to modify qmi_wwan in
> > > order
> > > to take advantage of the features provided by the rmnet driver?
> > > 
> > > In this case hint on the changes for modifying qmi_wwan are
> > > welcome.
> > > 
> > > Thanks in advance,
> > > Daniele
> > 
> > 
> > Hi
> > 
> > I havent used qmi_wwan so the following comment is based on code
> > inspection.
> > qmimux_register_device() is creating qmimux devices with usb net
> > device as
> > real_dev. The Multiplexing and aggregation header (qmimux_hdr) is
> > stripped
> > off
> > in qmimux_rx_fixup() and the packet is passed on to stack.
> > 
> > You could instead create rmnet devices with the usb netdevice as
> > real dev.
> > The packets from the usb net driver can be queued to network stack
> > directly
> > as rmnet driver will setup a RX handler. rmnet driver will process
> > the
> > packets
> > further and then queue to network stack.
> > 
> 
> in kernel documentation I read that rmnet user space configuration is
> done through librmnetctl available at
> 
> https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource
> /dataservices/tree/rmnetctl
> 
> However it seems to me that this is a bit outdated (e.g. it does not
> properly build since it is looking for kernel header
> linux/rmnet_data.h that, as far as I understand, is no more present).
> 
> Is there available a more recent version of the tool?

I'd expect that somebody (Subash?) would add support for the
rmnet/qmimux options to iproute2 via 'ip link' like exists for most
other device types.

Dan

> Thanks,
> Daniele
> 
> > --
> > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> > a Linux Foundation Collaborative Project

^ permalink raw reply

* Re: [RFC PATCH] kcm: hold rx mux lock when updating the receive queue.
From: David Miller @ 2018-06-05 14:53 UTC (permalink / raw)
  To: pabeni; +Cc: netdev, tom, ktkhai
In-Reply-To: <fa80bc9f24e40e1a7a7fa1452330b7f0b7d6e1fe.1528194606.git.pabeni@redhat.com>

From: Paolo Abeni <pabeni@redhat.com>
Date: Tue,  5 Jun 2018 12:32:33 +0200

> @@ -1157,7 +1158,9 @@ static int kcm_recvmsg(struct socket *sock, struct msghdr *msg,
>  			/* Finished with message */
>  			msg->msg_flags |= MSG_EOR;
>  			KCM_STATS_INCR(kcm->stats.rx_msgs);
> +			spin_lock_bh(&kcm->mux->rx_lock);
>  			skb_unlink(skb, &sk->sk_receive_queue);
> +			spin_unlock_bh(&kcm->mux->rx_lock);

Hmmm, maybe I don't understand the corruption.

But, skb_unlink() takes the sk->sk_receive_queue.lock which should
prevent SKB list corruption.

^ permalink raw reply

* [PATCH bpf-next v4 2/2] samples/bpf: Add xdp_sample_pkts example
From: Toke Høiland-Jørgensen @ 2018-06-05 14:50 UTC (permalink / raw)
  To: netdev
In-Reply-To: <152821020087.23694.8231039605257373797.stgit@alrua-kau>

Add an example program showing how to sample packets from XDP using the
perf event buffer. The example userspace program just prints the ethernet
header for every packet sampled.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
 samples/bpf/Makefile               |    4 +
 samples/bpf/xdp_sample_pkts_kern.c |   62 +++++++++++++
 samples/bpf/xdp_sample_pkts_user.c |  176 ++++++++++++++++++++++++++++++++++++
 3 files changed, 242 insertions(+)
 create mode 100644 samples/bpf/xdp_sample_pkts_kern.c
 create mode 100644 samples/bpf/xdp_sample_pkts_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 1303af10e54d..9ea2f7b64869 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -52,6 +52,7 @@ hostprogs-y += xdp_adjust_tail
 hostprogs-y += xdpsock
 hostprogs-y += xdp_fwd
 hostprogs-y += task_fd_query
+hostprogs-y += xdp_sample_pkts
 
 # Libbpf dependencies
 LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a
@@ -107,6 +108,7 @@ xdp_adjust_tail-objs := xdp_adjust_tail_user.o
 xdpsock-objs := bpf_load.o xdpsock_user.o
 xdp_fwd-objs := bpf_load.o xdp_fwd_user.o
 task_fd_query-objs := bpf_load.o task_fd_query_user.o $(TRACE_HELPERS)
+xdp_sample_pkts-objs := xdp_sample_pkts_user.o $(TRACE_HELPERS)
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -163,6 +165,7 @@ always += xdp_adjust_tail_kern.o
 always += xdpsock_kern.o
 always += xdp_fwd_kern.o
 always += task_fd_query_kern.o
+always += xdp_sample_pkts_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 HOSTCFLAGS += -I$(srctree)/tools/lib/
@@ -179,6 +182,7 @@ HOSTCFLAGS_spintest_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_trace_event_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_sampleip_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_task_fd_query_user.o += -I$(srctree)/tools/lib/bpf/
+HOSTCFLAGS_xdp_sample_pkts_user.o += -I$(srctree)/tools/lib/bpf/
 
 HOST_LOADLIBES		+= $(LIBBPF) -lelf
 HOSTLOADLIBES_tracex4		+= -lrt
diff --git a/samples/bpf/xdp_sample_pkts_kern.c b/samples/bpf/xdp_sample_pkts_kern.c
new file mode 100644
index 000000000000..4560522ca015
--- /dev/null
+++ b/samples/bpf/xdp_sample_pkts_kern.c
@@ -0,0 +1,62 @@
+#include <linux/ptrace.h>
+#include <linux/version.h>
+#include <uapi/linux/bpf.h>
+#include "bpf_helpers.h"
+
+#define SAMPLE_SIZE 64ul
+#define MAX_CPUS 24
+
+#define bpf_printk(fmt, ...)					\
+({								\
+	       char ____fmt[] = fmt;				\
+	       bpf_trace_printk(____fmt, sizeof(____fmt),	\
+				##__VA_ARGS__);			\
+})
+
+struct bpf_map_def SEC("maps") my_map = {
+	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
+	.key_size = sizeof(int),
+	.value_size = sizeof(u32),
+	.max_entries = MAX_CPUS,
+};
+
+SEC("xdp_sample")
+int xdp_sample_prog(struct xdp_md *ctx)
+{
+	void *data_end = (void *)(long)ctx->data_end;
+	void *data = (void *)(long)ctx->data;
+
+        /* Metadata will be in the perf event before the packet data. */
+	struct S {
+		u16 cookie;
+		u16 pkt_len;
+	} __attribute__((packed)) metadata;
+
+	if (data + SAMPLE_SIZE < data_end) {
+		/* The XDP perf_event_output handler will use the upper 32 bits
+		 * of the flags argument as a number of bytes to include of the
+		 * packet payload in the event data. If the size is too big, the
+		 * call to bpf_perf_event_output will fail and return -EFAULT.
+		 *
+		 * See bpf_xdp_event_output in net/core/filter.c.
+		 *
+		 * The BPF_F_CURRENT_CPU flag means that the event output fd
+		 * will be indexed by the CPU number in the event map.
+		 */
+		u64 flags = (SAMPLE_SIZE << 32) | BPF_F_CURRENT_CPU;
+		int ret;
+
+		metadata.cookie = 0xdead;
+		metadata.pkt_len = (u16)(data_end - data);
+
+		ret = bpf_perf_event_output(ctx, &my_map, flags,
+				      &metadata, sizeof(metadata));
+		if(ret)
+			bpf_printk("perf_event_output failed: %d\n", ret);
+	}
+
+	return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
+u32 _version SEC("version") = LINUX_VERSION_CODE;
diff --git a/samples/bpf/xdp_sample_pkts_user.c b/samples/bpf/xdp_sample_pkts_user.c
new file mode 100644
index 000000000000..672392d48ce3
--- /dev/null
+++ b/samples/bpf/xdp_sample_pkts_user.c
@@ -0,0 +1,176 @@
+/* This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#include <stdio.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <fcntl.h>
+#include <poll.h>
+#include <linux/perf_event.h>
+#include <linux/bpf.h>
+#include <net/if.h>
+#include <errno.h>
+#include <assert.h>
+#include <sys/sysinfo.h>
+#include <sys/syscall.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <time.h>
+#include <signal.h>
+#include <libbpf.h>
+#include <bpf/bpf.h>
+
+#include "perf-sys.h"
+#include "trace_helpers.h"
+
+#define MAX_CPUS 24
+static int pmu_fds[MAX_CPUS], if_idx = 0;
+static struct perf_event_mmap_page *headers[MAX_CPUS];
+static char *if_name;
+
+static int do_attach(int idx, int fd, const char *name)
+{
+	int err;
+
+	err = bpf_set_link_xdp_fd(idx, fd, 0);
+	if (err < 0)
+		printf("ERROR: failed to attach program to %s\n", name);
+
+	return err;
+}
+
+static int do_detach(int idx, const char *name)
+{
+	int err;
+
+	err = bpf_set_link_xdp_fd(idx, -1, 0);
+	if (err < 0)
+		printf("ERROR: failed to detach program from %s\n", name);
+
+	return err;
+}
+
+#define SAMPLE_SIZE 64
+
+static int print_bpf_output(void *data, int size)
+{
+	struct {
+		__u16 cookie;
+		__u16 pkt_len;
+		__u8  pkt_data[SAMPLE_SIZE];
+	} __attribute__((packed)) *e = data;
+	int i;
+
+	if (e->cookie != 0xdead) {
+		printf("BUG cookie %x sized %d\n",
+		       e->cookie, size);
+		return LIBBPF_PERF_EVENT_ERROR;
+	}
+
+	printf("Pkt len: %-5d bytes. Ethernet hdr: ", e->pkt_len);
+	for (i = 0; i < 14 && i < e->pkt_len; i++)
+		printf("%02x ", e->pkt_data[i]);
+	printf("\n");
+
+	return LIBBPF_PERF_EVENT_CONT;
+}
+
+static void test_bpf_perf_event(int map_fd, int num)
+{
+	struct perf_event_attr attr = {
+		.sample_type = PERF_SAMPLE_RAW,
+		.type = PERF_TYPE_SOFTWARE,
+		.config = PERF_COUNT_SW_BPF_OUTPUT,
+		.wakeup_events = 1, /* get an fd notification for every event */
+	};
+	int i;
+
+	for (i = 0; i < num; i++) {
+		int key = i;
+
+		pmu_fds[i] = sys_perf_event_open(&attr, -1/*pid*/, i/*cpu*/, -1/*group_fd*/, 0);
+
+		assert(pmu_fds[i] >= 0);
+		assert(bpf_map_update_elem(map_fd, &key, &pmu_fds[i], BPF_ANY) == 0);
+		ioctl(pmu_fds[i], PERF_EVENT_IOC_ENABLE, 0);
+	}
+}
+
+static void sig_handler(int signo)
+{
+	do_detach(if_idx, if_name);
+	exit(0);
+}
+
+int main(int argc, char **argv)
+{
+	struct bpf_prog_load_attr prog_load_attr = {
+		.prog_type	= BPF_PROG_TYPE_XDP,
+	};
+	struct bpf_object *obj;
+	struct bpf_map *map;
+	int prog_fd, map_fd;
+	char filename[256];
+	int ret, err, i;
+	int numcpus;
+
+	if (argc < 2) {
+		printf("Usage: %s <ifname>\n", argv[0]);
+		return 1;
+	}
+
+	numcpus = get_nprocs();
+	if (numcpus > MAX_CPUS)
+		numcpus = MAX_CPUS;
+
+	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+	prog_load_attr.file = filename;
+
+	if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
+		return 1;
+
+	if (!prog_fd) {
+		printf("load_bpf_file: %s\n", strerror(errno));
+		return 1;
+	}
+
+	map = bpf_map__next(NULL, obj);
+	if (!map) {
+		printf("finding a map in obj file failed\n");
+		return 1;
+	}
+	map_fd = bpf_map__fd(map);
+
+	if_idx = if_nametoindex(argv[1]);
+	if (!if_idx)
+		if_idx = strtoul(argv[1], NULL, 0);
+
+	if (!if_idx) {
+		fprintf(stderr, "Invalid ifname\n");
+		return 1;
+	}
+	if_name = argv[1];
+	err = do_attach(if_idx, prog_fd, argv[1]);
+	if (err)
+		return err;
+
+	if (signal(SIGINT, sig_handler) ||
+	    signal(SIGHUP, sig_handler) ||
+	    signal(SIGTERM, sig_handler)) {
+		perror("signal");
+		return 1;
+	}
+
+	test_bpf_perf_event(map_fd, numcpus);
+
+	for (i = 0; i < numcpus; i++)
+		if (perf_event_mmap_header(pmu_fds[i], &headers[i]) < 0)
+			return 1;
+
+	ret = perf_event_poller_multi(pmu_fds, headers, numcpus, print_bpf_output);
+	kill(0, SIGINT);
+	return ret;
+}

^ permalink raw reply related

* [PATCH bpf-next v4 1/2] trace_helpers.c: Add helpers to poll multiple perf FDs for events
From: Toke Høiland-Jørgensen @ 2018-06-05 14:50 UTC (permalink / raw)
  To: netdev

Add two new helper functions to trace_helpers that supports polling
multiple perf file descriptors for events. These are used to the XDP
perf_event_output example, which needs to work with one perf fd per CPU.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
 tools/testing/selftests/bpf/trace_helpers.c |   47 ++++++++++++++++++++++++++-
 tools/testing/selftests/bpf/trace_helpers.h |    4 ++
 2 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
index 3868dcb63420..1e62d89f34cf 100644
--- a/tools/testing/selftests/bpf/trace_helpers.c
+++ b/tools/testing/selftests/bpf/trace_helpers.c
@@ -88,7 +88,7 @@ static int page_size;
 static int page_cnt = 8;
 static struct perf_event_mmap_page *header;
 
-int perf_event_mmap(int fd)
+int perf_event_mmap_header(int fd, struct perf_event_mmap_page **header)
 {
 	void *base;
 	int mmap_size;
@@ -102,10 +102,15 @@ int perf_event_mmap(int fd)
 		return -1;
 	}
 
-	header = base;
+	*header = base;
 	return 0;
 }
 
+int perf_event_mmap(int fd)
+{
+	return perf_event_mmap_header(fd, &header);
+}
+
 static int perf_event_poll(int fd)
 {
 	struct pollfd pfd = { .fd = fd, .events = POLLIN };
@@ -163,3 +168,41 @@ int perf_event_poller(int fd, perf_event_print_fn output_fn)
 
 	return ret;
 }
+
+int perf_event_poller_multi(int *fds, struct perf_event_mmap_page **headers,
+			    int num_fds, perf_event_print_fn output_fn)
+{
+	enum bpf_perf_event_ret ret;
+	struct pollfd *pfds;
+	void *buf = NULL;
+	size_t len = 0;
+	int i;
+
+	pfds = malloc(sizeof(*pfds) * num_fds);
+	if (!pfds)
+		return -1;
+
+	memset(pfds, 0, sizeof(*pfds) * num_fds);
+	for (i = 0; i < num_fds; i++) {
+		pfds[i].fd = fds[i];
+		pfds[i].events = POLLIN;
+	}
+
+	for (;;) {
+		poll(pfds, num_fds, 1000);
+		for (i = 0; i < num_fds; i++) {
+			if (pfds[i].revents) {
+				ret = bpf_perf_event_read_simple(headers[i], page_cnt * page_size,
+								page_size, &buf, &len,
+								bpf_perf_event_print,
+								output_fn);
+				if (ret != LIBBPF_PERF_EVENT_CONT)
+					break;
+			}
+		}
+	}
+	free(buf);
+	free(pfds);
+
+	return ret;
+}
diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
index 3b4bcf7f5084..18924f23db1b 100644
--- a/tools/testing/selftests/bpf/trace_helpers.h
+++ b/tools/testing/selftests/bpf/trace_helpers.h
@@ -3,6 +3,7 @@
 #define __TRACE_HELPER_H
 
 #include <libbpf.h>
+#include <linux/perf_event.h>
 
 struct ksym {
 	long addr;
@@ -16,6 +17,9 @@ long ksym_get_addr(const char *name);
 typedef enum bpf_perf_event_ret (*perf_event_print_fn)(void *data, int size);
 
 int perf_event_mmap(int fd);
+int perf_event_mmap_header(int fd, struct perf_event_mmap_page **header);
 /* return LIBBPF_PERF_EVENT_DONE or LIBBPF_PERF_EVENT_ERROR */
 int perf_event_poller(int fd, perf_event_print_fn output_fn);
+int perf_event_poller_multi(int *fds, struct perf_event_mmap_page **headers,
+			    int num_fds, perf_event_print_fn output_fn);
 #endif

^ permalink raw reply related

* Re: [PATCH net-next] qed*: Utilize FW 8.37.2.0
From: David Miller @ 2018-06-05 14:48 UTC (permalink / raw)
  To: Michal.Kalderon
  Cc: netdev, linux-rdma, linux-scsi, Ariel.Elior, manish.rangankar
In-Reply-To: <20180605101116.30292-1-Michal.Kalderon@cavium.com>

From: Michal Kalderon <Michal.Kalderon@cavium.com>
Date: Tue, 5 Jun 2018 13:11:16 +0300

> This FW contains several fixes and features.
> 
> RDMA
> - Several modifications and fixes for Memory Windows
> - drop vlan and tcp timestamp from mss calculation in driver for
>   this FW
> - Fix SQ completion flow when local ack timeout is infinite
> - Modifications in t10dif support
> 
> ETH
> - Fix aRFS for tunneled traffic without inner IP.
> - Fix chip configuration which may fail under heavy traffic conditions.
> - Support receiving any-VNI in VXLAN and GENEVE RX classification.
> 
> iSCSI / FcoE
> - Fix iSCSI recovery flow
> - Drop vlan and tcp timestamp from mss calc for fw 8.37.2.0
> 
> Misc
> - Several registers (split registers) won't read correctly with
>   ethtool -d
> 
> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
> Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH] net-tcp: remove useless tw_timeout field
From: David Miller @ 2018-06-05 14:45 UTC (permalink / raw)
  To: eric.dumazet; +Cc: zenczykowski, maze, edumazet, netdev
In-Reply-To: <ed973e26-eb29-362f-fc86-eb0955b978ad@gmail.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 5 Jun 2018 05:59:27 -0700

> On 06/05/2018 03:07 AM, Maciej Żenczykowski wrote:
>> From: Maciej Żenczykowski <maze@google.com>
>> 
>> Tested: 'git grep tw_timeout' comes up empty and it builds :-)
>> 
>> Signed-off-by: Maciej Żenczykowski <maze@google.com>
>> Cc: Eric Dumazet <edumazet@google.com>
> 
> This field became no longer needed when tcp_tw_recycle was removed in linux-4.12
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks everyone.

^ permalink raw reply

* Re: [PATCH bpf-next v3 1/2] trace_helpers.c: Add helpers to poll multiple perf FDs for events
From: Toke Høiland-Jørgensen @ 2018-06-05 14:44 UTC (permalink / raw)
  To: Daniel Borkmann, netdev
In-Reply-To: <d23ef7fd-27d4-2f9b-b5cd-cb0048514c4a@iogearbox.net>

Daniel Borkmann <daniel@iogearbox.net> writes:

> Hi Toke,
>
> On 06/05/2018 01:14 PM, Toke Høiland-Jørgensen wrote:
>> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
>
> Please no empty commit message. Not sure why from the previous patch
> you removed it here.

Ah, right, sorry; think I got patch versions mixed up :/

Will resend

-Toke

^ permalink raw reply

* Re: [PATCH] netfilter: provide udp*_lib_lookup for nf_tproxy
From: Eckl, Máté @ 2018-06-05 14:42 UTC (permalink / raw)
  To: arnd
  Cc: pablo, davem, kuznet, yoshfuji, pabeni, willemb, edumazet,
	dsahern, kafai, netdev, linux-kernel
In-Reply-To: <20180605114056.1239571-1-arnd@arndb.de>

Arnd Bergmann <arnd@arndb.de> ezt írta (időpont: 2018. jún. 5., K, 13:41):
>
> It is now possible to enable the libified nf_tproxy modules without
> also enabling NETFILTER_XT_TARGET_TPROXY, which throws off the
> ifdef logic in the udp core code:
>
> net/ipv6/netfilter/nf_tproxy_ipv6.o: In function `nf_tproxy_get_sock_v6':
> nf_tproxy_ipv6.c:(.text+0x1a8): undefined reference to `udp6_lib_lookup'
> net/ipv4/netfilter/nf_tproxy_ipv4.o: In function `nf_tproxy_get_sock_v4':
> nf_tproxy_ipv4.c:(.text+0x3d0): undefined reference to `udp4_lib_lookup'
>
> We can actually simplify the conditions now to provide the two functions
> exactly when they are needed.
>
> Fixes: 45ca4e0cf273 ("netfilter: Libify xt_TPROXY")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Máté Eckl <ecklm94@gmail.com>

^ permalink raw reply

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
From: David Miller @ 2018-06-05 14:30 UTC (permalink / raw)
  To: paulb
  Cc: jiri, xiyou.wangcong, jhs, netdev, kliteyn, roid, shahark, markb,
	ogerlitz
In-Reply-To: <1528185843-18645-1-git-send-email-paulb@mellanox.com>

From: Paul Blakey <paulb@mellanox.com>
Date: Tue,  5 Jun 2018 11:04:03 +0300

> When using a vxlan device as the ingress dev, we count it as a
> "no offload dev", so when such a rule comes and err stop is true,
> we fail early and don't try the egdev route which can offload it
> through the egress device.
> 
> Fix that by not calling the block offload if one of the devices
> attached to it is not offload capable, but make sure egress on such case
> is capable instead.
> 
> Fixes: caa7260156eb ("net: sched: keep track of offloaded filters [..]")
> Reviewed-by: Roi Dayan <roid@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>
> Signed-off-by: Paul Blakey <paulb@mellanox.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH net-next 3/3] mlxsw: Add extack messages for port_{un,}split failures?
From: David Miller @ 2018-06-05 14:24 UTC (permalink / raw)
  To: jiri; +Cc: idosch, dsahern, netdev, idosch, jiri, jakub.kicinski, dsahern
In-Reply-To: <20180605081836.GD2164@nanopsycho>

From: Jiri Pirko <jiri@resnulli.us>
Date: Tue, 5 Jun 2018 10:18:36 +0200

> Tue, Jun 05, 2018 at 10:05:28AM CEST, idosch@idosch.org wrote:
>>On Tue, Jun 05, 2018 at 09:52:30AM +0200, Jiri Pirko wrote:
>>> Tue, Jun 05, 2018 at 12:15:03AM CEST, dsahern@kernel.org wrote:
>>> > 	if (!mlxsw_sp_port->split) {
>>> > 		netdev_err(mlxsw_sp_port->dev, "Port wasn't split\n");
>>> >+		NL_SET_ERR_MSG_MOD(extack, "Port was not split");
>>> 
>>> I wonder if we need the dmesg for these as well. Plus it is not the same
>>> (wasn't/was not) which is maybe confusing. Any objection against the
>>> original dmesg messages removal?
>>
>>We had this discussion about three months ago and decided to keep the
>>existing messages:
>>https://marc.info/?l=linux-netdev&m=151982813309466&w=2
> 
> I forgot. Thanks for reminding me. So could we at least have the
> messages 100% same? Thanks.

Seems like a reasonable request, David A.?

^ permalink raw reply

* Re: [PATCH net] sctp: not allow transport timeout value less than HZ/5 for hb_timer
From: David Miller @ 2018-06-05 14:23 UTC (permalink / raw)
  To: lucien.xin
  Cc: netdev, linux-sctp, edumazet, marcelo.leitner, nhorman, dvyukov,
	syzkaller
In-Reply-To: <97b99fac474db414ea8486a1fbd3a37dacd4b1b1.1528172218.git.lucien.xin@gmail.com>

From: Xin Long <lucien.xin@gmail.com>
Date: Tue,  5 Jun 2018 12:16:58 +0800

> syzbot reported a rcu_sched self-detected stall on CPU which is caused
> by too small value set on rto_min with SCTP_RTOINFO sockopt. With this
> value, hb_timer will get stuck there, as in its timer handler it starts
> this timer again with this value, then goes to the timer handler again.
> 
> This problem is there since very beginning, and thanks to Eric for the
> reproducer shared from a syzbot mail.
> 
> This patch fixes it by not allowing sctp_transport_timeout to return a
> smaller value than HZ/5 for hb_timer, which is based on TCP's min rto.
> 
> Note that it doesn't fix this issue by limiting rto_min, as some users
> are still using small rto and no proper value was found for it yet.
> 
> Reported-by: syzbot+3dcd59a1f907245f891f@syzkaller.appspotmail.com
> Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>

Applied and queued up for -stable, thanks Xin.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox