* Re: [BUG] access to null-pointer in dsa_switch_event when bridge set up
From: Andrew Lunn @ 2019-08-12 13:42 UTC (permalink / raw)
To: Frank Wunderlich
Cc: Vivien Didelot, Florian Fainelli, David S. Miller, netdev,
linux-kernel
In-Reply-To: <trinity-99bcd71d-8f78-4bbe-a439-f6a915040b0a-1565606589515@3c-app-gmx-bs80>
On Mon, Aug 12, 2019 at 12:43:09PM +0200, Frank Wunderlich wrote:
> Hi,
>
> i've noticed a bug when using bridge on dsa-ports. Tested on Bpi-r2, Crash happens on 5.3-rc1 and rc4, 5.2-rc7 (last version pre 5.3 i have found on my tftp) is not affected.
Hi Frank
A patch was merged last night with a fix for dsa_port_mdb_add. The
call stack looks the same. So i think this is fixed.
Andrew
> [ 115.038406] [<c09f28c0>] (dsa_switch_event) from [<c014d4f8>] (notifier_call_chain+0x58/0x94)
> [ 115.046940] r10:00000000 r9:c09f1dd0 r8:00000000 r7:00000005 r6:e71edd54 r5:00000000
> [ 115.054771] r4:ffffffff
> [ 115.057308] [<c014d4a0>] (notifier_call_chain) from [<c014d658>] (raw_notifier_call_chain+0x28/0x30)
> [ 115.066447] r9:c09f1dd0 r8:c09f0740 r7:e71fd800 r6:00000000 r5:c1104c48 r4:c1104c48
> [ 115.074197] [<c014d630>] (raw_notifier_call_chain) from [<c09efe1c>] (dsa_port_mdb_add+0x58/0x84)
> [ 115.083078] [<c09efdc4>] (dsa_port_mdb_add) from [<c09f1e2c>] (dsa_slave_port_obj_add+0x5c/0x78)
> [ 115.091866] r4:e71ede38
> [ 115.094403] [<c09f1dd0>] (dsa_slave_port_obj_add) from [<c0b47cc4>] (__switchdev_handle_port_obj_add+0x64/0xe4)
> [ 115.104499] [<c0b47c60>] (__switchdev_handle_port_obj_add) from [<c0b47d5c>] (switchdev_handle_port_obj_add+0x18/0x24)
> [ 115.115201] r10:00000000 r9:00000000 r8:00000000 r7:00000006 r6:e71ede38 r5:00000000
> [ 115.123032] r4:ffffffff
> [ 115.125570] [<c0b47d44>] (switchdev_handle_port_obj_add) from [<c09f1c58>] (dsa_slave_switchdev_blocking_event+0x50/0xb0)
> [ 115.136535] [<c09f1c08>] (dsa_slave_switchdev_blocking_event) from [<c014d4f8>] (notifier_call_chain+0x58/0x94)
> [ 115.146632] [<c014d4a0>] (notifier_call_chain) from [<c014dd70>] (blocking_notifier_call_chain+0x54/0x6c)
> [ 115.156206] r9:00000000 r8:e6932dd0 r7:e71fd800 r6:e71ede38 r5:c11bc820 r4:00000006
> [ 115.163956] [<c014dd1c>] (blocking_notifier_call_chain) from [<c0b479ec>] (switchdev_port_obj_notify+0x54/0xb8)
> [ 115.174049] r6:00000000 r5:1021bd52 r4:c1104c48
> [ 115.178670] [<c0b47998>] (switchdev_port_obj_notify) from [<c0b47af4>] (switchdev_port_obj_add_now+0xa4/0x118)
> [ 115.188675] r5:e71ede73 r4:c1104c48
> [ 115.192254] [<c0b47a50>] (switchdev_port_obj_add_now) from [<c0b47b8c>] (switchdev_port_obj_add_deferred+0x24/0x70)
> [ 115.202698] r9:c11c50f0 r8:00000000 r7:00000100 r6:e71fd800 r5:e6932dd0 r4:e6932dc0
> [ 115.210450] [<c0b47b68>] (switchdev_port_obj_add_deferred) from [<c0b477e0>] (switchdev_deferred_process+0x84/0x118)
> [ 115.220978] r7:00000100 r6:c12332ac r5:c11bc818 r4:e6932dc0
> [ 115.226643] [<c0b4775c>] (switchdev_deferred_process) from [<c0b47890>] (switchdev_deferred_process_work+0x1c/0x24)
> [ 115.237085] r7:ead92200 r6:ead8f100 r5:e909f380 r4:c11bc83c
> [ 115.242751] [<c0b47874>] (switchdev_deferred_process_work) from [<c0144dac>] (process_one_work+0x1ac/0x4bc)
> [ 115.252499] [<c0144c00>] (process_one_work) from [<c0145b8c>] (worker_thread+0x5c/0x580)
> [ 115.260597] r10:c1103d00 r9:00000008 r8:ffffe000 r7:ead8f118 r6:e909f394 r5:ead8f100
> [ 115.268427] r4:e909f380
> [ 115.270965] [<c0145b30>] (worker_thread) from [<c014ba18>] (kthread+0x168/0x170)
> [ 115.278368] r10:ea13fe74 r9:c0145b30 r8:e909f380 r7:e71ec000 r6:00000000 r5:e910bf00
> [ 115.286199] r4:e910bf40
> [ 115.288737] [<c014b8b0>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
> [ 115.295961] Exception stack(0xe71edfb0 to 0xe71edff8)
> [ 115.301014] dfa0: 00000000 00000000 00000000 00000000
> [ 115.309197] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [ 115.317379] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [ 115.323997] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c014b8b0
> [ 115.331827] r4:e910bf00
> [ 115.334363] Code: bad PC value
> [ 115.337583] ---[ end trace 3bdbb989816b27f4 ]---
>
> regards Frank
>
^ permalink raw reply
* [patch net-next v3 0/3] net: devlink: Finish network namespace support
From: Jiri Pirko @ 2019-08-12 13:47 UTC (permalink / raw)
To: netdev; +Cc: davem, jakub.kicinski, stephen, dsahern, mlxsw
From: Jiri Pirko <jiri@mellanox.com>
Devlink from the beginning counts with network namespaces, but the
instances has been fixed to init_net. The first patch allows user
to move existing devlink instances into namespaces:
$ devlink dev
netdevsim/netdevsim1
$ ip netns add ns1
$ devlink dev set netdevsim/netdevsim1 netns ns1
$ devlink -N ns1 dev
netdevsim/netdevsim1
The last patch allows user to create new netdevsim instance directly
inside network namespace of a caller.
Jiri Pirko (3):
net: devlink: allow to change namespaces
net: devlink: export devlink net set/get helpers
netdevsim: create devlink and netdev instances in namespace
drivers/net/netdevsim/bus.c | 1 +
drivers/net/netdevsim/dev.c | 17 ++-
drivers/net/netdevsim/netdev.c | 4 +-
drivers/net/netdevsim/netdevsim.h | 8 +-
include/net/devlink.h | 3 +
include/uapi/linux/devlink.h | 4 +
net/core/devlink.c | 182 +++++++++++++++++++++++++++++-
7 files changed, 205 insertions(+), 14 deletions(-)
--
2.21.0
^ permalink raw reply
* [patch net-next v3 1/3] net: devlink: allow to change namespaces
From: Jiri Pirko @ 2019-08-12 13:47 UTC (permalink / raw)
To: netdev; +Cc: davem, jakub.kicinski, stephen, dsahern, mlxsw
In-Reply-To: <20190812134751.30838-1-jiri@resnulli.us>
From: Jiri Pirko <jiri@mellanox.com>
All devlink instances are created in init_net and stay there for a
lifetime. Allow user to be able to move devlink instances into
namespaces.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v2->v3:
- added notifications for all objects
v1->v2:
- change the check for multiple attributes
- add warnon in case there is no attribute passed
---
include/uapi/linux/devlink.h | 4 +
net/core/devlink.c | 166 ++++++++++++++++++++++++++++++++++-
2 files changed, 167 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index ffc993256527..95f0a1edab99 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -348,6 +348,10 @@ enum devlink_attr {
DEVLINK_ATTR_PORT_PCI_PF_NUMBER, /* u16 */
DEVLINK_ATTR_PORT_PCI_VF_NUMBER, /* u16 */
+ DEVLINK_ATTR_NETNS_FD, /* u32 */
+ DEVLINK_ATTR_NETNS_PID, /* u32 */
+ DEVLINK_ATTR_NETNS_ID, /* u32 */
+
/* add new attributes above here, update the policy in devlink.c */
__DEVLINK_ATTR_MAX,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index e3a1ae44f93d..6f8c1b2cdfb2 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -430,8 +430,16 @@ static void devlink_nl_post_doit(const struct genl_ops *ops,
{
struct devlink *devlink;
- devlink = devlink_get_from_info(info);
- if (~ops->internal_flags & DEVLINK_NL_FLAG_NO_LOCK)
+ /* When devlink changes netns, it would not be found
+ * by devlink_get_from_info(). So try if it is stored first.
+ */
+ if (ops->internal_flags & DEVLINK_NL_FLAG_NEED_DEVLINK) {
+ devlink = info->user_ptr[0];
+ } else {
+ devlink = devlink_get_from_info(info);
+ WARN_ON(IS_ERR(devlink));
+ }
+ if (!IS_ERR(devlink) && ~ops->internal_flags & DEVLINK_NL_FLAG_NO_LOCK)
mutex_unlock(&devlink->lock);
mutex_unlock(&devlink_mutex);
}
@@ -636,6 +644,74 @@ static int devlink_nl_cmd_get_doit(struct sk_buff *skb, struct genl_info *info)
return genlmsg_reply(msg, info);
}
+static struct net *devlink_netns_get(struct sk_buff *skb,
+ struct devlink *devlink,
+ struct genl_info *info)
+{
+ struct nlattr *netns_pid_attr = info->attrs[DEVLINK_ATTR_NETNS_PID];
+ struct nlattr *netns_fd_attr = info->attrs[DEVLINK_ATTR_NETNS_FD];
+ struct nlattr *netns_id_attr = info->attrs[DEVLINK_ATTR_NETNS_ID];
+ struct net *net;
+
+ if (!!netns_pid_attr + !!netns_fd_attr + !!netns_id_attr > 1) {
+ NL_SET_ERR_MSG(info->extack, "multiple netns identifying attributes specified");
+ return ERR_PTR(-EINVAL);
+ }
+
+ if (netns_pid_attr) {
+ net = get_net_ns_by_pid(nla_get_u32(netns_pid_attr));
+ } else if (netns_fd_attr) {
+ net = get_net_ns_by_fd(nla_get_u32(netns_fd_attr));
+ } else if (netns_id_attr) {
+ net = get_net_ns_by_id(sock_net(skb->sk),
+ nla_get_u32(netns_id_attr));
+ if (!net)
+ net = ERR_PTR(-EINVAL);
+ } else {
+ WARN_ON(1);
+ net = ERR_PTR(-EINVAL);
+ }
+ if (IS_ERR(net)) {
+ NL_SET_ERR_MSG(info->extack, "Unknown network namespace");
+ return ERR_PTR(-EINVAL);
+ }
+ if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) {
+ put_net(net);
+ return ERR_PTR(-EPERM);
+ }
+ return net;
+}
+
+static void devlink_all_add_notify(struct devlink *devlink);
+static void devlink_all_del_notify(struct devlink *devlink);
+
+static void devlink_netns_change(struct devlink *devlink, struct net *net)
+{
+ if (net_eq(devlink_net(devlink), net))
+ return;
+ devlink_all_del_notify(devlink);
+ devlink_net_set(devlink, net);
+ devlink_all_add_notify(devlink);
+}
+
+static int devlink_nl_cmd_set_doit(struct sk_buff *skb, struct genl_info *info)
+{
+ struct devlink *devlink = info->user_ptr[0];
+
+ if (info->attrs[DEVLINK_ATTR_NETNS_PID] ||
+ info->attrs[DEVLINK_ATTR_NETNS_FD] ||
+ info->attrs[DEVLINK_ATTR_NETNS_ID]) {
+ struct net *net;
+
+ net = devlink_netns_get(skb, devlink, info);
+ if (IS_ERR(net))
+ return PTR_ERR(net);
+ devlink_netns_change(devlink, net);
+ put_net(net);
+ }
+ return 0;
+}
+
static int devlink_nl_cmd_get_dumpit(struct sk_buff *msg,
struct netlink_callback *cb)
{
@@ -5184,6 +5260,9 @@ static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER] = { .type = NLA_U8 },
[DEVLINK_ATTR_FLASH_UPDATE_FILE_NAME] = { .type = NLA_NUL_STRING },
[DEVLINK_ATTR_FLASH_UPDATE_COMPONENT] = { .type = NLA_NUL_STRING },
+ [DEVLINK_ATTR_NETNS_PID] = { .type = NLA_U32 },
+ [DEVLINK_ATTR_NETNS_FD] = { .type = NLA_U32 },
+ [DEVLINK_ATTR_NETNS_ID] = { .type = NLA_U32 },
};
static const struct genl_ops devlink_nl_ops[] = {
@@ -5195,6 +5274,13 @@ static const struct genl_ops devlink_nl_ops[] = {
.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
/* can be retrieved by unprivileged users */
},
+ {
+ .cmd = DEVLINK_CMD_SET,
+ .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+ .doit = devlink_nl_cmd_set_doit,
+ .flags = GENL_ADMIN_PERM,
+ .internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
+ },
{
.cmd = DEVLINK_CMD_PORT_GET,
.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
@@ -6834,6 +6920,56 @@ int devlink_region_snapshot_create(struct devlink_region *region,
}
EXPORT_SYMBOL_GPL(devlink_region_snapshot_create);
+static void devlink_all_del_notify(struct devlink *devlink)
+{
+ struct devlink_port *devlink_port;
+ struct devlink_region *region;
+ struct devlink_param_item *param_item;
+ struct devlink_snapshot *snapshot;
+
+ list_for_each_entry(region, &devlink->region_list, list) {
+ list_for_each_entry(snapshot, ®ion->snapshot_list, list)
+ devlink_nl_region_notify(region, snapshot,
+ DEVLINK_CMD_REGION_DEL);
+ devlink_nl_region_notify(region, NULL, DEVLINK_CMD_REGION_DEL);
+ }
+ list_for_each_entry(devlink_port, &devlink->port_list, list) {
+ list_for_each_entry(param_item, &devlink_port->param_list, list)
+ devlink_param_notify(devlink, devlink_port->index,
+ param_item, DEVLINK_CMD_PARAM_DEL);
+ devlink_port_notify(devlink_port, DEVLINK_CMD_PORT_DEL);
+ }
+ list_for_each_entry(param_item, &devlink->param_list, list)
+ devlink_param_notify(devlink, 0, param_item,
+ DEVLINK_CMD_PARAM_DEL);
+ devlink_notify(devlink, DEVLINK_CMD_DEL);
+}
+
+static void devlink_all_add_notify(struct devlink *devlink)
+{
+ struct devlink_port *devlink_port;
+ struct devlink_region *region;
+ struct devlink_param_item *param_item;
+ struct devlink_snapshot *snapshot;
+
+ devlink_notify(devlink, DEVLINK_CMD_NEW);
+ list_for_each_entry(param_item, &devlink->param_list, list)
+ devlink_param_notify(devlink, 0, param_item,
+ DEVLINK_CMD_PARAM_NEW);
+ list_for_each_entry(devlink_port, &devlink->port_list, list) {
+ devlink_port_notify(devlink_port, DEVLINK_CMD_PORT_NEW);
+ list_for_each_entry(param_item, &devlink_port->param_list, list)
+ devlink_param_notify(devlink, devlink_port->index,
+ param_item, DEVLINK_CMD_PARAM_NEW);
+ }
+ list_for_each_entry(region, &devlink->region_list, list) {
+ devlink_nl_region_notify(region, NULL, DEVLINK_CMD_REGION_NEW);
+ list_for_each_entry(snapshot, ®ion->snapshot_list, list)
+ devlink_nl_region_notify(region, snapshot,
+ DEVLINK_CMD_REGION_NEW);
+ }
+}
+
static void __devlink_compat_running_version(struct devlink *devlink,
char *buf, size_t len)
{
@@ -6953,9 +7089,33 @@ int devlink_compat_switch_id_get(struct net_device *dev,
return 0;
}
+static void __net_exit devlink_pernet_exit(struct net *net)
+{
+ struct devlink *devlink;
+
+ mutex_lock(&devlink_mutex);
+ list_for_each_entry(devlink, &devlink_list, list)
+ if (net_eq(devlink_net(devlink), net))
+ devlink_netns_change(devlink, &init_net);
+ mutex_unlock(&devlink_mutex);
+}
+
+static struct pernet_operations __net_initdata devlink_pernet_ops = {
+ .exit = devlink_pernet_exit,
+};
+
static int __init devlink_init(void)
{
- return genl_register_family(&devlink_nl_family);
+ int err;
+
+ err = genl_register_family(&devlink_nl_family);
+ if (err)
+ goto out;
+ err = register_pernet_device(&devlink_pernet_ops);
+
+out:
+ WARN_ON(err);
+ return err;
}
subsys_initcall(devlink_init);
--
2.21.0
^ permalink raw reply related
* [patch net-next v3 3/3] netdevsim: create devlink and netdev instances in namespace
From: Jiri Pirko @ 2019-08-12 13:47 UTC (permalink / raw)
To: netdev; +Cc: davem, jakub.kicinski, stephen, dsahern, mlxsw
In-Reply-To: <20190812134751.30838-1-jiri@resnulli.us>
From: Jiri Pirko <jiri@mellanox.com>
When user does create new netdevsim instance using sysfs bus file,
create the devlink instance and related netdev instance in the namespace
of the caller.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
v1->v2:
- remove net_namespace.h include and forward decralared net struct
- add comment to initial_net pointer
---
drivers/net/netdevsim/bus.c | 1 +
drivers/net/netdevsim/dev.c | 17 +++++++++++------
drivers/net/netdevsim/netdev.c | 4 +++-
drivers/net/netdevsim/netdevsim.h | 8 +++++++-
4 files changed, 22 insertions(+), 8 deletions(-)
diff --git a/drivers/net/netdevsim/bus.c b/drivers/net/netdevsim/bus.c
index 1a0ff3d7747b..6aeed0c600f8 100644
--- a/drivers/net/netdevsim/bus.c
+++ b/drivers/net/netdevsim/bus.c
@@ -283,6 +283,7 @@ nsim_bus_dev_new(unsigned int id, unsigned int port_count)
nsim_bus_dev->dev.bus = &nsim_bus;
nsim_bus_dev->dev.type = &nsim_bus_dev_type;
nsim_bus_dev->port_count = port_count;
+ nsim_bus_dev->initial_net = current->nsproxy->net_ns;
err = device_register(&nsim_bus_dev->dev);
if (err)
diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index e76ea6a3cb60..8485dd805f7c 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -381,7 +381,8 @@ static const struct devlink_ops nsim_dev_devlink_ops = {
#define NSIM_DEV_TEST1_DEFAULT true
static struct nsim_dev *
-nsim_dev_create(struct nsim_bus_dev *nsim_bus_dev, unsigned int port_count)
+nsim_dev_create(struct net *net, struct nsim_bus_dev *nsim_bus_dev,
+ unsigned int port_count)
{
struct nsim_dev *nsim_dev;
struct devlink *devlink;
@@ -390,6 +391,7 @@ nsim_dev_create(struct nsim_bus_dev *nsim_bus_dev, unsigned int port_count)
devlink = devlink_alloc(&nsim_dev_devlink_ops, sizeof(*nsim_dev));
if (!devlink)
return ERR_PTR(-ENOMEM);
+ devlink_net_set(devlink, net);
nsim_dev = devlink_priv(devlink);
nsim_dev->nsim_bus_dev = nsim_bus_dev;
nsim_dev->switch_id.id_len = sizeof(nsim_dev->switch_id.id);
@@ -469,7 +471,7 @@ static void nsim_dev_destroy(struct nsim_dev *nsim_dev)
devlink_free(devlink);
}
-static int __nsim_dev_port_add(struct nsim_dev *nsim_dev,
+static int __nsim_dev_port_add(struct net *net, struct nsim_dev *nsim_dev,
unsigned int port_index)
{
struct nsim_dev_port *nsim_dev_port;
@@ -495,7 +497,7 @@ static int __nsim_dev_port_add(struct nsim_dev *nsim_dev,
if (err)
goto err_dl_port_unregister;
- nsim_dev_port->ns = nsim_create(nsim_dev, nsim_dev_port);
+ nsim_dev_port->ns = nsim_create(net, nsim_dev, nsim_dev_port);
if (IS_ERR(nsim_dev_port->ns)) {
err = PTR_ERR(nsim_dev_port->ns);
goto err_port_debugfs_exit;
@@ -538,17 +540,19 @@ static void nsim_dev_port_del_all(struct nsim_dev *nsim_dev)
int nsim_dev_probe(struct nsim_bus_dev *nsim_bus_dev)
{
+ struct net *initial_net = nsim_bus_dev->initial_net;
struct nsim_dev *nsim_dev;
int i;
int err;
- nsim_dev = nsim_dev_create(nsim_bus_dev, nsim_bus_dev->port_count);
+ nsim_dev = nsim_dev_create(initial_net, nsim_bus_dev,
+ nsim_bus_dev->port_count);
if (IS_ERR(nsim_dev))
return PTR_ERR(nsim_dev);
dev_set_drvdata(&nsim_bus_dev->dev, nsim_dev);
for (i = 0; i < nsim_bus_dev->port_count; i++) {
- err = __nsim_dev_port_add(nsim_dev, i);
+ err = __nsim_dev_port_add(initial_net, nsim_dev, i);
if (err)
goto err_port_del_all;
}
@@ -583,13 +587,14 @@ int nsim_dev_port_add(struct nsim_bus_dev *nsim_bus_dev,
unsigned int port_index)
{
struct nsim_dev *nsim_dev = dev_get_drvdata(&nsim_bus_dev->dev);
+ struct net *net = devlink_net(priv_to_devlink(nsim_dev));
int err;
mutex_lock(&nsim_dev->port_list_lock);
if (__nsim_dev_port_lookup(nsim_dev, port_index))
err = -EEXIST;
else
- err = __nsim_dev_port_add(nsim_dev, port_index);
+ err = __nsim_dev_port_add(net, nsim_dev, port_index);
mutex_unlock(&nsim_dev->port_list_lock);
return err;
}
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
index 0740940f41b1..25c7de7a4a31 100644
--- a/drivers/net/netdevsim/netdev.c
+++ b/drivers/net/netdevsim/netdev.c
@@ -280,7 +280,8 @@ static void nsim_setup(struct net_device *dev)
}
struct netdevsim *
-nsim_create(struct nsim_dev *nsim_dev, struct nsim_dev_port *nsim_dev_port)
+nsim_create(struct net *net, struct nsim_dev *nsim_dev,
+ struct nsim_dev_port *nsim_dev_port)
{
struct net_device *dev;
struct netdevsim *ns;
@@ -290,6 +291,7 @@ nsim_create(struct nsim_dev *nsim_dev, struct nsim_dev_port *nsim_dev_port)
if (!dev)
return ERR_PTR(-ENOMEM);
+ dev_net_set(dev, net);
ns = netdev_priv(dev);
ns->netdev = dev;
ns->nsim_dev = nsim_dev;
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index 4c758c6919f5..521802d429a0 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -74,8 +74,11 @@ struct netdevsim {
struct nsim_ipsec ipsec;
};
+struct net;
+
struct netdevsim *
-nsim_create(struct nsim_dev *nsim_dev, struct nsim_dev_port *nsim_dev_port);
+nsim_create(struct net *net, struct nsim_dev *nsim_dev,
+ struct nsim_dev_port *nsim_dev_port);
void nsim_destroy(struct netdevsim *ns);
#ifdef CONFIG_BPF_SYSCALL
@@ -216,6 +219,9 @@ struct nsim_bus_dev {
struct device dev;
struct list_head list;
unsigned int port_count;
+ struct net *initial_net; /* Purpose of this is to carry net pointer
+ * during the probe time only.
+ */
unsigned int num_vfs;
struct nsim_vf_config *vfconfigs;
};
--
2.21.0
^ permalink raw reply related
* [patch net-next v3 2/3] net: devlink: export devlink net set/get helpers
From: Jiri Pirko @ 2019-08-12 13:47 UTC (permalink / raw)
To: netdev; +Cc: davem, jakub.kicinski, stephen, dsahern, mlxsw
In-Reply-To: <20190812134751.30838-1-jiri@resnulli.us>
From: Jiri Pirko <jiri@mellanox.com>
Allow drivers to set/get net struct for devlink instance. Set is only
allowed for newly allocated devlink instance.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
include/net/devlink.h | 3 +++
net/core/devlink.c | 18 ++++++++++++++----
2 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 451268f64880..c45b10d79b14 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -35,6 +35,7 @@ struct devlink {
struct device *dev;
possible_net_t _net;
struct mutex lock;
+ bool registered;
char priv[0] __aligned(NETDEV_ALIGN);
};
@@ -591,6 +592,8 @@ static inline struct devlink *netdev_to_devlink(struct net_device *dev)
struct ib_device;
+struct net *devlink_net(const struct devlink *devlink);
+void devlink_net_set(struct devlink *devlink, struct net *net);
struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size);
int devlink_register(struct devlink *devlink, struct device *dev);
void devlink_unregister(struct devlink *devlink);
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 6f8c1b2cdfb2..80a4b3ae9d39 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -92,16 +92,25 @@ static LIST_HEAD(devlink_list);
*/
static DEFINE_MUTEX(devlink_mutex);
-static struct net *devlink_net(const struct devlink *devlink)
+struct net *devlink_net(const struct devlink *devlink)
{
return read_pnet(&devlink->_net);
}
+EXPORT_SYMBOL_GPL(devlink_net);
-static void devlink_net_set(struct devlink *devlink, struct net *net)
+static void __devlink_net_set(struct devlink *devlink, struct net *net)
{
write_pnet(&devlink->_net, net);
}
+void devlink_net_set(struct devlink *devlink, struct net *net)
+{
+ if (WARN_ON(devlink->registered))
+ return;
+ __devlink_net_set(devlink, net);
+}
+EXPORT_SYMBOL_GPL(devlink_net_set);
+
static struct devlink *devlink_get_from_attrs(struct net *net,
struct nlattr **attrs)
{
@@ -690,7 +699,7 @@ static void devlink_netns_change(struct devlink *devlink, struct net *net)
if (net_eq(devlink_net(devlink), net))
return;
devlink_all_del_notify(devlink);
- devlink_net_set(devlink, net);
+ __devlink_net_set(devlink, net);
devlink_all_add_notify(devlink);
}
@@ -5606,7 +5615,7 @@ struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size)
if (!devlink)
return NULL;
devlink->ops = ops;
- devlink_net_set(devlink, &init_net);
+ __devlink_net_set(devlink, &init_net);
INIT_LIST_HEAD(&devlink->port_list);
INIT_LIST_HEAD(&devlink->sb_list);
INIT_LIST_HEAD_RCU(&devlink->dpipe_table_list);
@@ -5630,6 +5639,7 @@ int devlink_register(struct devlink *devlink, struct device *dev)
{
mutex_lock(&devlink_mutex);
devlink->dev = dev;
+ devlink->registered = true;
list_add_tail(&devlink->list, &devlink_list);
devlink_notify(devlink, DEVLINK_CMD_NEW);
mutex_unlock(&devlink_mutex);
--
2.21.0
^ permalink raw reply related
* [patch iproute2-next v3 1/2] devlink: introduce cmdline option to switch to a different namespace
From: Jiri Pirko @ 2019-08-12 13:51 UTC (permalink / raw)
To: netdev; +Cc: davem, jakub.kicinski, stephen, dsahern, mlxsw
In-Reply-To: <20190812134751.30838-1-jiri@resnulli.us>
From: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
devlink/devlink.c | 12 ++++++++++--
man/man8/devlink.8 | 4 ++++
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/devlink/devlink.c b/devlink/devlink.c
index 91c85dc1de73..6bda25e92238 100644
--- a/devlink/devlink.c
+++ b/devlink/devlink.c
@@ -31,6 +31,7 @@
#include "mnlg.h"
#include "json_writer.h"
#include "utils.h"
+#include "namespace.h"
#define ESWITCH_MODE_LEGACY "legacy"
#define ESWITCH_MODE_SWITCHDEV "switchdev"
@@ -6333,7 +6334,7 @@ static int cmd_health(struct dl *dl)
static void help(void)
{
pr_err("Usage: devlink [ OPTIONS ] OBJECT { COMMAND | help }\n"
- " devlink [ -f[orce] ] -b[atch] filename\n"
+ " devlink [ -f[orce] ] -b[atch] filename -N[etns] netnsname\n"
"where OBJECT := { dev | port | sb | monitor | dpipe | resource | region | health }\n"
" OPTIONS := { -V[ersion] | -n[o-nice-names] | -j[son] | -p[retty] | -v[erbose] }\n");
}
@@ -6479,6 +6480,7 @@ int main(int argc, char **argv)
{ "json", no_argument, NULL, 'j' },
{ "pretty", no_argument, NULL, 'p' },
{ "verbose", no_argument, NULL, 'v' },
+ { "Netns", required_argument, NULL, 'N' },
{ NULL, 0, NULL, 0 }
};
const char *batch_file = NULL;
@@ -6494,7 +6496,7 @@ int main(int argc, char **argv)
return EXIT_FAILURE;
}
- while ((opt = getopt_long(argc, argv, "Vfb:njpv",
+ while ((opt = getopt_long(argc, argv, "Vfb:njpvN:",
long_options, NULL)) >= 0) {
switch (opt) {
@@ -6520,6 +6522,12 @@ int main(int argc, char **argv)
case 'v':
dl->verbose = true;
break;
+ case 'N':
+ if (netns_switch(optarg)) {
+ ret = EXIT_FAILURE;
+ goto dl_free;
+ }
+ break;
default:
pr_err("Unknown option.\n");
help();
diff --git a/man/man8/devlink.8 b/man/man8/devlink.8
index 13d4dcd908b3..9fc9b034eefe 100644
--- a/man/man8/devlink.8
+++ b/man/man8/devlink.8
@@ -51,6 +51,10 @@ When combined with -j generate a pretty JSON output.
.BR "\-v" , " --verbose"
Turn on verbose output.
+.TP
+.BR "\-N", " \-Netns " <NETNSNAME>
+Switches to the specified network namespace.
+
.SS
.I OBJECT
--
2.21.0
^ permalink raw reply related
* [patch iproute2-next v3 2/2] devlink: add support for network namespace change
From: Jiri Pirko @ 2019-08-12 13:51 UTC (permalink / raw)
To: netdev; +Cc: davem, jakub.kicinski, stephen, dsahern, mlxsw
In-Reply-To: <20190812134751.30838-1-jiri@resnulli.us>
From: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
devlink/devlink.c | 54 +++++++++++++++++++++++++++++++++++-
include/uapi/linux/devlink.h | 4 +++
man/man8/devlink-dev.8 | 12 ++++++++
3 files changed, 69 insertions(+), 1 deletion(-)
diff --git a/devlink/devlink.c b/devlink/devlink.c
index 6bda25e92238..42aebb38e67c 100644
--- a/devlink/devlink.c
+++ b/devlink/devlink.c
@@ -234,6 +234,7 @@ static void ifname_map_free(struct ifname_map *ifname_map)
#define DL_OPT_HEALTH_REPORTER_NAME BIT(27)
#define DL_OPT_HEALTH_REPORTER_GRACEFUL_PERIOD BIT(27)
#define DL_OPT_HEALTH_REPORTER_AUTO_RECOVER BIT(28)
+#define DL_OPT_NETNS BIT(29)
struct dl_opts {
uint32_t present; /* flags of present items */
@@ -270,6 +271,8 @@ struct dl_opts {
const char *reporter_name;
uint64_t reporter_graceful_period;
bool reporter_auto_recover;
+ bool netns_is_pid;
+ uint32_t netns;
};
struct dl {
@@ -1330,6 +1333,22 @@ static int dl_argv_parse(struct dl *dl, uint32_t o_required,
if (err)
return err;
o_found |= DL_OPT_HEALTH_REPORTER_AUTO_RECOVER;
+ } else if (dl_argv_match(dl, "netns") &&
+ (o_all & DL_OPT_NETNS)) {
+ const char *netns_str;
+
+ dl_arg_inc(dl);
+ err = dl_argv_str(dl, &netns_str);
+ if (err)
+ return err;
+ opts->netns = netns_get_fd(netns_str);
+ if (opts->netns < 0) {
+ err = dl_argv_uint32_t(dl, &opts->netns);
+ if (err)
+ return err;
+ opts->netns_is_pid = true;
+ }
+ o_found |= DL_OPT_NETNS;
} else {
pr_err("Unknown option \"%s\"\n", dl_argv(dl));
return -EINVAL;
@@ -1443,7 +1462,11 @@ static void dl_opts_put(struct nlmsghdr *nlh, struct dl *dl)
if (opts->present & DL_OPT_HEALTH_REPORTER_AUTO_RECOVER)
mnl_attr_put_u8(nlh, DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER,
opts->reporter_auto_recover);
-
+ if (opts->present & DL_OPT_NETNS)
+ mnl_attr_put_u32(nlh,
+ opts->netns_is_pid ? DEVLINK_ATTR_NETNS_PID :
+ DEVLINK_ATTR_NETNS_FD,
+ opts->netns);
}
static int dl_argv_parse_put(struct nlmsghdr *nlh, struct dl *dl,
@@ -1498,6 +1521,7 @@ static bool dl_dump_filter(struct dl *dl, struct nlattr **tb)
static void cmd_dev_help(void)
{
pr_err("Usage: devlink dev show [ DEV ]\n");
+ pr_err(" devlink dev set DEV netns { PID | NAME | ID }\n");
pr_err(" devlink dev eswitch set DEV [ mode { legacy | switchdev } ]\n");
pr_err(" [ inline-mode { none | link | network | transport } ]\n");
pr_err(" [ encap { disable | enable } ]\n");
@@ -2547,6 +2571,31 @@ static int cmd_dev_show(struct dl *dl)
return err;
}
+static void cmd_dev_set_help(void)
+{
+ pr_err("Usage: devlink dev set DEV netns { PID | NAME | ID }\n");
+}
+
+static int cmd_dev_set(struct dl *dl)
+{
+ struct nlmsghdr *nlh;
+ int err;
+
+ if (dl_argv_match(dl, "help") || dl_no_arg(dl)) {
+ cmd_dev_set_help();
+ return 0;
+ }
+
+ nlh = mnlg_msg_prepare(dl->nlg, DEVLINK_CMD_SET,
+ NLM_F_REQUEST | NLM_F_ACK);
+
+ err = dl_argv_parse_put(nlh, dl, DL_OPT_HANDLE, DL_OPT_NETNS);
+ if (err)
+ return err;
+
+ return _mnlg_socket_sndrcv(dl->nlg, nlh, NULL, NULL);
+}
+
static void cmd_dev_reload_help(void)
{
pr_err("Usage: devlink dev reload [ DEV ]\n");
@@ -2743,6 +2792,9 @@ static int cmd_dev(struct dl *dl)
dl_argv_match(dl, "list") || dl_no_arg(dl)) {
dl_arg_inc(dl);
return cmd_dev_show(dl);
+ } else if (dl_argv_match(dl, "set")) {
+ dl_arg_inc(dl);
+ return cmd_dev_set(dl);
} else if (dl_argv_match(dl, "eswitch")) {
dl_arg_inc(dl);
return cmd_dev_eswitch(dl);
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index fc195cbd66f4..bc1869993e20 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -348,6 +348,10 @@ enum devlink_attr {
DEVLINK_ATTR_PORT_PCI_PF_NUMBER, /* u16 */
DEVLINK_ATTR_PORT_PCI_VF_NUMBER, /* u16 */
+ DEVLINK_ATTR_NETNS_FD, /* u32 */
+ DEVLINK_ATTR_NETNS_PID, /* u32 */
+ DEVLINK_ATTR_NETNS_ID, /* u32 */
+
/* add new attributes above here, update the policy in devlink.c */
__DEVLINK_ATTR_MAX,
diff --git a/man/man8/devlink-dev.8 b/man/man8/devlink-dev.8
index 1804463b2321..0e1a5523fa7b 100644
--- a/man/man8/devlink-dev.8
+++ b/man/man8/devlink-dev.8
@@ -25,6 +25,13 @@ devlink-dev \- devlink device configuration
.ti -8
.B devlink dev help
+.ti -8
+.BR "devlink dev set"
+.IR DEV
+.RI "[ "
+.BI "netns { " PID " | " NAME " | " ID " }
+.RI "]"
+
.ti -8
.BR "devlink dev eswitch set"
.IR DEV
@@ -92,6 +99,11 @@ Format is:
.in +2
BUS_NAME/BUS_ADDRESS
+.SS devlink dev set - sets devlink device attributes
+
+.TP
+.BI "netns { " PID " | " NAME " | " ID " }
+
.SS devlink dev eswitch show - display devlink device eswitch attributes
.SS devlink dev eswitch set - sets devlink device eswitch attributes
--
2.21.0
^ permalink raw reply related
* Re: WARNING in aa_sock_msg_perm
From: Tetsuo Handa @ 2019-08-12 13:53 UTC (permalink / raw)
To: syzbot, linux-kernel, netdev, syzkaller-bugs, David Howells; +Cc: linux-afs
In-Reply-To: <00000000000021eea2058feaaf82@google.com>
On 2019/08/12 21:30, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: fcc32a21 liquidio: Use pcie_flr() instead of reimplementin..
> git tree: net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=11233726600000
> kernel config: https://syzkaller.appspot.com/x/.config?x=d4cf1ffb87d590d7
> dashboard link: https://syzkaller.appspot.com/bug?extid=cda1ac91660a61b51495
> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> ------------[ cut here ]------------
> AppArmor WARN aa_sock_msg_perm: ((!sock)):
This is not AppArmor's bug. LSM modules expect that "struct socket" is not NULL.
For some reason, peer->local->socket became NULL. Thus, suspecting rxrpc's bug.
> rxrpc_send_keepalive+0x1ff/0x940 net/rxrpc/output.c:656
^ permalink raw reply
* Re: [PATCH] dpaa2-ethsw: move the DPAA2 Ethernet Switch driver out of staging
From: Andrew Lunn @ 2019-08-12 13:57 UTC (permalink / raw)
To: Ioana Ciornei
Cc: davem@davemloft.net, gregkh@linuxfoundation.org,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
f.fainelli@gmail.com, Ioana Ciocoi Radulescu
In-Reply-To: <VI1PR0402MB2800FF2E5C4DE24B25E7D843E0D10@VI1PR0402MB2800.eurprd04.prod.outlook.com>
> In the DPAA2 architecture MACs are not the only entities that can be
> connected to a switch port.
> Below is an exemple of a 4 port DPAA2 switch which is configured to
> interconnect 2 DPNIs (network interfaces) and 2 DPMACs.
>
>
> [ethA] [ethB] [ethC] [ethD] [ethE] [ethF]
> : : : : : :
> : : : : : :
> [eth drv] [eth drv] [ ethsw drv ]
> : : : : : : kernel
> ========================================================================
> : : : : : :
> hardware
> [DPNI] [DPNI] [============= DPSW =================]
> | | | | | |
> | ---------- | [DPMAC] [DPMAC]
> ------------------------------- | |
> | |
> [PHY] [PHY]
>
> You can see it as a hardware-accelerated software bridge where
> forwarding rules are managed from the host software partition.
Hi Ioana
What are the use cases for this?
Configuration is rather unintuitive. To bridge etha and ethb you need
to
ip link add name br0 type bridge
ip link set ethc master br0
ip link set ethd master br0
And once you make ethc and ethd actually send/receive frames, etha and
ethc become equivalent.
If this was a PCI device, i could imagine passing etha into a VM as a
PCI VF. But i don't think it is PCI?
I'm not sure moving etha into a different name space makes much sense
either. My guess would be, a veth pair with one end connected to the
software bridge would be more efficient than DMAing the packet out and
then back in again.
Thanks
Andrew
^ permalink raw reply
* Re: [PATCH] dpaa2-ethsw: move the DPAA2 Ethernet Switch driver out of staging
From: Andrew Lunn @ 2019-08-12 14:03 UTC (permalink / raw)
To: Ioana Ciornei
Cc: davem@davemloft.net, gregkh@linuxfoundation.org,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
f.fainelli@gmail.com, Ioana Ciocoi Radulescu
In-Reply-To: <VI1PR0402MB2800292DCA9CC91085E5033FE0D00@VI1PR0402MB2800.eurprd04.prod.outlook.com>
> >> Yes, we only support a single bridge.
> >
> > That is a pretty severe restriction for a device of this class. Some
> > of the very simple switches DSA support have a similar restriction,
> > but in general, most do support multiple bridges.
> >
>
> Let me make a distinction here: we do no support multiple bridges on the
> same DPSW object but we do support multiple DPSW objects, each with its
> bridge.
>
>
> > Are there any plans to fix this?
> >
>
> We had some internal discussions on this, the hardware could support
> this kind of further partitioning the switch object but, at the moment,
> the firmware doesn't.
I assume the firmware allows you to create switch objects on the fly?
I think this was discussed a long time ago, but why not create a new
switch object when you need it? That seems like the whole point of
this dynamic hardware design of dpaa2.
Andrew
^ permalink raw reply
* Re: [PATCH v4 04/14] net: phy: adin: add {write,read}_mmd hooks
From: Andrew Lunn @ 2019-08-12 14:06 UTC (permalink / raw)
To: Alexandru Ardelean
Cc: netdev, devicetree, linux-kernel, davem, robh+dt, mark.rutland,
f.fainelli, hkallweit1
In-Reply-To: <20190812112350.15242-5-alexandru.ardelean@analog.com>
On Mon, Aug 12, 2019 at 02:23:40PM +0300, Alexandru Ardelean wrote:
> Both ADIN1200 & ADIN1300 support Clause 45 access for some registers.
> The Extended Management Interface (EMI) registers are accessible via both
> Clause 45 (at register MDIO_MMD_VEND1) and using Clause 22.
>
> The Clause 22 access for MMD regs differs from the standard one defined by
> 802.3. The ADIN PHYs use registers ExtRegPtr (0x0010) and ExtRegData
> (0x0011) to access Clause 45 & EMI registers.
>
> The indirect access is done via the following mechanism (for both R/W):
> 1. Write the address of the register in the ExtRegPtr
> 2. Read/write the value of the register via reg ExtRegData
>
> This mechanism is needed to manage configuration of chip settings and to
> access EEE registers via Clause 22.
>
> Since Clause 45 access will likely never be used, it is not implemented via
> this hook.
>
> Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH v4 09/14] net: phy: adin: add EEE translation layer from Clause 45 to Clause 22
From: Andrew Lunn @ 2019-08-12 14:08 UTC (permalink / raw)
To: Alexandru Ardelean
Cc: netdev, devicetree, linux-kernel, davem, robh+dt, mark.rutland,
f.fainelli, hkallweit1
In-Reply-To: <20190812112350.15242-10-alexandru.ardelean@analog.com>
On Mon, Aug 12, 2019 at 02:23:45PM +0300, Alexandru Ardelean wrote:
> The ADIN1200 & ADIN1300 PHYs support EEE by using standard Clause 45 access
> to access MMD registers for EEE.
>
> The EEE register addresses (when using Clause 22) are available at
> different addresses (than Clause 45), and since accessing these regs (via
> Clause 22) needs a special mechanism, a translation table is required to
> convert these addresses.
>
> For Clause 45, this is not needed since the driver will likely never use
> this access mode.
>
> Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH net-next v7 5/6] flow_offload: support get multi-subsystem block
From: Vlad Buslov @ 2019-08-12 14:11 UTC (permalink / raw)
To: wenxu@ucloud.cn, Jakub Kicinski
Cc: David Miller, Jiri Pirko, pablo@netfilter.org,
netfilter-devel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <1565140434-8109-6-git-send-email-wenxu@ucloud.cn>
On Wed 07 Aug 2019 at 04:13, wenxu@ucloud.cn wrote:
> From: wenxu <wenxu@ucloud.cn>
>
> It provide a callback list to find the blocks of tc
> and nft subsystems
>
> Signed-off-by: wenxu <wenxu@ucloud.cn>
> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---
> v7: add a mutex lock for add/del flow_indr_block_ing_cb
>
> include/net/flow_offload.h | 10 ++++++++-
> net/core/flow_offload.c | 51 ++++++++++++++++++++++++++++++++++------------
> net/sched/cls_api.c | 9 +++++++-
> 3 files changed, 55 insertions(+), 15 deletions(-)
>
> diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
> index 46b8777..e8069b6 100644
> --- a/include/net/flow_offload.h
> +++ b/include/net/flow_offload.h
> @@ -379,6 +379,15 @@ typedef void flow_indr_block_ing_cmd_t(struct net_device *dev,
> void *cb_priv,
> enum flow_block_command command);
>
> +struct flow_indr_block_ing_entry {
> + flow_indr_block_ing_cmd_t *cb;
> + struct list_head list;
> +};
> +
> +void flow_indr_add_block_ing_cb(struct flow_indr_block_ing_entry *entry);
> +
> +void flow_indr_del_block_ing_cb(struct flow_indr_block_ing_entry *entry);
> +
> int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
> flow_indr_block_bind_cb_t *cb,
> void *cb_ident);
> @@ -395,7 +404,6 @@ void flow_indr_block_cb_unregister(struct net_device *dev,
> void *cb_ident);
>
> void flow_indr_block_call(struct net_device *dev,
> - flow_indr_block_ing_cmd_t *cb,
> struct flow_block_offload *bo,
> enum flow_block_command command);
>
> diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
> index 4cc18e4..64c3d4d 100644
> --- a/net/core/flow_offload.c
> +++ b/net/core/flow_offload.c
> @@ -3,6 +3,7 @@
> #include <linux/slab.h>
> #include <net/flow_offload.h>
> #include <linux/rtnetlink.h>
> +#include <linux/mutex.h>
>
> struct flow_rule *flow_rule_alloc(unsigned int num_actions)
> {
> @@ -282,6 +283,8 @@ int flow_block_cb_setup_simple(struct flow_block_offload *f,
> }
> EXPORT_SYMBOL(flow_block_cb_setup_simple);
>
> +static LIST_HEAD(block_ing_cb_list);
> +
> static struct rhashtable indr_setup_block_ht;
>
> struct flow_indr_block_cb {
> @@ -295,7 +298,6 @@ struct flow_indr_block_dev {
> struct rhash_head ht_node;
> struct net_device *dev;
> unsigned int refcnt;
> - flow_indr_block_ing_cmd_t *block_ing_cmd_cb;
> struct list_head cb_list;
> };
>
> @@ -389,6 +391,20 @@ static void flow_indr_block_cb_del(struct flow_indr_block_cb *indr_block_cb)
> kfree(indr_block_cb);
> }
>
> +static void flow_block_ing_cmd(struct net_device *dev,
> + flow_indr_block_bind_cb_t *cb,
> + void *cb_priv,
> + enum flow_block_command command)
> +{
> + struct flow_indr_block_ing_entry *entry;
> +
> + rcu_read_lock();
> + list_for_each_entry_rcu(entry, &block_ing_cb_list, list) {
> + entry->cb(dev, cb, cb_priv, command);
> + }
> + rcu_read_unlock();
> +}
Hi,
I'm getting following incorrect rcu usage warnings with this patch
caused by rcu_read_lock in flow_block_ing_cmd:
[ 401.510948] =============================
[ 401.510952] WARNING: suspicious RCU usage
[ 401.510993] 5.3.0-rc3+ #589 Not tainted
[ 401.510996] -----------------------------
[ 401.511001] include/linux/rcupdate.h:265 Illegal context switch in RCU read-side critical section!
[ 401.511004]
other info that might help us debug this:
[ 401.511008]
rcu_scheduler_active = 2, debug_locks = 1
[ 401.511012] 7 locks held by test-ecmp-add-v/7576:
[ 401.511015] #0: 00000000081d71a5 (sb_writers#4){.+.+}, at: vfs_write+0x166/0x1d0
[ 401.511037] #1: 000000002bd338c3 (&of->mutex){+.+.}, at: kernfs_fop_write+0xef/0x1b0
[ 401.511051] #2: 00000000c921c634 (kn->count#317){.+.+}, at: kernfs_fop_write+0xf7/0x1b0
[ 401.511062] #3: 00000000a19cdd56 (&dev->mutex){....}, at: sriov_numvfs_store+0x6b/0x130
[ 401.511079] #4: 000000005425fa52 (pernet_ops_rwsem){++++}, at: unregister_netdevice_notifier+0x30/0x140
[ 401.511092] #5: 00000000c5822793 (rtnl_mutex){+.+.}, at: unregister_netdevice_notifier+0x35/0x140
[ 401.511101] #6: 00000000c2f3507e (rcu_read_lock){....}, at: flow_block_ing_cmd+0x5/0x130
[ 401.511115]
stack backtrace:
[ 401.511121] CPU: 21 PID: 7576 Comm: test-ecmp-add-v Not tainted 5.3.0-rc3+ #589
[ 401.511124] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 401.511127] Call Trace:
[ 401.511138] dump_stack+0x85/0xc0
[ 401.511146] ___might_sleep+0x100/0x180
[ 401.511154] __mutex_lock+0x5b/0x960
[ 401.511162] ? find_held_lock+0x2b/0x80
[ 401.511173] ? __tcf_get_next_chain+0x1d/0xb0
[ 401.511179] ? mark_held_locks+0x49/0x70
[ 401.511194] ? __tcf_get_next_chain+0x1d/0xb0
[ 401.511198] __tcf_get_next_chain+0x1d/0xb0
[ 401.511251] ? uplink_rep_async_event+0x70/0x70 [mlx5_core]
[ 401.511261] tcf_block_playback_offloads+0x39/0x160
[ 401.511276] tcf_block_setup+0x1b0/0x240
[ 401.511312] ? mlx5e_rep_indr_setup_tc_cb+0xca/0x290 [mlx5_core]
[ 401.511347] ? mlx5e_rep_indr_tc_block_unbind+0x50/0x50 [mlx5_core]
[ 401.511359] tc_indr_block_get_and_ing_cmd+0x11b/0x1e0
[ 401.511404] ? mlx5e_rep_indr_tc_block_unbind+0x50/0x50 [mlx5_core]
[ 401.511414] flow_block_ing_cmd+0x7e/0x130
[ 401.511453] ? mlx5e_rep_indr_tc_block_unbind+0x50/0x50 [mlx5_core]
[ 401.511462] __flow_indr_block_cb_unregister+0x7f/0xf0
[ 401.511502] mlx5e_nic_rep_netdevice_event+0x75/0xb0 [mlx5_core]
[ 401.511513] unregister_netdevice_notifier+0xe9/0x140
[ 401.511554] mlx5e_cleanup_rep_tx+0x6f/0xe0 [mlx5_core]
[ 401.511597] mlx5e_detach_netdev+0x4b/0x60 [mlx5_core]
[ 401.511637] mlx5e_vport_rep_unload+0x71/0xc0 [mlx5_core]
[ 401.511679] esw_offloads_disable+0x5b/0x90 [mlx5_core]
[ 401.511724] mlx5_eswitch_disable.cold+0xdf/0x176 [mlx5_core]
[ 401.511759] mlx5_device_disable_sriov+0xab/0xb0 [mlx5_core]
[ 401.511794] mlx5_core_sriov_configure+0xaf/0xd0 [mlx5_core]
[ 401.511805] sriov_numvfs_store+0xf8/0x130
[ 401.511817] kernfs_fop_write+0x122/0x1b0
[ 401.511826] vfs_write+0xdb/0x1d0
[ 401.511835] ksys_write+0x65/0xe0
[ 401.511847] do_syscall_64+0x5c/0xb0
[ 401.511857] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 401.511862] RIP: 0033:0x7fad892d30f8
[ 401.511868] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 96 0d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 60 c3 0f 1f 80 00 00 00 00 48 83
ec 28 48 89
[ 401.511871] RSP: 002b:00007ffca2a9fad8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 401.511875] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fad892d30f8
[ 401.511878] RDX: 0000000000000002 RSI: 000055afeb072a90 RDI: 0000000000000001
[ 401.511881] RBP: 000055afeb072a90 R08: 00000000ffffffff R09: 000000000000000a
[ 401.511884] R10: 000055afeb058710 R11: 0000000000000246 R12: 0000000000000002
[ 401.511887] R13: 00007fad893a8780 R14: 0000000000000002 R15: 00007fad893a3740
I don't think it is correct approach to try to call these callbacks with
rcu protection because:
- Cls API uses sleeping locks that cannot be used in rcu read section
(hence the included trace).
- It assumes that all implementation of classifier ops reoffload() don't
sleep.
- And that all driver offload callbacks (both block and classifier
setup) don't sleep, which is not the case.
I don't see any straightforward way to fix this, besides using some
other locking mechanism to protect block_ing_cb_list.
Regards,
Vlad
^ permalink raw reply
* Re: [PATCH v4 10/14] net: phy: adin: implement PHY subsystem software reset
From: Andrew Lunn @ 2019-08-12 14:19 UTC (permalink / raw)
To: Alexandru Ardelean
Cc: netdev, devicetree, linux-kernel, davem, robh+dt, mark.rutland,
f.fainelli, hkallweit1
In-Reply-To: <20190812112350.15242-11-alexandru.ardelean@analog.com>
> +static int adin_reset(struct phy_device *phydev)
> +{
> + /* If there is a reset GPIO just exit */
> + if (!IS_ERR_OR_NULL(phydev->mdio.reset_gpio))
> + return 0;
I'm not so happy with this.
First off, there are two possible GPIO configurations. The GPIO can be
applied to all PHYs on the MDIO bus. That GPIO is used when the bus is
probed. There can also be a per PHY GPIO, which is what you are
looking at here.
The idea of putting the GPIO handling in the core is that PHYs don't
need to worry about it. How much of a difference does it make if the
PHY is both reset via GPIO and then again in software? How slow is the
software reset? Maybe just unconditionally do the reset, if it is not
too slow.
> +
> + /* Reset PHY core regs & subsystem regs */
> + return adin_subsytem_soft_reset(phydev);
> +}
> +
> +static int adin_probe(struct phy_device *phydev)
> +{
> + return adin_reset(phydev);
> +}
Why did you decide to do this as part of probe, and not use the
.soft_reset member of phy_driver?
> +
> static struct phy_driver adin_driver[] = {
> {
> PHY_ID_MATCH_MODEL(PHY_ID_ADIN1200),
> .name = "ADIN1200",
> .config_init = adin_config_init,
> + .probe = adin_probe,
> .config_aneg = adin_config_aneg,
> .read_status = adin_read_status,
> .ack_interrupt = adin_phy_ack_intr,
> @@ -461,6 +503,7 @@ static struct phy_driver adin_driver[] = {
> PHY_ID_MATCH_MODEL(PHY_ID_ADIN1300),
> .name = "ADIN1300",
> .config_init = adin_config_init,
> + .probe = adin_probe,
> .config_aneg = adin_config_aneg,
> .read_status = adin_read_status,
> .ack_interrupt = adin_phy_ack_intr,
Thanks
Andrew
^ permalink raw reply
* Aw: Re: [BUG] access to null-pointer in dsa_switch_event when bridge set up
From: Frank Wunderlich @ 2019-08-12 14:21 UTC (permalink / raw)
To: Andrew Lunn
Cc: Vivien Didelot, Florian Fainelli, David S. Miller, netdev,
linux-kernel
In-Reply-To: <20190812134243.GK14290@lunn.ch>
Hi Andrew,
[1] seems to fix it, thank you
regards Frank
[1] https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=58799865be84e2a895dab72de0e1b996ed943f22
> Gesendet: Montag, 12. August 2019 um 15:42 Uhr
> Von: "Andrew Lunn" <andrew@lunn.ch>
> Hi Frank
>
> A patch was merged last night with a fix for dsa_port_mdb_add. The
> call stack looks the same. So i think this is fixed.
>
> Andrew
^ permalink raw reply
* Re: [PATCH v4 12/14] net: phy: adin: implement downshift configuration via phy-tunable
From: Andrew Lunn @ 2019-08-12 14:21 UTC (permalink / raw)
To: Alexandru Ardelean
Cc: netdev, devicetree, linux-kernel, davem, robh+dt, mark.rutland,
f.fainelli, hkallweit1
In-Reply-To: <20190812112350.15242-13-alexandru.ardelean@analog.com>
On Mon, Aug 12, 2019 at 02:23:48PM +0300, Alexandru Ardelean wrote:
> Down-speed auto-negotiation may not always be enabled, in which case the
> PHY won't down-shift to 100 or 10 during auto-negotiation.
>
> This change enables downshift and configures the number of retries to
> default 4 (which is also in the datasheet
>
> The downshift control mechanism can also be controlled via the phy-tunable
> interface (ETHTOOL_PHY_DOWNSHIFT control).
>
> The change has been adapted from the Aquantia PHY driver.
>
> Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH v4 13/14] net: phy: adin: add ethtool get_stats support
From: Andrew Lunn @ 2019-08-12 14:26 UTC (permalink / raw)
To: Alexandru Ardelean
Cc: netdev, devicetree, linux-kernel, davem, robh+dt, mark.rutland,
f.fainelli, hkallweit1
In-Reply-To: <20190812112350.15242-14-alexandru.ardelean@analog.com>
> +/* Named just like in the datasheet */
> +static struct adin_hw_stat adin_hw_stats[] = {
> + { "RxErrCnt", 0x0014, },
> + { "MseA", 0x8402, 0, true },
> + { "MseB", 0x8403, 0, true },
> + { "MseC", 0x8404, 0, true },
> + { "MseD", 0x8405, 0, true },
> + { "FcFrmCnt", 0x940A, 0x940B }, /* FcFrmCntH + FcFrmCntL */
> + { "FcLenErrCnt", 0x940C },
> + { "FcAlgnErrCnt", 0x940D },
> + { "FcSymbErrCnt", 0x940E },
> + { "FcOszCnt", 0x940F },
> + { "FcUszCnt", 0x9410 },
> + { "FcOddCnt", 0x9411 },
> + { "FcOddPreCnt", 0x9412 },
> + { "FcDribbleBitsCnt", 0x9413 },
> + { "FcFalseCarrierCnt", 0x9414 },
I see some value in using the names from the datasheet. However, i
found it quite hard to now what these counters represent given there
current name. What is Mse? How does MseA differ from MseB? You have up
to ETH_GSTRING_LEN characters, so maybe longer names would be better?
Andrew
^ permalink raw reply
* Re: [RFC PATCH v7] rtl8xxxu: Improve TX performance of RTL8723BU on rtl8xxxu driver
From: Kalle Valo @ 2019-08-12 14:32 UTC (permalink / raw)
To: Jes Sorensen
Cc: Chris Chiu, davem, linux-wireless, netdev, linux-kernel, linux,
Daniel Drake
In-Reply-To: <d0047834-957d-0cf3-5792-31faa5315ad1@gmail.com>
Jes Sorensen <jes.sorensen@gmail.com> writes:
> On 8/5/19 9:14 AM, Chris Chiu wrote:
>> We have 3 laptops which connect the wifi by the same RTL8723BU.
>> The PCI VID/PID of the wifi chip is 10EC:B720 which is supported.
>> They have the same problem with the in-kernel rtl8xxxu driver, the
>> iperf (as a client to an ethernet-connected server) gets ~1Mbps.
>> Nevertheless, the signal strength is reported as around -40dBm,
>> which is quite good. From the wireshark capture, the tx rate for each
>> data and qos data packet is only 1Mbps. Compare to the Realtek driver
>> at https://github.com/lwfinger/rtl8723bu, the same iperf test gets
>> ~12Mbps or better. The signal strength is reported similarly around
>> -40dBm. That's why we want to improve.
>>
>> After reading the source code of the rtl8xxxu driver and Realtek's, the
>> major difference is that Realtek's driver has a watchdog which will keep
>> monitoring the signal quality and updating the rate mask just like the
>> rtl8xxxu_gen2_update_rate_mask() does if signal quality changes.
>> And this kind of watchdog also exists in rtlwifi driver of some specific
>> chips, ex rtl8192ee, rtl8188ee, rtl8723ae, rtl8821ae...etc. They have
>> the same member function named dm_watchdog and will invoke the
>> corresponding dm_refresh_rate_adaptive_mask to adjust the tx rate
>> mask.
>>
>> With this commit, the tx rate of each data and qos data packet will
>> be 39Mbps (MCS4) with the 0xF00000 as the tx rate mask. The 20th bit
>> to 23th bit means MCS4 to MCS7. It means that the firmware still picks
>> the lowest rate from the rate mask and explains why the tx rate of
>> data and qos data is always lowest 1Mbps because the default rate mask
>> passed is always 0xFFFFFFF ranges from the basic CCK rate, OFDM rate,
>> and MCS rate. However, with Realtek's driver, the tx rate observed from
>> wireshark under the same condition is almost 65Mbps or 72Mbps, which
>> indicating that rtl8xxxu could still be further improved.
>>
>> Signed-off-by: Chris Chiu <chiu@endlessm.com>
>> Reviewed-by: Daniel Drake <drake@endlessm.com>
>> ---
>
> Looks good to me! Nice work! I am actually very curious if this will
> improve performance 8192eu as well.
>
> Ideally I'd like to figure out how to make host controlled rates work,
> but in all my experiments with that, I never really got it to work well.
>
> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
This is marked as RFC so I'm not sure what's the plan. Should I apply
this?
--
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
^ permalink raw reply
* Re: [PATCH v4 13/14] net: phy: adin: add ethtool get_stats support
From: Andrew Lunn @ 2019-08-12 14:33 UTC (permalink / raw)
To: Alexandru Ardelean
Cc: netdev, devicetree, linux-kernel, davem, robh+dt, mark.rutland,
f.fainelli, hkallweit1
In-Reply-To: <20190812112350.15242-14-alexandru.ardelean@analog.com>
> +static int adin_read_mmd_stat_regs(struct phy_device *phydev,
> + struct adin_hw_stat *stat,
> + u32 *val)
> +{
> + int ret;
> +
> + ret = phy_read_mmd(phydev, MDIO_MMD_VEND1, stat->reg1);
> + if (ret < 0)
> + return ret;
> +
> + *val = (ret & 0xffff);
> +
> + if (stat->reg2 == 0)
> + return 0;
> +
> + ret = phy_read_mmd(phydev, MDIO_MMD_VEND1, stat->reg2);
> + if (ret < 0)
> + return ret;
> +
> + *val <<= 16;
> + *val |= (ret & 0xffff);
> +
> + return 0;
> +}
It still looks like you have not dealt with overflow from the LSB into
the MSB between the two reads.
do {
hi1 = phy_read_mmd(phydev, MDIO_MMD_VEND1, stat->reg2);
if (hi1 < 0)
return hi1;
low = phy_read_mmd(phydev, MDIO_MMD_VEND1, stat->reg1);
if (low < 0)
return low;
hi2 = phy_read_mmd(phydev, MDIO_MMD_VEND1, stat->reg2);
if (hi2 < 0)
return hi1;
} while (hi1 != hi2)
return low | (hi << 16);
Andrew
^ permalink raw reply
* Re: [PATCH v4 14/14] dt-bindings: net: add bindings for ADIN PHY driver
From: Andrew Lunn @ 2019-08-12 14:34 UTC (permalink / raw)
To: Alexandru Ardelean
Cc: netdev, devicetree, linux-kernel, davem, robh+dt, mark.rutland,
f.fainelli, hkallweit1
In-Reply-To: <20190812112350.15242-15-alexandru.ardelean@analog.com>
On Mon, Aug 12, 2019 at 02:23:50PM +0300, Alexandru Ardelean wrote:
> This change adds bindings for the Analog Devices ADIN PHY driver, detailing
> all the properties implemented by the driver.
>
> Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH] net/mlx4_en: fix a memory leak bug
From: Wenwen Wang @ 2019-08-12 14:36 UTC (permalink / raw)
To: Tariq Toukan
Cc: David S. Miller, open list:MELLANOX ETHERNET DRIVER (mlx4_en),
open list:MELLANOX MLX4 core VPI driver, open list, Wenwen Wang
In-Reply-To: <75e09920-4ae3-0a19-4c2a-112d16bb81a5@mellanox.com>
On Mon, Aug 12, 2019 at 5:05 AM Tariq Toukan <tariqt@mellanox.com> wrote:
>
> Hi Wenwen,
>
> Thanks for your patch.
>
> On 8/12/2019 9:36 AM, Wenwen Wang wrote:
> > In mlx4_en_config_rss_steer(), 'rss_map->indir_qp' is allocated through
> > kzalloc(). After that, mlx4_qp_alloc() is invoked to configure RSS
> > indirection. However, if mlx4_qp_alloc() fails, the allocated
> > 'rss_map->indir_qp' is not deallocated, leading to a memory leak bug.
> >
> > To fix the above issue, add the 'mlx4_err' label to free
> > 'rss_map->indir_qp'.
> >
>
> Add a Fixes line.
>
> > Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu> > ---
> > drivers/net/ethernet/mellanox/mlx4/en_rx.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > index 6c01314..9476dbd 100644
> > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > @@ -1187,7 +1187,7 @@ int mlx4_en_config_rss_steer(struct mlx4_en_priv *priv)
> > err = mlx4_qp_alloc(mdev->dev, priv->base_qpn, rss_map->indir_qp);
> > if (err) {
> > en_err(priv, "Failed to allocate RSS indirection QP\n");
> > - goto rss_err;
> > + goto mlx4_err;
> > }
> >
> > rss_map->indir_qp->event = mlx4_en_sqp_event;
> > @@ -1241,6 +1241,7 @@ int mlx4_en_config_rss_steer(struct mlx4_en_priv *priv)
> > MLX4_QP_STATE_RST, NULL, 0, 0, rss_map->indir_qp);
> > mlx4_qp_remove(mdev->dev, rss_map->indir_qp);
> > mlx4_qp_free(mdev->dev, rss_map->indir_qp);
> > +mlx4_err:
>
> I don't like the label name. It's too general and not informative.
> Maybe qp_alloc_err?
Thanks! I will rework the patch.
Wenwen
^ permalink raw reply
* [PATCH net-next] net: hns3: Make hclge_func_reset_sync_vf static
From: YueHaibing @ 2019-08-12 14:41 UTC (permalink / raw)
To: yisen.zhuang, salil.mehta, davem, lipeng321, tanhuazhong
Cc: linux-kernel, netdev, YueHaibing
Fix sparse warning:
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:3190:5:
warning: symbol 'hclge_func_reset_sync_vf' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index d207dac..a3ca0e6 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -3187,7 +3187,7 @@ static int hclge_set_all_vf_rst(struct hclge_dev *hdev, bool reset)
return 0;
}
-int hclge_func_reset_sync_vf(struct hclge_dev *hdev)
+static int hclge_func_reset_sync_vf(struct hclge_dev *hdev)
{
struct hclge_pf_rst_sync_cmd *req;
struct hclge_desc desc;
--
2.7.4
^ permalink raw reply related
* [PATCH net] s390/qeth: serialize cmd reply with concurrent timeout
From: Julian Wiedmann @ 2019-08-12 14:44 UTC (permalink / raw)
To: David Miller
Cc: netdev, linux-s390, Heiko Carstens, Stefan Raspl, Ursula Braun,
Julian Wiedmann
Callbacks for a cmd reply run outside the protection of card->lock, to
allow for additional cmds to be issued & enqueued in parallel.
When qeth_send_control_data() bails out for a cmd without having
received a reply (eg. due to timeout), its callback may concurrently be
processing a reply that just arrived. In this case, the callback
potentially accesses a stale reply->reply_param area that eg. was
on-stack and has already been released.
To avoid this race, add some locking so that qeth_send_control_data()
can (1) wait for a concurrently running callback, and (2) zap any
pending callback that still wants to run.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
---
drivers/s390/net/qeth_core.h | 1 +
drivers/s390/net/qeth_core_main.c | 20 ++++++++++++++++++++
2 files changed, 21 insertions(+)
diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h
index c7ee07ce3615..28db887d38ed 100644
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -629,6 +629,7 @@ struct qeth_seqno {
struct qeth_reply {
struct list_head list;
struct completion received;
+ spinlock_t lock;
int (*callback)(struct qeth_card *, struct qeth_reply *,
unsigned long);
u32 seqno;
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index 4d0caeebc802..9c3310c4d61d 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -544,6 +544,7 @@ static struct qeth_reply *qeth_alloc_reply(struct qeth_card *card)
if (reply) {
refcount_set(&reply->refcnt, 1);
init_completion(&reply->received);
+ spin_lock_init(&reply->lock);
}
return reply;
}
@@ -799,6 +800,13 @@ static void qeth_issue_next_read_cb(struct qeth_card *card,
if (!reply->callback) {
rc = 0;
+ goto no_callback;
+ }
+
+ spin_lock_irqsave(&reply->lock, flags);
+ if (reply->rc) {
+ /* Bail out when the requestor has already left: */
+ rc = reply->rc;
} else {
if (cmd) {
reply->offset = (u16)((char *)cmd - (char *)iob->data);
@@ -807,7 +815,9 @@ static void qeth_issue_next_read_cb(struct qeth_card *card,
rc = reply->callback(card, reply, (unsigned long)iob);
}
}
+ spin_unlock_irqrestore(&reply->lock, flags);
+no_callback:
if (rc <= 0)
qeth_notify_reply(reply, rc);
qeth_put_reply(reply);
@@ -1749,6 +1759,16 @@ static int qeth_send_control_data(struct qeth_card *card,
rc = (timeout == -ERESTARTSYS) ? -EINTR : -ETIME;
qeth_dequeue_reply(card, reply);
+
+ if (reply_cb) {
+ /* Wait until the callback for a late reply has completed: */
+ spin_lock_irq(&reply->lock);
+ if (rc)
+ /* Zap any callback that's still pending: */
+ reply->rc = rc;
+ spin_unlock_irq(&reply->lock);
+ }
+
if (!rc)
rc = reply->rc;
qeth_put_reply(reply);
--
2.17.1
^ permalink raw reply related
* [PATCH net-next v6 0/6] net: mscc: PTP Hardware Clock (PHC) support
From: Antoine Tenart @ 2019-08-12 14:45 UTC (permalink / raw)
To: davem, richardcochran, alexandre.belloni, UNGLinuxDriver
Cc: Antoine Tenart, netdev, thomas.petazzoni, allan.nielsen, andrew
Hello,
This series introduces the PTP Hardware Clock (PHC) support to the Mscc
Ocelot switch driver. In order to make use of this, a new register bank
is added and described in the device tree, as well as a new interrupt.
The use this bank and interrupt was made optional in the driver for dt
compatibility reasons.
Thanks!
Antoine
Since v5:
- Made sure both the PTP interrupt and register bank were available to
enable supporting h/w timestamping.
- Added a check after a kzalloc.
- Add Reviewed-by tags from Andrew.
Since v4:
- Added SKBTX_IN_PROGRESS.
- Fixed two xmas trees.
- Rework the loop condition in ocelot_ptp_rdy_irq_handler.
Since v3:
- Fixed a spin_unlock_irqrestore issue.
Since v2:
- Prevented from a possible infinite loop when reading the h/w
timestamps.
- s/GFP_KERNEL/GFP_ATOMIC/ in the Tx path.
- Set rx_filter to HWTSTAMP_FILTER_PTP_V2_EVENT at probe.
- Fixed s/w timestamping dependencies.
- Added Paul Burton's Acked-by on patches 2 and 4.
Since v1:
- Used list_for_each_safe() in ocelot_deinit().
- Fixed a memory leak in ocelot_deinit() by calling
dev_kfree_skb_any().
- Fixed a locking issue in get_hwtimestamp().
- Handled the NULL case of ptp_clock_register().
- Added comments on optional dt properties.
Antoine Tenart (6):
Documentation/bindings: net: ocelot: document the PTP bank
Documentation/bindings: net: ocelot: document the PTP ready IRQ
net: mscc: describe the PTP register range
net: mscc: improve the frame header parsing readability
net: mscc: remove the frame_info cpuq member
net: mscc: PTP Hardware Clock (PHC) support
.../devicetree/bindings/net/mscc-ocelot.txt | 20 +-
drivers/net/ethernet/mscc/ocelot.c | 401 +++++++++++++++++-
drivers/net/ethernet/mscc/ocelot.h | 49 ++-
drivers/net/ethernet/mscc/ocelot_board.c | 145 ++++++-
drivers/net/ethernet/mscc/ocelot_ptp.h | 41 ++
drivers/net/ethernet/mscc/ocelot_regs.c | 11 +
6 files changed, 633 insertions(+), 34 deletions(-)
create mode 100644 drivers/net/ethernet/mscc/ocelot_ptp.h
--
2.21.0
^ permalink raw reply
* [PATCH net-next v6 1/6] Documentation/bindings: net: ocelot: document the PTP bank
From: Antoine Tenart @ 2019-08-12 14:45 UTC (permalink / raw)
To: davem, richardcochran, alexandre.belloni, UNGLinuxDriver
Cc: Antoine Tenart, netdev, thomas.petazzoni, allan.nielsen, andrew
In-Reply-To: <20190812144537.14497-1-antoine.tenart@bootlin.com>
One additional register range needs to be described within the Ocelot
device tree node: the PTP. This patch documents the binding needed to do
so.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
---
Documentation/devicetree/bindings/net/mscc-ocelot.txt | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/Documentation/devicetree/bindings/net/mscc-ocelot.txt b/Documentation/devicetree/bindings/net/mscc-ocelot.txt
index 9e5c17d426ce..4d05a3b0f786 100644
--- a/Documentation/devicetree/bindings/net/mscc-ocelot.txt
+++ b/Documentation/devicetree/bindings/net/mscc-ocelot.txt
@@ -12,6 +12,7 @@ Required properties:
- "sys"
- "rew"
- "qs"
+ - "ptp" (optional due to backward compatibility)
- "qsys"
- "ana"
- "portX" with X from 0 to the number of last port index available on that
@@ -44,6 +45,7 @@ Example:
reg = <0x1010000 0x10000>,
<0x1030000 0x10000>,
<0x1080000 0x100>,
+ <0x10e0000 0x10000>,
<0x11e0000 0x100>,
<0x11f0000 0x100>,
<0x1200000 0x100>,
@@ -57,9 +59,10 @@ Example:
<0x1280000 0x100>,
<0x1800000 0x80000>,
<0x1880000 0x10000>;
- reg-names = "sys", "rew", "qs", "port0", "port1", "port2",
- "port3", "port4", "port5", "port6", "port7",
- "port8", "port9", "port10", "qsys", "ana";
+ reg-names = "sys", "rew", "qs", "ptp", "port0", "port1",
+ "port2", "port3", "port4", "port5", "port6",
+ "port7", "port8", "port9", "port10", "qsys",
+ "ana";
interrupts = <21 22>;
interrupt-names = "xtr", "inj";
--
2.21.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox