* [PATCH v2 net-next 01/14] rtnetlink: Allocate linkinfo[] as struct rtnl_newlink_tbs.
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 02/14] rtnetlink: Call validate_linkmsg() in do_setlink() Kuniyuki Iwashima
` (13 subsequent siblings)
14 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We will move linkinfo to rtnl_newlink() and pass it down to other
functions.
Let's pack it into rtnl_newlink_tbs.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/core/rtnetlink.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a9c92392fb1d..37193402a42c 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3622,6 +3622,7 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
struct rtnl_newlink_tbs {
struct nlattr *tb[IFLA_MAX + 1];
+ struct nlattr *linkinfo[IFLA_INFO_MAX + 1];
struct nlattr *attr[RTNL_MAX_TYPE + 1];
struct nlattr *slave_attr[RTNL_SLAVE_MAX_TYPE + 1];
};
@@ -3630,7 +3631,7 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct rtnl_newlink_tbs *tbs,
struct netlink_ext_ack *extack)
{
- struct nlattr *linkinfo[IFLA_INFO_MAX + 1];
+ struct nlattr ** const linkinfo = tbs->linkinfo;
struct nlattr ** const tb = tbs->tb;
const struct rtnl_link_ops *m_ops;
struct net_device *master_dev;
@@ -3685,8 +3686,9 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
ifla_info_policy, NULL);
if (err < 0)
return err;
- } else
- memset(linkinfo, 0, sizeof(linkinfo));
+ } else {
+ memset(linkinfo, 0, sizeof(tbs->linkinfo));
+ }
if (linkinfo[IFLA_INFO_KIND]) {
nla_strscpy(kind, linkinfo[IFLA_INFO_KIND], sizeof(kind));
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v2 net-next 02/14] rtnetlink: Call validate_linkmsg() in do_setlink().
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 01/14] rtnetlink: Allocate linkinfo[] as struct rtnl_newlink_tbs Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 03/14] rtnetlink: Factorise do_setlink() path from __rtnl_newlink() Kuniyuki Iwashima
` (12 subsequent siblings)
14 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
There are 3 paths that finally call do_setlink(), and validate_linkmsg()
is called in each path.
1. RTM_NEWLINK
1-1. dev is found in __rtnl_newlink()
1-2. dev isn't found, but IFLA_GROUP is specified in
rtnl_group_changelink()
2. RTM_SETLINK
The next patch factorises 1-1 to a separate function.
As a preparation, let's move validate_linkmsg() calls to do_setlink().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/core/rtnetlink.c | 15 ++++-----------
1 file changed, 4 insertions(+), 11 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 37193402a42c..76593de7014c 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2855,6 +2855,10 @@ static int do_setlink(const struct sk_buff *skb,
char ifname[IFNAMSIZ];
int err;
+ err = validate_linkmsg(dev, tb, extack);
+ if (err < 0)
+ goto errout;
+
if (tb[IFLA_IFNAME])
nla_strscpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
else
@@ -3269,10 +3273,6 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
goto errout;
}
- err = validate_linkmsg(dev, tb, extack);
- if (err < 0)
- goto errout;
-
err = do_setlink(skb, dev, ifm, extack, tb, 0);
errout:
return err;
@@ -3516,9 +3516,6 @@ static int rtnl_group_changelink(const struct sk_buff *skb,
for_each_netdev_safe(net, dev, aux) {
if (dev->group == group) {
- err = validate_linkmsg(dev, tb, extack);
- if (err < 0)
- return err;
err = do_setlink(skb, dev, ifm, extack, tb, 0);
if (err < 0)
return err;
@@ -3744,10 +3741,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (nlh->nlmsg_flags & NLM_F_REPLACE)
return -EOPNOTSUPP;
- err = validate_linkmsg(dev, tb, extack);
- if (err < 0)
- return err;
-
if (linkinfo[IFLA_INFO_DATA]) {
if (!ops || ops != dev->rtnl_link_ops ||
!ops->changelink)
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v2 net-next 03/14] rtnetlink: Factorise do_setlink() path from __rtnl_newlink().
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 01/14] rtnetlink: Allocate linkinfo[] as struct rtnl_newlink_tbs Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 02/14] rtnetlink: Call validate_linkmsg() in do_setlink() Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 04/14] rtnetlink: Move simple validation from __rtnl_newlink() to rtnl_newlink() Kuniyuki Iwashima
` (11 subsequent siblings)
14 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
__rtnl_newlink() got too long to maintain.
For example, netdev_master_upper_dev_get()->rtnl_link_ops is fetched even
when IFLA_INFO_SLAVE_DATA is not specified.
Let's factorise the single dev do_setlink() path to a separate function.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/core/rtnetlink.c | 142 ++++++++++++++++++++++---------------------
1 file changed, 74 insertions(+), 68 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 76593de7014c..21165cc2b697 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3505,6 +3505,78 @@ struct net_device *rtnl_create_link(struct net *net, const char *ifname,
}
EXPORT_SYMBOL(rtnl_create_link);
+struct rtnl_newlink_tbs {
+ struct nlattr *tb[IFLA_MAX + 1];
+ struct nlattr *linkinfo[IFLA_INFO_MAX + 1];
+ struct nlattr *attr[RTNL_MAX_TYPE + 1];
+ struct nlattr *slave_attr[RTNL_SLAVE_MAX_TYPE + 1];
+};
+
+static int rtnl_changelink(const struct sk_buff *skb, struct nlmsghdr *nlh,
+ const struct rtnl_link_ops *ops,
+ struct net_device *dev,
+ struct rtnl_newlink_tbs *tbs,
+ struct nlattr **data,
+ struct netlink_ext_ack *extack)
+{
+ struct nlattr ** const linkinfo = tbs->linkinfo;
+ struct nlattr ** const tb = tbs->tb;
+ int status = 0;
+ int err;
+
+ if (nlh->nlmsg_flags & NLM_F_EXCL)
+ return -EEXIST;
+
+ if (nlh->nlmsg_flags & NLM_F_REPLACE)
+ return -EOPNOTSUPP;
+
+ if (linkinfo[IFLA_INFO_DATA]) {
+ if (!ops || ops != dev->rtnl_link_ops || !ops->changelink)
+ return -EOPNOTSUPP;
+
+ err = ops->changelink(dev, tb, data, extack);
+ if (err < 0)
+ return err;
+
+ status |= DO_SETLINK_NOTIFY;
+ }
+
+ if (linkinfo[IFLA_INFO_SLAVE_DATA]) {
+ const struct rtnl_link_ops *m_ops = NULL;
+ struct nlattr **slave_data = NULL;
+ struct net_device *master_dev;
+
+ master_dev = netdev_master_upper_dev_get(dev);
+ if (master_dev)
+ m_ops = master_dev->rtnl_link_ops;
+
+ if (!m_ops || !m_ops->slave_changelink)
+ return -EOPNOTSUPP;
+
+ if (m_ops->slave_maxtype > RTNL_SLAVE_MAX_TYPE)
+ return -EINVAL;
+
+ if (m_ops->slave_maxtype) {
+ err = nla_parse_nested_deprecated(tbs->slave_attr,
+ m_ops->slave_maxtype,
+ linkinfo[IFLA_INFO_SLAVE_DATA],
+ m_ops->slave_policy, extack);
+ if (err < 0)
+ return err;
+
+ slave_data = tbs->slave_attr;
+ }
+
+ err = m_ops->slave_changelink(master_dev, dev, tb, slave_data, extack);
+ if (err < 0)
+ return err;
+
+ status |= DO_SETLINK_NOTIFY;
+ }
+
+ return do_setlink(skb, dev, nlmsg_data(nlh), extack, tb, status);
+}
+
static int rtnl_group_changelink(const struct sk_buff *skb,
struct net *net, int group,
struct ifinfomsg *ifm,
@@ -3617,24 +3689,14 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
goto out;
}
-struct rtnl_newlink_tbs {
- struct nlattr *tb[IFLA_MAX + 1];
- struct nlattr *linkinfo[IFLA_INFO_MAX + 1];
- struct nlattr *attr[RTNL_MAX_TYPE + 1];
- struct nlattr *slave_attr[RTNL_SLAVE_MAX_TYPE + 1];
-};
-
static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct rtnl_newlink_tbs *tbs,
struct netlink_ext_ack *extack)
{
struct nlattr ** const linkinfo = tbs->linkinfo;
struct nlattr ** const tb = tbs->tb;
- const struct rtnl_link_ops *m_ops;
- struct net_device *master_dev;
struct net *net = sock_net(skb->sk);
const struct rtnl_link_ops *ops;
- struct nlattr **slave_data;
char kind[MODULE_NAME_LEN];
struct net_device *dev;
struct ifinfomsg *ifm;
@@ -3669,14 +3731,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
dev = NULL;
}
- master_dev = NULL;
- m_ops = NULL;
- if (dev) {
- master_dev = netdev_master_upper_dev_get(dev);
- if (master_dev)
- m_ops = master_dev->rtnl_link_ops;
- }
-
if (tb[IFLA_LINKINFO]) {
err = nla_parse_nested_deprecated(linkinfo, IFLA_INFO_MAX,
tb[IFLA_LINKINFO],
@@ -3715,56 +3769,8 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
}
}
- slave_data = NULL;
- if (m_ops) {
- if (m_ops->slave_maxtype > RTNL_SLAVE_MAX_TYPE)
- return -EINVAL;
-
- if (m_ops->slave_maxtype &&
- linkinfo[IFLA_INFO_SLAVE_DATA]) {
- err = nla_parse_nested_deprecated(tbs->slave_attr,
- m_ops->slave_maxtype,
- linkinfo[IFLA_INFO_SLAVE_DATA],
- m_ops->slave_policy,
- extack);
- if (err < 0)
- return err;
- slave_data = tbs->slave_attr;
- }
- }
-
- if (dev) {
- int status = 0;
-
- if (nlh->nlmsg_flags & NLM_F_EXCL)
- return -EEXIST;
- if (nlh->nlmsg_flags & NLM_F_REPLACE)
- return -EOPNOTSUPP;
-
- if (linkinfo[IFLA_INFO_DATA]) {
- if (!ops || ops != dev->rtnl_link_ops ||
- !ops->changelink)
- return -EOPNOTSUPP;
-
- err = ops->changelink(dev, tb, data, extack);
- if (err < 0)
- return err;
- status |= DO_SETLINK_NOTIFY;
- }
-
- if (linkinfo[IFLA_INFO_SLAVE_DATA]) {
- if (!m_ops || !m_ops->slave_changelink)
- return -EOPNOTSUPP;
-
- err = m_ops->slave_changelink(master_dev, dev, tb,
- slave_data, extack);
- if (err < 0)
- return err;
- status |= DO_SETLINK_NOTIFY;
- }
-
- return do_setlink(skb, dev, ifm, extack, tb, status);
- }
+ if (dev)
+ return rtnl_changelink(skb, nlh, ops, dev, tbs, data, extack);
if (!(nlh->nlmsg_flags & NLM_F_CREATE)) {
/* No dev found and NLM_F_CREATE not set. Requested dev does not exist,
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v2 net-next 04/14] rtnetlink: Move simple validation from __rtnl_newlink() to rtnl_newlink().
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (2 preceding siblings ...)
2024-10-16 18:53 ` [PATCH v2 net-next 03/14] rtnetlink: Factorise do_setlink() path from __rtnl_newlink() Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 05/14] rtnetlink: Move rtnl_link_ops_get() and retry " Kuniyuki Iwashima
` (10 subsequent siblings)
14 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We will push RTNL down to rtnl_newlink().
Let's move RTNL-independent validation to rtnl_newlink().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/core/rtnetlink.c | 43 ++++++++++++++++++++++++-------------------
1 file changed, 24 insertions(+), 19 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 21165cc2b697..97d6ad65647c 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3707,15 +3707,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
#ifdef CONFIG_MODULES
replay:
#endif
- err = nlmsg_parse_deprecated(nlh, sizeof(*ifm), tb, IFLA_MAX,
- ifla_policy, extack);
- if (err < 0)
- return err;
-
- err = rtnl_ensure_unique_netns(tb, extack, false);
- if (err < 0)
- return err;
-
ifm = nlmsg_data(nlh);
if (ifm->ifi_index > 0) {
link_specified = true;
@@ -3731,16 +3722,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
dev = NULL;
}
- if (tb[IFLA_LINKINFO]) {
- err = nla_parse_nested_deprecated(linkinfo, IFLA_INFO_MAX,
- tb[IFLA_LINKINFO],
- ifla_info_policy, NULL);
- if (err < 0)
- return err;
- } else {
- memset(linkinfo, 0, sizeof(tbs->linkinfo));
- }
-
if (linkinfo[IFLA_INFO_KIND]) {
nla_strscpy(kind, linkinfo[IFLA_INFO_KIND], sizeof(kind));
ops = rtnl_link_ops_get(kind);
@@ -3809,6 +3790,7 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
+ struct nlattr **tb, **linkinfo;
struct rtnl_newlink_tbs *tbs;
int ret;
@@ -3816,7 +3798,30 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (!tbs)
return -ENOMEM;
+ tb = tbs->tb;
+ ret = nlmsg_parse_deprecated(nlh, sizeof(struct ifinfomsg), tb,
+ IFLA_MAX, ifla_policy, extack);
+ if (ret < 0)
+ goto free;
+
+ ret = rtnl_ensure_unique_netns(tb, extack, false);
+ if (ret < 0)
+ goto free;
+
+ linkinfo = tbs->linkinfo;
+ if (tb[IFLA_LINKINFO]) {
+ ret = nla_parse_nested_deprecated(linkinfo, IFLA_INFO_MAX,
+ tb[IFLA_LINKINFO],
+ ifla_info_policy, NULL);
+ if (ret < 0)
+ goto free;
+ } else {
+ memset(linkinfo, 0, sizeof(tbs->linkinfo));
+ }
+
ret = __rtnl_newlink(skb, nlh, tbs, extack);
+
+free:
kfree(tbs);
return ret;
}
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v2 net-next 05/14] rtnetlink: Move rtnl_link_ops_get() and retry to rtnl_newlink().
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (3 preceding siblings ...)
2024-10-16 18:53 ` [PATCH v2 net-next 04/14] rtnetlink: Move simple validation from __rtnl_newlink() to rtnl_newlink() Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 06/14] rtnetlink: Move ops->validate " Kuniyuki Iwashima
` (9 subsequent siblings)
14 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
Currently, if neither dev nor rtnl_link_ops is found in __rtnl_newlink(),
we release RTNL and redo the whole process after request_module(), which
complicates the logic.
The ops will be RTNL-independent later.
Let's move the ops lookup to rtnl_newlink() and do the retry earlier.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/core/rtnetlink.c | 42 ++++++++++++++++++------------------------
1 file changed, 18 insertions(+), 24 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 97d6ad65647c..e708f0852602 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3690,23 +3690,19 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
}
static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
+ const struct rtnl_link_ops *ops,
struct rtnl_newlink_tbs *tbs,
struct netlink_ext_ack *extack)
{
struct nlattr ** const linkinfo = tbs->linkinfo;
struct nlattr ** const tb = tbs->tb;
struct net *net = sock_net(skb->sk);
- const struct rtnl_link_ops *ops;
- char kind[MODULE_NAME_LEN];
struct net_device *dev;
struct ifinfomsg *ifm;
struct nlattr **data;
bool link_specified;
int err;
-#ifdef CONFIG_MODULES
-replay:
-#endif
ifm = nlmsg_data(nlh);
if (ifm->ifi_index > 0) {
link_specified = true;
@@ -3722,14 +3718,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
dev = NULL;
}
- if (linkinfo[IFLA_INFO_KIND]) {
- nla_strscpy(kind, linkinfo[IFLA_INFO_KIND], sizeof(kind));
- ops = rtnl_link_ops_get(kind);
- } else {
- kind[0] = '\0';
- ops = NULL;
- }
-
data = NULL;
if (ops) {
if (ops->maxtype > RTNL_MAX_TYPE)
@@ -3770,16 +3758,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EOPNOTSUPP;
if (!ops) {
-#ifdef CONFIG_MODULES
- if (kind[0]) {
- __rtnl_unlock();
- request_module("rtnl-link-%s", kind);
- rtnl_lock();
- ops = rtnl_link_ops_get(kind);
- if (ops)
- goto replay;
- }
-#endif
NL_SET_ERR_MSG(extack, "Unknown device type");
return -EOPNOTSUPP;
}
@@ -3790,6 +3768,7 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
+ const struct rtnl_link_ops *ops = NULL;
struct nlattr **tb, **linkinfo;
struct rtnl_newlink_tbs *tbs;
int ret;
@@ -3819,7 +3798,22 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
memset(linkinfo, 0, sizeof(tbs->linkinfo));
}
- ret = __rtnl_newlink(skb, nlh, tbs, extack);
+ if (linkinfo[IFLA_INFO_KIND]) {
+ char kind[MODULE_NAME_LEN];
+
+ nla_strscpy(kind, linkinfo[IFLA_INFO_KIND], sizeof(kind));
+ ops = rtnl_link_ops_get(kind);
+#ifdef CONFIG_MODULES
+ if (!ops) {
+ __rtnl_unlock();
+ request_module("rtnl-link-%s", kind);
+ rtnl_lock();
+ ops = rtnl_link_ops_get(kind);
+ }
+#endif
+ }
+
+ ret = __rtnl_newlink(skb, nlh, ops, tbs, extack);
free:
kfree(tbs);
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v2 net-next 06/14] rtnetlink: Move ops->validate to rtnl_newlink().
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (4 preceding siblings ...)
2024-10-16 18:53 ` [PATCH v2 net-next 05/14] rtnetlink: Move rtnl_link_ops_get() and retry " Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 07/14] rtnetlink: Protect struct rtnl_link_ops with SRCU Kuniyuki Iwashima
` (8 subsequent siblings)
14 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
ops->validate() does not require RTNL.
Let's move it to rtnl_newlink().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/core/rtnetlink.c | 49 ++++++++++++++++++++++----------------------
1 file changed, 24 insertions(+), 25 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index e708f0852602..9c9290a6c271 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3692,16 +3692,14 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
const struct rtnl_link_ops *ops,
struct rtnl_newlink_tbs *tbs,
+ struct nlattr **data,
struct netlink_ext_ack *extack)
{
- struct nlattr ** const linkinfo = tbs->linkinfo;
struct nlattr ** const tb = tbs->tb;
struct net *net = sock_net(skb->sk);
struct net_device *dev;
struct ifinfomsg *ifm;
- struct nlattr **data;
bool link_specified;
- int err;
ifm = nlmsg_data(nlh);
if (ifm->ifi_index > 0) {
@@ -3718,26 +3716,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
dev = NULL;
}
- data = NULL;
- if (ops) {
- if (ops->maxtype > RTNL_MAX_TYPE)
- return -EINVAL;
-
- if (ops->maxtype && linkinfo[IFLA_INFO_DATA]) {
- err = nla_parse_nested_deprecated(tbs->attr, ops->maxtype,
- linkinfo[IFLA_INFO_DATA],
- ops->policy, extack);
- if (err < 0)
- return err;
- data = tbs->attr;
- }
- if (ops->validate) {
- err = ops->validate(tb, data, extack);
- if (err < 0)
- return err;
- }
- }
-
if (dev)
return rtnl_changelink(skb, nlh, ops, dev, tbs, data, extack);
@@ -3768,8 +3746,8 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
+ struct nlattr **tb, **linkinfo, **data = NULL;
const struct rtnl_link_ops *ops = NULL;
- struct nlattr **tb, **linkinfo;
struct rtnl_newlink_tbs *tbs;
int ret;
@@ -3813,7 +3791,28 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
#endif
}
- ret = __rtnl_newlink(skb, nlh, ops, tbs, extack);
+ if (ops) {
+ if (ops->maxtype > RTNL_MAX_TYPE)
+ return -EINVAL;
+
+ if (ops->maxtype && linkinfo[IFLA_INFO_DATA]) {
+ ret = nla_parse_nested_deprecated(tbs->attr, ops->maxtype,
+ linkinfo[IFLA_INFO_DATA],
+ ops->policy, extack);
+ if (ret < 0)
+ goto free;
+
+ data = tbs->attr;
+ }
+
+ if (ops->validate) {
+ ret = ops->validate(tb, data, extack);
+ if (ret < 0)
+ goto free;
+ }
+ }
+
+ ret = __rtnl_newlink(skb, nlh, ops, tbs, data, extack);
free:
kfree(tbs);
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v2 net-next 07/14] rtnetlink: Protect struct rtnl_link_ops with SRCU.
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (5 preceding siblings ...)
2024-10-16 18:53 ` [PATCH v2 net-next 06/14] rtnetlink: Move ops->validate " Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-22 8:35 ` Paolo Abeni
2024-10-16 18:53 ` [PATCH v2 net-next 08/14] rtnetlink: Call rtnl_link_get_net_capable() in rtnl_newlink() Kuniyuki Iwashima
` (7 subsequent siblings)
14 siblings, 1 reply; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
Once RTNL is replaced with rtnl_net_lock(), we need a mechanism to
guarantee that rtnl_link_ops is alive during inflight RTM_NEWLINK
even when its module is being unloaded.
Let's use SRCU to protect ops.
rtnl_link_ops_get() now iterates link_ops under RCU and returns
SRCU-protected ops pointer. The caller must call rtnl_link_ops_put()
to release the pointer after the use.
Also, __rtnl_link_unregister() unlinks the ops first and calls
synchronize_srcu() to wait for inflight RTM_NEWLINK requests to
complete.
Note that link_ops needs to be protected by its dedicated lock
when RTNL is removed.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
v2:
* Handle error of init_srcu_struct().
* Call cleanup_srcu_struct() after synchronize_srcu().
---
include/net/rtnetlink.h | 5 ++-
net/core/rtnetlink.c | 83 ++++++++++++++++++++++++++++++-----------
2 files changed, 65 insertions(+), 23 deletions(-)
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index bb49c5708ce7..1a6aa5ca74f3 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -3,6 +3,7 @@
#define __NET_RTNETLINK_H
#include <linux/rtnetlink.h>
+#include <linux/srcu.h>
#include <net/netlink.h>
typedef int (*rtnl_doit_func)(struct sk_buff *, struct nlmsghdr *,
@@ -69,7 +70,8 @@ static inline int rtnl_msg_family(const struct nlmsghdr *nlh)
/**
* struct rtnl_link_ops - rtnetlink link operations
*
- * @list: Used internally
+ * @list: Used internally, protected by RTNL and SRCU
+ * @srcu: Used internally
* @kind: Identifier
* @netns_refund: Physical device, move to init_net on netns exit
* @maxtype: Highest device specific netlink attribute number
@@ -100,6 +102,7 @@ static inline int rtnl_msg_family(const struct nlmsghdr *nlh)
*/
struct rtnl_link_ops {
struct list_head list;
+ struct srcu_struct srcu;
const char *kind;
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 9c9290a6c271..31b105b3a834 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -457,15 +457,29 @@ EXPORT_SYMBOL_GPL(__rtnl_unregister_many);
static LIST_HEAD(link_ops);
-static const struct rtnl_link_ops *rtnl_link_ops_get(const char *kind)
+static struct rtnl_link_ops *rtnl_link_ops_get(const char *kind, int *srcu_index)
{
- const struct rtnl_link_ops *ops;
+ struct rtnl_link_ops *ops;
- list_for_each_entry(ops, &link_ops, list) {
- if (!strcmp(ops->kind, kind))
- return ops;
+ rcu_read_lock();
+
+ list_for_each_entry_rcu(ops, &link_ops, list) {
+ if (!strcmp(ops->kind, kind)) {
+ *srcu_index = srcu_read_lock(&ops->srcu);
+ goto unlock;
+ }
}
- return NULL;
+
+ ops = NULL;
+unlock:
+ rcu_read_unlock();
+
+ return ops;
+}
+
+static void rtnl_link_ops_put(struct rtnl_link_ops *ops, int srcu_index)
+{
+ srcu_read_unlock(&ops->srcu, srcu_index);
}
/**
@@ -480,8 +494,16 @@ static const struct rtnl_link_ops *rtnl_link_ops_get(const char *kind)
*/
int __rtnl_link_register(struct rtnl_link_ops *ops)
{
- if (rtnl_link_ops_get(ops->kind))
- return -EEXIST;
+ struct rtnl_link_ops *tmp;
+ int err;
+
+ /* When RTNL is removed, add lock for link_ops. */
+ ASSERT_RTNL();
+
+ list_for_each_entry(tmp, &link_ops, list) {
+ if (!strcmp(ops->kind, tmp->kind))
+ return -EEXIST;
+ }
/* The check for alloc/setup is here because if ops
* does not have that filled up, it is not possible
@@ -491,7 +513,12 @@ int __rtnl_link_register(struct rtnl_link_ops *ops)
if ((ops->alloc || ops->setup) && !ops->dellink)
ops->dellink = unregister_netdevice_queue;
- list_add_tail(&ops->list, &link_ops);
+ err = init_srcu_struct(&ops->srcu);
+ if (err)
+ return err;
+
+ list_add_tail_rcu(&ops->list, &link_ops);
+
return 0;
}
EXPORT_SYMBOL_GPL(__rtnl_link_register);
@@ -542,10 +569,12 @@ void __rtnl_link_unregister(struct rtnl_link_ops *ops)
{
struct net *net;
- for_each_net(net) {
+ list_del_rcu(&ops->list);
+ synchronize_srcu(&ops->srcu);
+ cleanup_srcu_struct(&ops->srcu);
+
+ for_each_net(net)
__rtnl_kill_links(net, ops);
- }
- list_del(&ops->list);
}
EXPORT_SYMBOL_GPL(__rtnl_link_unregister);
@@ -2158,10 +2187,11 @@ static const struct nla_policy ifla_xdp_policy[IFLA_XDP_MAX + 1] = {
[IFLA_XDP_PROG_ID] = { .type = NLA_U32 },
};
-static const struct rtnl_link_ops *linkinfo_to_kind_ops(const struct nlattr *nla)
+static struct rtnl_link_ops *linkinfo_to_kind_ops(const struct nlattr *nla,
+ int *ops_srcu_index)
{
- const struct rtnl_link_ops *ops = NULL;
struct nlattr *linfo[IFLA_INFO_MAX + 1];
+ struct rtnl_link_ops *ops = NULL;
if (nla_parse_nested_deprecated(linfo, IFLA_INFO_MAX, nla, ifla_info_policy, NULL) < 0)
return NULL;
@@ -2170,7 +2200,7 @@ static const struct rtnl_link_ops *linkinfo_to_kind_ops(const struct nlattr *nla
char kind[MODULE_NAME_LEN];
nla_strscpy(kind, linfo[IFLA_INFO_KIND], sizeof(kind));
- ops = rtnl_link_ops_get(kind);
+ ops = rtnl_link_ops_get(kind, ops_srcu_index);
}
return ops;
@@ -2290,8 +2320,8 @@ static int rtnl_valid_dump_ifinfo_req(const struct nlmsghdr *nlh,
static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
{
- const struct rtnl_link_ops *kind_ops = NULL;
struct netlink_ext_ack *extack = cb->extack;
+ struct rtnl_link_ops *kind_ops = NULL;
const struct nlmsghdr *nlh = cb->nlh;
struct net *net = sock_net(skb->sk);
unsigned int flags = NLM_F_MULTI;
@@ -2302,6 +2332,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
struct net *tgt_net = net;
u32 ext_filter_mask = 0;
struct net_device *dev;
+ int ops_srcu_index;
int master_idx = 0;
int netnsid = -1;
int err, i;
@@ -2335,7 +2366,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
master_idx = nla_get_u32(tb[i]);
break;
case IFLA_LINKINFO:
- kind_ops = linkinfo_to_kind_ops(tb[i]);
+ kind_ops = linkinfo_to_kind_ops(tb[i], &ops_srcu_index);
break;
default:
if (cb->strict_check) {
@@ -2361,6 +2392,10 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
if (err < 0)
break;
}
+
+ if (kind_ops)
+ rtnl_link_ops_put(kind_ops, ops_srcu_index);
+
cb->seq = tgt_net->dev_base_seq;
nl_dump_check_consistent(cb, nlmsg_hdr(skb));
if (netnsid >= 0)
@@ -3747,8 +3782,9 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct nlattr **tb, **linkinfo, **data = NULL;
- const struct rtnl_link_ops *ops = NULL;
+ struct rtnl_link_ops *ops = NULL;
struct rtnl_newlink_tbs *tbs;
+ int ops_srcu_index;
int ret;
tbs = kmalloc(sizeof(*tbs), GFP_KERNEL);
@@ -3780,13 +3816,13 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
char kind[MODULE_NAME_LEN];
nla_strscpy(kind, linkinfo[IFLA_INFO_KIND], sizeof(kind));
- ops = rtnl_link_ops_get(kind);
+ ops = rtnl_link_ops_get(kind, &ops_srcu_index);
#ifdef CONFIG_MODULES
if (!ops) {
__rtnl_unlock();
request_module("rtnl-link-%s", kind);
rtnl_lock();
- ops = rtnl_link_ops_get(kind);
+ ops = rtnl_link_ops_get(kind, &ops_srcu_index);
}
#endif
}
@@ -3800,7 +3836,7 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
linkinfo[IFLA_INFO_DATA],
ops->policy, extack);
if (ret < 0)
- goto free;
+ goto put_ops;
data = tbs->attr;
}
@@ -3808,12 +3844,15 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (ops->validate) {
ret = ops->validate(tb, data, extack);
if (ret < 0)
- goto free;
+ goto put_ops;
}
}
ret = __rtnl_newlink(skb, nlh, ops, tbs, data, extack);
+put_ops:
+ if (ops)
+ rtnl_link_ops_put(ops, ops_srcu_index);
free:
kfree(tbs);
return ret;
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* Re: [PATCH v2 net-next 07/14] rtnetlink: Protect struct rtnl_link_ops with SRCU.
2024-10-16 18:53 ` [PATCH v2 net-next 07/14] rtnetlink: Protect struct rtnl_link_ops with SRCU Kuniyuki Iwashima
@ 2024-10-22 8:35 ` Paolo Abeni
2024-10-22 8:42 ` Paolo Abeni
0 siblings, 1 reply; 24+ messages in thread
From: Paolo Abeni @ 2024-10-22 8:35 UTC (permalink / raw)
To: Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski
Cc: Kuniyuki Iwashima, netdev
Hi,
On 10/16/24 20:53, Kuniyuki Iwashima wrote:
> @@ -457,15 +457,29 @@ EXPORT_SYMBOL_GPL(__rtnl_unregister_many);
>
> static LIST_HEAD(link_ops);
>
> -static const struct rtnl_link_ops *rtnl_link_ops_get(const char *kind)
> +static struct rtnl_link_ops *rtnl_link_ops_get(const char *kind, int *srcu_index)
> {
This lacks an 'acquire' annotation to make sparse happy. Similar thing
for the _put() helper. Also if let such helper to cope with NULL ops,
some dups checks could be avoided in later code.
Since the netdev backlog is huge, I don't consider this blocking, but
please follow-up ASAP,
thanks!
Paolo
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [PATCH v2 net-next 07/14] rtnetlink: Protect struct rtnl_link_ops with SRCU.
2024-10-22 8:35 ` Paolo Abeni
@ 2024-10-22 8:42 ` Paolo Abeni
0 siblings, 0 replies; 24+ messages in thread
From: Paolo Abeni @ 2024-10-22 8:42 UTC (permalink / raw)
To: Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski
Cc: Kuniyuki Iwashima, netdev
On 10/22/24 10:35, Paolo Abeni wrote:
> On 10/16/24 20:53, Kuniyuki Iwashima wrote:
>> @@ -457,15 +457,29 @@ EXPORT_SYMBOL_GPL(__rtnl_unregister_many);
>>
>> static LIST_HEAD(link_ops);
>>
>> -static const struct rtnl_link_ops *rtnl_link_ops_get(const char *kind)
>> +static struct rtnl_link_ops *rtnl_link_ops_get(const char *kind, int *srcu_index)
>> {
>
> This lacks an 'acquire' annotation to make sparse happy. Similar thing
> for the _put() helper. Also if let such helper to cope with NULL ops,
> some dups checks could be avoided in later code.
>
> Since the netdev backlog is huge, I don't consider this blocking, but
> please follow-up ASAP,
Just ignore the above comment. Unfortunately we can't attach the
relevant annotation to rtnl_link_ops_get().
/P
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v2 net-next 08/14] rtnetlink: Call rtnl_link_get_net_capable() in rtnl_newlink().
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (6 preceding siblings ...)
2024-10-16 18:53 ` [PATCH v2 net-next 07/14] rtnetlink: Protect struct rtnl_link_ops with SRCU Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 09/14] rtnetlink: Fetch IFLA_LINK_NETNSID " Kuniyuki Iwashima
` (6 subsequent siblings)
14 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
As a prerequisite of per-netns RTNL, we must fetch netns before
looking up dev or moving it to another netns.
rtnl_link_get_net_capable() is called in rtnl_newlink_create() and
do_setlink(), but both of them need to be moved to the RTNL-independent
region, which will be rtnl_newlink().
Let's call rtnl_link_get_net_capable() in rtnl_newlink() and pass the
netns down to where needed.
Note that the latter two have not passed the nets to do_setlink() yet
but will do so after the remaining rtnl_link_get_net_capable() is moved
to rtnl_setlink() later.
While at it, dest_net is renamed to tgt_net in rtnl_newlink_create() to
align with rtnl_{del,set}link().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/core/rtnetlink.c | 51 ++++++++++++++++++++++++--------------------
1 file changed, 28 insertions(+), 23 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 31b105b3a834..f6823c8d21ad 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3549,7 +3549,7 @@ struct rtnl_newlink_tbs {
static int rtnl_changelink(const struct sk_buff *skb, struct nlmsghdr *nlh,
const struct rtnl_link_ops *ops,
- struct net_device *dev,
+ struct net_device *dev, struct net *tgt_net,
struct rtnl_newlink_tbs *tbs,
struct nlattr **data,
struct netlink_ext_ack *extack)
@@ -3613,10 +3613,10 @@ static int rtnl_changelink(const struct sk_buff *skb, struct nlmsghdr *nlh,
}
static int rtnl_group_changelink(const struct sk_buff *skb,
- struct net *net, int group,
- struct ifinfomsg *ifm,
- struct netlink_ext_ack *extack,
- struct nlattr **tb)
+ struct net *net, struct net *tgt_net,
+ int group, struct ifinfomsg *ifm,
+ struct netlink_ext_ack *extack,
+ struct nlattr **tb)
{
struct net_device *dev, *aux;
int err;
@@ -3634,6 +3634,7 @@ static int rtnl_group_changelink(const struct sk_buff *skb,
static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
const struct rtnl_link_ops *ops,
+ struct net *tgt_net,
const struct nlmsghdr *nlh,
struct nlattr **tb, struct nlattr **data,
struct netlink_ext_ack *extack)
@@ -3641,9 +3642,9 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
unsigned char name_assign_type = NET_NAME_USER;
struct net *net = sock_net(skb->sk);
u32 portid = NETLINK_CB(skb).portid;
- struct net *dest_net, *link_net;
struct net_device *dev;
char ifname[IFNAMSIZ];
+ struct net *link_net;
int err;
if (!ops->alloc && !ops->setup)
@@ -3656,14 +3657,10 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
name_assign_type = NET_NAME_ENUM;
}
- dest_net = rtnl_link_get_net_capable(skb, net, tb, CAP_NET_ADMIN);
- if (IS_ERR(dest_net))
- return PTR_ERR(dest_net);
-
if (tb[IFLA_LINK_NETNSID]) {
int id = nla_get_s32(tb[IFLA_LINK_NETNSID]);
- link_net = get_net_ns_by_id(dest_net, id);
+ link_net = get_net_ns_by_id(tgt_net, id);
if (!link_net) {
NL_SET_ERR_MSG(extack, "Unknown network namespace id");
err = -EINVAL;
@@ -3676,7 +3673,7 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
link_net = NULL;
}
- dev = rtnl_create_link(link_net ? : dest_net, ifname,
+ dev = rtnl_create_link(link_net ? : tgt_net, ifname,
name_assign_type, ops, tb, extack);
if (IS_ERR(dev)) {
err = PTR_ERR(dev);
@@ -3698,7 +3695,7 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
if (err < 0)
goto out_unregister;
if (link_net) {
- err = dev_change_net_namespace(dev, dest_net, ifname);
+ err = dev_change_net_namespace(dev, tgt_net, ifname);
if (err < 0)
goto out_unregister;
}
@@ -3710,7 +3707,7 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
out:
if (link_net)
put_net(link_net);
- put_net(dest_net);
+
return err;
out_unregister:
if (ops->newlink) {
@@ -3726,6 +3723,7 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
const struct rtnl_link_ops *ops,
+ struct net *tgt_net,
struct rtnl_newlink_tbs *tbs,
struct nlattr **data,
struct netlink_ext_ack *extack)
@@ -3752,19 +3750,18 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
}
if (dev)
- return rtnl_changelink(skb, nlh, ops, dev, tbs, data, extack);
+ return rtnl_changelink(skb, nlh, ops, dev, tgt_net, tbs, data, extack);
if (!(nlh->nlmsg_flags & NLM_F_CREATE)) {
/* No dev found and NLM_F_CREATE not set. Requested dev does not exist,
* or it's for a group
*/
- if (link_specified)
+ if (link_specified || !tb[IFLA_GROUP])
return -ENODEV;
- if (tb[IFLA_GROUP])
- return rtnl_group_changelink(skb, net,
- nla_get_u32(tb[IFLA_GROUP]),
- ifm, extack, tb);
- return -ENODEV;
+
+ return rtnl_group_changelink(skb, net, tgt_net,
+ nla_get_u32(tb[IFLA_GROUP]),
+ ifm, extack, tb);
}
if (tb[IFLA_MAP] || tb[IFLA_PROTINFO])
@@ -3775,7 +3772,7 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EOPNOTSUPP;
}
- return rtnl_newlink_create(skb, ifm, ops, nlh, tb, data, extack);
+ return rtnl_newlink_create(skb, ifm, ops, tgt_net, nlh, tb, data, extack);
}
static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
@@ -3784,6 +3781,7 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct nlattr **tb, **linkinfo, **data = NULL;
struct rtnl_link_ops *ops = NULL;
struct rtnl_newlink_tbs *tbs;
+ struct net *tgt_net;
int ops_srcu_index;
int ret;
@@ -3848,8 +3846,15 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
}
}
- ret = __rtnl_newlink(skb, nlh, ops, tbs, data, extack);
+ tgt_net = rtnl_link_get_net_capable(skb, sock_net(skb->sk), tb, CAP_NET_ADMIN);
+ if (IS_ERR(tgt_net)) {
+ ret = PTR_ERR(tgt_net);
+ goto put_ops;
+ }
+
+ ret = __rtnl_newlink(skb, nlh, ops, tgt_net, tbs, data, extack);
+ put_net(tgt_net);
put_ops:
if (ops)
rtnl_link_ops_put(ops, ops_srcu_index);
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v2 net-next 09/14] rtnetlink: Fetch IFLA_LINK_NETNSID in rtnl_newlink().
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (7 preceding siblings ...)
2024-10-16 18:53 ` [PATCH v2 net-next 08/14] rtnetlink: Call rtnl_link_get_net_capable() in rtnl_newlink() Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 10/14] rtnetlink: Clean up rtnl_dellink() Kuniyuki Iwashima
` (5 subsequent siblings)
14 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
Another netns option for RTM_NEWLINK is IFLA_LINK_NETNSID and
is fetched in rtnl_newlink_create().
This must be done before holding rtnl_net_lock().
Let's move IFLA_LINK_NETNSID processing to rtnl_newlink().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/core/rtnetlink.c | 49 ++++++++++++++++++++++----------------------
1 file changed, 24 insertions(+), 25 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index f6823c8d21ad..eee0f820ddf6 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3634,7 +3634,7 @@ static int rtnl_group_changelink(const struct sk_buff *skb,
static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
const struct rtnl_link_ops *ops,
- struct net *tgt_net,
+ struct net *tgt_net, struct net *link_net,
const struct nlmsghdr *nlh,
struct nlattr **tb, struct nlattr **data,
struct netlink_ext_ack *extack)
@@ -3644,7 +3644,6 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
u32 portid = NETLINK_CB(skb).portid;
struct net_device *dev;
char ifname[IFNAMSIZ];
- struct net *link_net;
int err;
if (!ops->alloc && !ops->setup)
@@ -3657,22 +3656,6 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
name_assign_type = NET_NAME_ENUM;
}
- if (tb[IFLA_LINK_NETNSID]) {
- int id = nla_get_s32(tb[IFLA_LINK_NETNSID]);
-
- link_net = get_net_ns_by_id(tgt_net, id);
- if (!link_net) {
- NL_SET_ERR_MSG(extack, "Unknown network namespace id");
- err = -EINVAL;
- goto out;
- }
- err = -EPERM;
- if (!netlink_ns_capable(skb, link_net->user_ns, CAP_NET_ADMIN))
- goto out;
- } else {
- link_net = NULL;
- }
-
dev = rtnl_create_link(link_net ? : tgt_net, ifname,
name_assign_type, ops, tb, extack);
if (IS_ERR(dev)) {
@@ -3705,9 +3688,6 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
goto out_unregister;
}
out:
- if (link_net)
- put_net(link_net);
-
return err;
out_unregister:
if (ops->newlink) {
@@ -3723,7 +3703,7 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
const struct rtnl_link_ops *ops,
- struct net *tgt_net,
+ struct net *tgt_net, struct net *link_net,
struct rtnl_newlink_tbs *tbs,
struct nlattr **data,
struct netlink_ext_ack *extack)
@@ -3772,16 +3752,16 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EOPNOTSUPP;
}
- return rtnl_newlink_create(skb, ifm, ops, tgt_net, nlh, tb, data, extack);
+ return rtnl_newlink_create(skb, ifm, ops, tgt_net, link_net, nlh, tb, data, extack);
}
static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct nlattr **tb, **linkinfo, **data = NULL;
+ struct net *tgt_net, *link_net = NULL;
struct rtnl_link_ops *ops = NULL;
struct rtnl_newlink_tbs *tbs;
- struct net *tgt_net;
int ops_srcu_index;
int ret;
@@ -3852,8 +3832,27 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
goto put_ops;
}
- ret = __rtnl_newlink(skb, nlh, ops, tgt_net, tbs, data, extack);
+ if (tb[IFLA_LINK_NETNSID]) {
+ int id = nla_get_s32(tb[IFLA_LINK_NETNSID]);
+
+ link_net = get_net_ns_by_id(tgt_net, id);
+ if (!link_net) {
+ NL_SET_ERR_MSG(extack, "Unknown network namespace id");
+ ret = -EINVAL;
+ goto put_net;
+ }
+
+ if (!netlink_ns_capable(skb, link_net->user_ns, CAP_NET_ADMIN)) {
+ ret = -EPERM;
+ goto put_net;
+ }
+ }
+
+ ret = __rtnl_newlink(skb, nlh, ops, tgt_net, link_net, tbs, data, extack);
+put_net:
+ if (link_net)
+ put_net(link_net);
put_net(tgt_net);
put_ops:
if (ops)
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v2 net-next 10/14] rtnetlink: Clean up rtnl_dellink().
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (8 preceding siblings ...)
2024-10-16 18:53 ` [PATCH v2 net-next 09/14] rtnetlink: Fetch IFLA_LINK_NETNSID " Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 11/14] rtnetlink: Clean up rtnl_setlink() Kuniyuki Iwashima
` (4 subsequent siblings)
14 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We will push RTNL down to rtnl_delink().
Let's unify the error path to make it easy to place rtnl_net_lock().
While at it, keep the variables in reverse xmas order.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/core/rtnetlink.c | 27 ++++++++++-----------------
1 file changed, 10 insertions(+), 17 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index eee0f820ddf6..a19b2eb2727e 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3368,14 +3368,14 @@ EXPORT_SYMBOL_GPL(rtnl_delete_link);
static int rtnl_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
+ struct ifinfomsg *ifm = nlmsg_data(nlh);
struct net *net = sock_net(skb->sk);
u32 portid = NETLINK_CB(skb).portid;
- struct net *tgt_net = net;
- struct net_device *dev = NULL;
- struct ifinfomsg *ifm;
struct nlattr *tb[IFLA_MAX+1];
- int err;
+ struct net_device *dev = NULL;
+ struct net *tgt_net = net;
int netnsid = -1;
+ int err;
err = nlmsg_parse_deprecated(nlh, sizeof(*ifm), tb, IFLA_MAX,
ifla_policy, extack);
@@ -3393,27 +3393,20 @@ static int rtnl_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
return PTR_ERR(tgt_net);
}
- err = -EINVAL;
- ifm = nlmsg_data(nlh);
if (ifm->ifi_index > 0)
dev = __dev_get_by_index(tgt_net, ifm->ifi_index);
else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME])
dev = rtnl_dev_get(tgt_net, tb);
+
+ if (dev)
+ err = rtnl_delete_link(dev, portid, nlh);
+ else if (ifm->ifi_index > 0 || tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME])
+ err = -ENODEV;
else if (tb[IFLA_GROUP])
err = rtnl_group_dellink(tgt_net, nla_get_u32(tb[IFLA_GROUP]));
else
- goto out;
-
- if (!dev) {
- if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME] || ifm->ifi_index > 0)
- err = -ENODEV;
-
- goto out;
- }
-
- err = rtnl_delete_link(dev, portid, nlh);
+ err = -EINVAL;
-out:
if (netnsid >= 0)
put_net(tgt_net);
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v2 net-next 11/14] rtnetlink: Clean up rtnl_setlink().
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (9 preceding siblings ...)
2024-10-16 18:53 ` [PATCH v2 net-next 10/14] rtnetlink: Clean up rtnl_dellink() Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 12/14] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink() Kuniyuki Iwashima
` (3 subsequent siblings)
14 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We will push RTNL down to rtnl_setlink().
Let's unify the error path to make it easy to place rtnl_net_lock().
While at it, keep the variables in reverse xmas order.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/core/rtnetlink.c | 17 +++++++----------
1 file changed, 7 insertions(+), 10 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a19b2eb2727e..f89a902458d6 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3279,11 +3279,11 @@ static struct net_device *rtnl_dev_get(struct net *net,
static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
+ struct ifinfomsg *ifm = nlmsg_data(nlh);
struct net *net = sock_net(skb->sk);
- struct ifinfomsg *ifm;
- struct net_device *dev;
- int err;
struct nlattr *tb[IFLA_MAX+1];
+ struct net_device *dev = NULL;
+ int err;
err = nlmsg_parse_deprecated(nlh, sizeof(*ifm), tb, IFLA_MAX,
ifla_policy, extack);
@@ -3294,21 +3294,18 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (err < 0)
goto errout;
- err = -EINVAL;
- ifm = nlmsg_data(nlh);
if (ifm->ifi_index > 0)
dev = __dev_get_by_index(net, ifm->ifi_index);
else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME])
dev = rtnl_dev_get(net, tb);
else
- goto errout;
+ err = -EINVAL;
- if (dev == NULL) {
+ if (dev)
+ err = do_setlink(skb, dev, ifm, extack, tb, 0);
+ else if (!err)
err = -ENODEV;
- goto errout;
- }
- err = do_setlink(skb, dev, ifm, extack, tb, 0);
errout:
return err;
}
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v2 net-next 12/14] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink().
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (10 preceding siblings ...)
2024-10-16 18:53 ` [PATCH v2 net-next 11/14] rtnetlink: Clean up rtnl_setlink() Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-16 18:53 ` [PATCH v2 net-next 13/14] rtnetlink: Return int from rtnl_af_register() Kuniyuki Iwashima
` (2 subsequent siblings)
14 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We will push RTNL down to rtnl_setlink().
RTM_SETLINK could call rtnl_link_get_net_capable() in do_setlink()
to move a dev to a new netns, but the netns needs to be fetched before
holding rtnl_net_lock().
Let's move it to rtnl_setlink() and pass the netns to do_setlink().
Now, RTM_NEWLINK paths (rtnl_changelink() and rtnl_group_changelink())
can pass the prefetched netns to do_setlink().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
v2:
* Move put_net() above the errorout label in rtnl_setlink()
---
net/core/rtnetlink.c | 32 ++++++++++++++++----------------
1 file changed, 16 insertions(+), 16 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index f89a902458d6..445e6ffed75e 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2881,8 +2881,8 @@ static int do_set_proto_down(struct net_device *dev,
#define DO_SETLINK_MODIFIED 0x01
/* notify flag means notify + modified. */
#define DO_SETLINK_NOTIFY 0x03
-static int do_setlink(const struct sk_buff *skb,
- struct net_device *dev, struct ifinfomsg *ifm,
+static int do_setlink(const struct sk_buff *skb, struct net_device *dev,
+ struct net *tgt_net, struct ifinfomsg *ifm,
struct netlink_ext_ack *extack,
struct nlattr **tb, int status)
{
@@ -2899,27 +2899,19 @@ static int do_setlink(const struct sk_buff *skb,
else
ifname[0] = '\0';
- if (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD] || tb[IFLA_TARGET_NETNSID]) {
+ if (!net_eq(tgt_net, dev_net(dev))) {
const char *pat = ifname[0] ? ifname : NULL;
- struct net *net;
int new_ifindex;
- net = rtnl_link_get_net_capable(skb, dev_net(dev),
- tb, CAP_NET_ADMIN);
- if (IS_ERR(net)) {
- err = PTR_ERR(net);
- goto errout;
- }
-
if (tb[IFLA_NEW_IFINDEX])
new_ifindex = nla_get_s32(tb[IFLA_NEW_IFINDEX]);
else
new_ifindex = 0;
- err = __dev_change_net_namespace(dev, net, pat, new_ifindex);
- put_net(net);
+ err = __dev_change_net_namespace(dev, tgt_net, pat, new_ifindex);
if (err)
goto errout;
+
status |= DO_SETLINK_MODIFIED;
}
@@ -3283,6 +3275,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct net *net = sock_net(skb->sk);
struct nlattr *tb[IFLA_MAX+1];
struct net_device *dev = NULL;
+ struct net *tgt_net;
int err;
err = nlmsg_parse_deprecated(nlh, sizeof(*ifm), tb, IFLA_MAX,
@@ -3294,6 +3287,12 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (err < 0)
goto errout;
+ tgt_net = rtnl_link_get_net_capable(skb, net, tb, CAP_NET_ADMIN);
+ if (IS_ERR(tgt_net)) {
+ err = PTR_ERR(tgt_net);
+ goto errout;
+ }
+
if (ifm->ifi_index > 0)
dev = __dev_get_by_index(net, ifm->ifi_index);
else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME])
@@ -3302,10 +3301,11 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
err = -EINVAL;
if (dev)
- err = do_setlink(skb, dev, ifm, extack, tb, 0);
+ err = do_setlink(skb, dev, tgt_net, ifm, extack, tb, 0);
else if (!err)
err = -ENODEV;
+ put_net(tgt_net);
errout:
return err;
}
@@ -3599,7 +3599,7 @@ static int rtnl_changelink(const struct sk_buff *skb, struct nlmsghdr *nlh,
status |= DO_SETLINK_NOTIFY;
}
- return do_setlink(skb, dev, nlmsg_data(nlh), extack, tb, status);
+ return do_setlink(skb, dev, tgt_net, nlmsg_data(nlh), extack, tb, status);
}
static int rtnl_group_changelink(const struct sk_buff *skb,
@@ -3613,7 +3613,7 @@ static int rtnl_group_changelink(const struct sk_buff *skb,
for_each_netdev_safe(net, dev, aux) {
if (dev->group == group) {
- err = do_setlink(skb, dev, ifm, extack, tb, 0);
+ err = do_setlink(skb, dev, tgt_net, ifm, extack, tb, 0);
if (err < 0)
return err;
}
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v2 net-next 13/14] rtnetlink: Return int from rtnl_af_register().
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (11 preceding siblings ...)
2024-10-16 18:53 ` [PATCH v2 net-next 12/14] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink() Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-16 19:09 ` David Ahern
` (2 more replies)
2024-10-16 18:53 ` [PATCH v2 net-next 14/14] rtnetlink: Protect struct rtnl_af_ops with SRCU Kuniyuki Iwashima
2024-10-22 9:10 ` [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL patchwork-bot+netdevbpf
14 siblings, 3 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, Roopa Prabhu,
Nikolay Aleksandrov, David Ahern, Jeremy Kerr, Matt Johnston
The next patch will add init_srcu_struct() in rtnl_af_register(),
then we need to handle its error.
Let's add the error handling in advance to make the following
patch cleaner.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Jeremy Kerr <jk@codeconstruct.com.au>
Cc: Matt Johnston <matt@codeconstruct.com.au>
---
include/net/rtnetlink.h | 2 +-
net/bridge/br_netlink.c | 6 +++++-
net/core/rtnetlink.c | 4 +++-
net/ipv4/devinet.c | 3 ++-
net/ipv6/addrconf.c | 5 ++++-
net/mctp/device.c | 16 +++++++++++-----
net/mpls/af_mpls.c | 5 ++++-
7 files changed, 30 insertions(+), 11 deletions(-)
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index 1a6aa5ca74f3..969138ae2f4b 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -204,7 +204,7 @@ struct rtnl_af_ops {
size_t (*get_stats_af_size)(const struct net_device *dev);
};
-void rtnl_af_register(struct rtnl_af_ops *ops);
+int rtnl_af_register(struct rtnl_af_ops *ops);
void rtnl_af_unregister(struct rtnl_af_ops *ops);
struct net *rtnl_link_get_net(struct net *src_net, struct nlattr *tb[]);
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 6b97ae47f855..3e0f47203f2a 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1924,7 +1924,9 @@ int __init br_netlink_init(void)
if (err)
goto out;
- rtnl_af_register(&br_af_ops);
+ err = rtnl_af_register(&br_af_ops);
+ if (err)
+ goto out_vlan;
err = rtnl_link_register(&br_link_ops);
if (err)
@@ -1934,6 +1936,8 @@ int __init br_netlink_init(void)
out_af:
rtnl_af_unregister(&br_af_ops);
+out_vlan:
+ br_vlan_rtnl_uninit();
out:
return err;
}
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 445e6ffed75e..70b663aca209 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -686,11 +686,13 @@ static const struct rtnl_af_ops *rtnl_af_lookup(const int family)
*
* Returns 0 on success or a negative error code.
*/
-void rtnl_af_register(struct rtnl_af_ops *ops)
+int rtnl_af_register(struct rtnl_af_ops *ops)
{
rtnl_lock();
list_add_tail_rcu(&ops->list, &rtnl_af_ops);
rtnl_unlock();
+
+ return 0;
}
EXPORT_SYMBOL_GPL(rtnl_af_register);
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index d81fff93d208..2152d8cfa2dc 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -2812,7 +2812,8 @@ void __init devinet_init(void)
register_pernet_subsys(&devinet_ops);
register_netdevice_notifier(&ip_netdev_notifier);
- rtnl_af_register(&inet_af_ops);
+ if (rtnl_af_register(&inet_af_ops))
+ panic("Unable to register inet_af_ops\n");
rtnl_register_many(devinet_rtnl_msg_handlers);
}
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index ac8645ad2537..d0a99710d65d 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -7468,7 +7468,9 @@ int __init addrconf_init(void)
addrconf_verify(&init_net);
- rtnl_af_register(&inet6_ops);
+ err = rtnl_af_register(&inet6_ops);
+ if (err)
+ goto erraf;
err = rtnl_register_many(addrconf_rtnl_msg_handlers);
if (err)
@@ -7482,6 +7484,7 @@ int __init addrconf_init(void)
errout:
rtnl_unregister_all(PF_INET6);
rtnl_af_unregister(&inet6_ops);
+erraf:
unregister_netdevice_notifier(&ipv6_dev_notf);
errlo:
destroy_workqueue(addrconf_wq);
diff --git a/net/mctp/device.c b/net/mctp/device.c
index 85cc5f31f1e7..3d75b919995d 100644
--- a/net/mctp/device.c
+++ b/net/mctp/device.c
@@ -535,14 +535,20 @@ int __init mctp_device_init(void)
int err;
register_netdevice_notifier(&mctp_dev_nb);
- rtnl_af_register(&mctp_af_ops);
+
+ err = rtnl_af_register(&mctp_af_ops);
+ if (err)
+ goto err_notifier;
err = rtnl_register_many(mctp_device_rtnl_msg_handlers);
- if (err) {
- rtnl_af_unregister(&mctp_af_ops);
- unregister_netdevice_notifier(&mctp_dev_nb);
- }
+ if (err)
+ goto err_af;
+ return 0;
+err_af:
+ rtnl_af_unregister(&mctp_af_ops);
+err_notifier:
+ unregister_netdevice_notifier(&mctp_dev_nb);
return err;
}
diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index a0573847bc55..1f63b32d76d6 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -2753,7 +2753,9 @@ static int __init mpls_init(void)
dev_add_pack(&mpls_packet_type);
- rtnl_af_register(&mpls_af_ops);
+ err = rtnl_af_register(&mpls_af_ops);
+ if (err)
+ goto out_unregister_dev_type;
err = rtnl_register_many(mpls_rtnl_msg_handlers);
if (err)
@@ -2773,6 +2775,7 @@ static int __init mpls_init(void)
rtnl_unregister_many(mpls_rtnl_msg_handlers);
out_unregister_rtnl_af:
rtnl_af_unregister(&mpls_af_ops);
+out_unregister_dev_type:
dev_remove_pack(&mpls_packet_type);
out_unregister_pernet:
unregister_pernet_subsys(&mpls_net_ops);
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* Re: [PATCH v2 net-next 13/14] rtnetlink: Return int from rtnl_af_register().
2024-10-16 18:53 ` [PATCH v2 net-next 13/14] rtnetlink: Return int from rtnl_af_register() Kuniyuki Iwashima
@ 2024-10-16 19:09 ` David Ahern
2024-10-16 23:11 ` Kuniyuki Iwashima
2024-10-18 4:10 ` Matt Johnston
2024-10-22 8:53 ` Paolo Abeni
2 siblings, 1 reply; 24+ messages in thread
From: David Ahern @ 2024-10-16 19:09 UTC (permalink / raw)
To: Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: Kuniyuki Iwashima, netdev, Roopa Prabhu, Nikolay Aleksandrov,
Jeremy Kerr, Matt Johnston
On 10/16/24 12:53 PM, Kuniyuki Iwashima wrote:
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index d81fff93d208..2152d8cfa2dc 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -2812,7 +2812,8 @@ void __init devinet_init(void)
> register_pernet_subsys(&devinet_ops);
> register_netdevice_notifier(&ip_netdev_notifier);
>
> - rtnl_af_register(&inet_af_ops);
> + if (rtnl_af_register(&inet_af_ops))
> + panic("Unable to register inet_af_ops\n");
why panic for IPv4 AF?
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [PATCH v2 net-next 13/14] rtnetlink: Return int from rtnl_af_register().
2024-10-16 19:09 ` David Ahern
@ 2024-10-16 23:11 ` Kuniyuki Iwashima
0 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 23:11 UTC (permalink / raw)
To: dsahern
Cc: davem, edumazet, jk, kuba, kuni1840, kuniyu, matt, netdev, pabeni,
razor, roopa
From: David Ahern <dsahern@kernel.org>
Date: Wed, 16 Oct 2024 13:09:31 -0600
> On 10/16/24 12:53 PM, Kuniyuki Iwashima wrote:
> > diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> > index d81fff93d208..2152d8cfa2dc 100644
> > --- a/net/ipv4/devinet.c
> > +++ b/net/ipv4/devinet.c
> > @@ -2812,7 +2812,8 @@ void __init devinet_init(void)
> > register_pernet_subsys(&devinet_ops);
> > register_netdevice_notifier(&ip_netdev_notifier);
> >
> > - rtnl_af_register(&inet_af_ops);
> > + if (rtnl_af_register(&inet_af_ops))
> > + panic("Unable to register inet_af_ops\n");
>
> why panic for IPv4 AF?
It's unlikely to fail to allocate memory for builtin during boot,
and I recently changed rtnl_register() to panic too, instead of
ignoring the errors.
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=09aec57d8379
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 net-next 13/14] rtnetlink: Return int from rtnl_af_register().
2024-10-16 18:53 ` [PATCH v2 net-next 13/14] rtnetlink: Return int from rtnl_af_register() Kuniyuki Iwashima
2024-10-16 19:09 ` David Ahern
@ 2024-10-18 4:10 ` Matt Johnston
2024-10-22 8:53 ` Paolo Abeni
2 siblings, 0 replies; 24+ messages in thread
From: Matt Johnston @ 2024-10-18 4:10 UTC (permalink / raw)
To: Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: Kuniyuki Iwashima, netdev, Roopa Prabhu, Nikolay Aleksandrov,
David Ahern, Jeremy Kerr
On Wed, 2024-10-16 at 11:53 -0700, Kuniyuki Iwashima wrote:
> The next patch will add init_srcu_struct() in rtnl_af_register(),
> then we need to handle its error.
>
> Let's add the error handling in advance to make the following
> patch cleaner.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
The error cleanup changes look correct to me.
Reviewed-by: Matt Johnston <matt@codeconstruct.com.au>
> ---
> Cc: Roopa Prabhu <roopa@nvidia.com>
> Cc: Nikolay Aleksandrov <razor@blackwall.org>
> Cc: David Ahern <dsahern@kernel.org>
> Cc: Jeremy Kerr <jk@codeconstruct.com.au>
> Cc: Matt Johnston <matt@codeconstruct.com.au>
> ---
> include/net/rtnetlink.h | 2 +-
> net/bridge/br_netlink.c | 6 +++++-
> net/core/rtnetlink.c | 4 +++-
> net/ipv4/devinet.c | 3 ++-
> net/ipv6/addrconf.c | 5 ++++-
> net/mctp/device.c | 16 +++++++++++-----
> net/mpls/af_mpls.c | 5 ++++-
> 7 files changed, 30 insertions(+), 11 deletions(-)
>
> diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
> index 1a6aa5ca74f3..969138ae2f4b 100644
> --- a/include/net/rtnetlink.h
> +++ b/include/net/rtnetlink.h
> @@ -204,7 +204,7 @@ struct rtnl_af_ops {
> size_t (*get_stats_af_size)(const struct net_device *dev);
> };
>
> -void rtnl_af_register(struct rtnl_af_ops *ops);
> +int rtnl_af_register(struct rtnl_af_ops *ops);
> void rtnl_af_unregister(struct rtnl_af_ops *ops);
>
> struct net *rtnl_link_get_net(struct net *src_net, struct nlattr *tb[]);
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 6b97ae47f855..3e0f47203f2a 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -1924,7 +1924,9 @@ int __init br_netlink_init(void)
> if (err)
> goto out;
>
> - rtnl_af_register(&br_af_ops);
> + err = rtnl_af_register(&br_af_ops);
> + if (err)
> + goto out_vlan;
>
> err = rtnl_link_register(&br_link_ops);
> if (err)
> @@ -1934,6 +1936,8 @@ int __init br_netlink_init(void)
>
> out_af:
> rtnl_af_unregister(&br_af_ops);
> +out_vlan:
> + br_vlan_rtnl_uninit();
> out:
> return err;
> }
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 445e6ffed75e..70b663aca209 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -686,11 +686,13 @@ static const struct rtnl_af_ops *rtnl_af_lookup(const int family)
> *
> * Returns 0 on success or a negative error code.
> */
> -void rtnl_af_register(struct rtnl_af_ops *ops)
> +int rtnl_af_register(struct rtnl_af_ops *ops)
> {
> rtnl_lock();
> list_add_tail_rcu(&ops->list, &rtnl_af_ops);
> rtnl_unlock();
> +
> + return 0;
> }
> EXPORT_SYMBOL_GPL(rtnl_af_register);
>
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index d81fff93d208..2152d8cfa2dc 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -2812,7 +2812,8 @@ void __init devinet_init(void)
> register_pernet_subsys(&devinet_ops);
> register_netdevice_notifier(&ip_netdev_notifier);
>
> - rtnl_af_register(&inet_af_ops);
> + if (rtnl_af_register(&inet_af_ops))
> + panic("Unable to register inet_af_ops\n");
>
> rtnl_register_many(devinet_rtnl_msg_handlers);
> }
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index ac8645ad2537..d0a99710d65d 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -7468,7 +7468,9 @@ int __init addrconf_init(void)
>
> addrconf_verify(&init_net);
>
> - rtnl_af_register(&inet6_ops);
> + err = rtnl_af_register(&inet6_ops);
> + if (err)
> + goto erraf;
>
> err = rtnl_register_many(addrconf_rtnl_msg_handlers);
> if (err)
> @@ -7482,6 +7484,7 @@ int __init addrconf_init(void)
> errout:
> rtnl_unregister_all(PF_INET6);
> rtnl_af_unregister(&inet6_ops);
> +erraf:
> unregister_netdevice_notifier(&ipv6_dev_notf);
> errlo:
> destroy_workqueue(addrconf_wq);
> diff --git a/net/mctp/device.c b/net/mctp/device.c
> index 85cc5f31f1e7..3d75b919995d 100644
> --- a/net/mctp/device.c
> +++ b/net/mctp/device.c
> @@ -535,14 +535,20 @@ int __init mctp_device_init(void)
> int err;
>
> register_netdevice_notifier(&mctp_dev_nb);
> - rtnl_af_register(&mctp_af_ops);
> +
> + err = rtnl_af_register(&mctp_af_ops);
> + if (err)
> + goto err_notifier;
>
> err = rtnl_register_many(mctp_device_rtnl_msg_handlers);
> - if (err) {
> - rtnl_af_unregister(&mctp_af_ops);
> - unregister_netdevice_notifier(&mctp_dev_nb);
> - }
> + if (err)
> + goto err_af;
>
> + return 0;
> +err_af:
> + rtnl_af_unregister(&mctp_af_ops);
> +err_notifier:
> + unregister_netdevice_notifier(&mctp_dev_nb);
> return err;
> }
>
> diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
> index a0573847bc55..1f63b32d76d6 100644
> --- a/net/mpls/af_mpls.c
> +++ b/net/mpls/af_mpls.c
> @@ -2753,7 +2753,9 @@ static int __init mpls_init(void)
>
> dev_add_pack(&mpls_packet_type);
>
> - rtnl_af_register(&mpls_af_ops);
> + err = rtnl_af_register(&mpls_af_ops);
> + if (err)
> + goto out_unregister_dev_type;
>
> err = rtnl_register_many(mpls_rtnl_msg_handlers);
> if (err)
> @@ -2773,6 +2775,7 @@ static int __init mpls_init(void)
> rtnl_unregister_many(mpls_rtnl_msg_handlers);
> out_unregister_rtnl_af:
> rtnl_af_unregister(&mpls_af_ops);
> +out_unregister_dev_type:
> dev_remove_pack(&mpls_packet_type);
> out_unregister_pernet:
> unregister_pernet_subsys(&mpls_net_ops);
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [PATCH v2 net-next 13/14] rtnetlink: Return int from rtnl_af_register().
2024-10-16 18:53 ` [PATCH v2 net-next 13/14] rtnetlink: Return int from rtnl_af_register() Kuniyuki Iwashima
2024-10-16 19:09 ` David Ahern
2024-10-18 4:10 ` Matt Johnston
@ 2024-10-22 8:53 ` Paolo Abeni
2024-10-22 8:59 ` Simon Horman
2 siblings, 1 reply; 24+ messages in thread
From: Paolo Abeni @ 2024-10-22 8:53 UTC (permalink / raw)
To: Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski
Cc: Kuniyuki Iwashima, netdev, Roopa Prabhu, Nikolay Aleksandrov,
David Ahern, Jeremy Kerr, Matt Johnston
On 10/16/24 20:53, Kuniyuki Iwashima wrote:
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 445e6ffed75e..70b663aca209 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -686,11 +686,13 @@ static const struct rtnl_af_ops *rtnl_af_lookup(const int family)
> *
> * Returns 0 on success or a negative error code.
> */
> -void rtnl_af_register(struct rtnl_af_ops *ops)
> +int rtnl_af_register(struct rtnl_af_ops *ops)
> {
> rtnl_lock();
> list_add_tail_rcu(&ops->list, &rtnl_af_ops);
> rtnl_unlock();
> +
> + return 0;
> }
> EXPORT_SYMBOL_GPL(rtnl_af_register);
kdoc complains about the missing description for the return value. You
need to replace 'Returns' with '@return'.
Not blocking, but please follow-up.
Thanks!
Paolo
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [PATCH v2 net-next 13/14] rtnetlink: Return int from rtnl_af_register().
2024-10-22 8:53 ` Paolo Abeni
@ 2024-10-22 8:59 ` Simon Horman
2024-10-22 19:42 ` Kuniyuki Iwashima
0 siblings, 1 reply; 24+ messages in thread
From: Simon Horman @ 2024-10-22 8:59 UTC (permalink / raw)
To: Paolo Abeni
Cc: Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Kuniyuki Iwashima, netdev, Roopa Prabhu, Nikolay Aleksandrov,
David Ahern, Jeremy Kerr, Matt Johnston
On Tue, Oct 22, 2024 at 10:53:32AM +0200, Paolo Abeni wrote:
> On 10/16/24 20:53, Kuniyuki Iwashima wrote:
> > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> > index 445e6ffed75e..70b663aca209 100644
> > --- a/net/core/rtnetlink.c
> > +++ b/net/core/rtnetlink.c
> > @@ -686,11 +686,13 @@ static const struct rtnl_af_ops *rtnl_af_lookup(const int family)
> > *
> > * Returns 0 on success or a negative error code.
> > */
> > -void rtnl_af_register(struct rtnl_af_ops *ops)
> > +int rtnl_af_register(struct rtnl_af_ops *ops)
> > {
> > rtnl_lock();
> > list_add_tail_rcu(&ops->list, &rtnl_af_ops);
> > rtnl_unlock();
> > +
> > + return 0;
> > }
> > EXPORT_SYMBOL_GPL(rtnl_af_register);
>
> kdoc complains about the missing description for the return value. You
> need to replace 'Returns' with '@return'.
>
> Not blocking, but please follow-up.
FWIIW, I think "Return: " or "Returns: " also works.
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [PATCH v2 net-next 13/14] rtnetlink: Return int from rtnl_af_register().
2024-10-22 8:59 ` Simon Horman
@ 2024-10-22 19:42 ` Kuniyuki Iwashima
0 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-22 19:42 UTC (permalink / raw)
To: horms
Cc: davem, dsahern, edumazet, jk, kuba, kuni1840, kuniyu, matt,
netdev, pabeni, razor, roopa
From: Simon Horman <horms@kernel.org>
Date: Tue, 22 Oct 2024 09:59:20 +0100
> On Tue, Oct 22, 2024 at 10:53:32AM +0200, Paolo Abeni wrote:
> > On 10/16/24 20:53, Kuniyuki Iwashima wrote:
> > > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> > > index 445e6ffed75e..70b663aca209 100644
> > > --- a/net/core/rtnetlink.c
> > > +++ b/net/core/rtnetlink.c
> > > @@ -686,11 +686,13 @@ static const struct rtnl_af_ops *rtnl_af_lookup(const int family)
> > > *
> > > * Returns 0 on success or a negative error code.
> > > */
> > > -void rtnl_af_register(struct rtnl_af_ops *ops)
> > > +int rtnl_af_register(struct rtnl_af_ops *ops)
> > > {
> > > rtnl_lock();
> > > list_add_tail_rcu(&ops->list, &rtnl_af_ops);
> > > rtnl_unlock();
> > > +
> > > + return 0;
> > > }
> > > EXPORT_SYMBOL_GPL(rtnl_af_register);
> >
> > kdoc complains about the missing description for the return value. You
> > need to replace 'Returns' with '@return'.
> >
> > Not blocking, but please follow-up.
>
> FWIIW, I think "Return: " or "Returns: " also works.
Sure, will post a followup shortly.
Thanks!
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v2 net-next 14/14] rtnetlink: Protect struct rtnl_af_ops with SRCU.
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (12 preceding siblings ...)
2024-10-16 18:53 ` [PATCH v2 net-next 13/14] rtnetlink: Return int from rtnl_af_register() Kuniyuki Iwashima
@ 2024-10-16 18:53 ` Kuniyuki Iwashima
2024-10-22 9:10 ` [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL patchwork-bot+netdevbpf
14 siblings, 0 replies; 24+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-16 18:53 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
Once RTNL is replaced with rtnl_net_lock(), we need a mechanism to
guarantee that rtnl_af_ops is alive during inflight RTM_SETLINK
even when its module is being unloaded.
Let's use SRCU to protect ops.
rtnl_af_lookup() now iterates rtnl_af_ops under RCU and returns
SRCU-protected ops pointer. The caller must call rtnl_af_put()
to release the pointer after the use.
Also, rtnl_af_unregister() unlinks the ops first and calls
synchronize_srcu() to wait for inflight RTM_SETLINK requests to
complete.
Note that rtnl_af_ops needs to be protected by its dedicated lock
when RTNL is removed.
Note also that BUG_ON() in do_setlink() is changed to the normal
error handling as a different af_ops might be found after
validate_linkmsg().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
v2:
* Handle error of init_srcu_struct().
* Call cleanup_srcu_struct() after synchronize_srcu().
---
include/net/rtnetlink.h | 5 +++-
net/core/rtnetlink.c | 63 ++++++++++++++++++++++++++++++-----------
2 files changed, 51 insertions(+), 17 deletions(-)
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index 969138ae2f4b..e0d9a8eae6b6 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -172,7 +172,8 @@ void rtnl_link_unregister(struct rtnl_link_ops *ops);
/**
* struct rtnl_af_ops - rtnetlink address family operations
*
- * @list: Used internally
+ * @list: Used internally, protected by RTNL and SRCU
+ * @srcu: Used internally
* @family: Address family
* @fill_link_af: Function to fill IFLA_AF_SPEC with address family
* specific netlink attributes.
@@ -185,6 +186,8 @@ void rtnl_link_unregister(struct rtnl_link_ops *ops);
*/
struct rtnl_af_ops {
struct list_head list;
+ struct srcu_struct srcu;
+
int family;
int (*fill_link_af)(struct sk_buff *skb,
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 70b663aca209..194a81e5f608 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -666,18 +666,31 @@ static size_t rtnl_link_get_size(const struct net_device *dev)
static LIST_HEAD(rtnl_af_ops);
-static const struct rtnl_af_ops *rtnl_af_lookup(const int family)
+static struct rtnl_af_ops *rtnl_af_lookup(const int family, int *srcu_index)
{
- const struct rtnl_af_ops *ops;
+ struct rtnl_af_ops *ops;
ASSERT_RTNL();
- list_for_each_entry(ops, &rtnl_af_ops, list) {
- if (ops->family == family)
- return ops;
+ rcu_read_lock();
+
+ list_for_each_entry_rcu(ops, &rtnl_af_ops, list) {
+ if (ops->family == family) {
+ *srcu_index = srcu_read_lock(&ops->srcu);
+ goto unlock;
+ }
}
- return NULL;
+ ops = NULL;
+unlock:
+ rcu_read_unlock();
+
+ return ops;
+}
+
+static void rtnl_af_put(struct rtnl_af_ops *ops, int srcu_index)
+{
+ srcu_read_unlock(&ops->srcu, srcu_index);
}
/**
@@ -688,6 +701,11 @@ static const struct rtnl_af_ops *rtnl_af_lookup(const int family)
*/
int rtnl_af_register(struct rtnl_af_ops *ops)
{
+ int err = init_srcu_struct(&ops->srcu);
+
+ if (err)
+ return err;
+
rtnl_lock();
list_add_tail_rcu(&ops->list, &rtnl_af_ops);
rtnl_unlock();
@@ -707,6 +725,8 @@ void rtnl_af_unregister(struct rtnl_af_ops *ops)
rtnl_unlock();
synchronize_rcu();
+ synchronize_srcu(&ops->srcu);
+ cleanup_srcu_struct(&ops->srcu);
}
EXPORT_SYMBOL_GPL(rtnl_af_unregister);
@@ -2579,20 +2599,24 @@ static int validate_linkmsg(struct net_device *dev, struct nlattr *tb[],
int rem, err;
nla_for_each_nested(af, tb[IFLA_AF_SPEC], rem) {
- const struct rtnl_af_ops *af_ops;
+ struct rtnl_af_ops *af_ops;
+ int af_ops_srcu_index;
- af_ops = rtnl_af_lookup(nla_type(af));
+ af_ops = rtnl_af_lookup(nla_type(af), &af_ops_srcu_index);
if (!af_ops)
return -EAFNOSUPPORT;
if (!af_ops->set_link_af)
- return -EOPNOTSUPP;
-
- if (af_ops->validate_link_af) {
+ err = -EOPNOTSUPP;
+ else if (af_ops->validate_link_af)
err = af_ops->validate_link_af(dev, af, extack);
- if (err < 0)
- return err;
- }
+ else
+ err = 0;
+
+ rtnl_af_put(af_ops, af_ops_srcu_index);
+
+ if (err < 0)
+ return err;
}
}
@@ -3172,11 +3196,18 @@ static int do_setlink(const struct sk_buff *skb, struct net_device *dev,
int rem;
nla_for_each_nested(af, tb[IFLA_AF_SPEC], rem) {
- const struct rtnl_af_ops *af_ops;
+ struct rtnl_af_ops *af_ops;
+ int af_ops_srcu_index;
- BUG_ON(!(af_ops = rtnl_af_lookup(nla_type(af))));
+ af_ops = rtnl_af_lookup(nla_type(af), &af_ops_srcu_index);
+ if (!af_ops) {
+ err = -EAFNOSUPPORT;
+ goto errout;
+ }
err = af_ops->set_link_af(dev, af, extack);
+ rtnl_af_put(af_ops, af_ops_srcu_index);
+
if (err < 0)
goto errout;
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 24+ messages in thread* Re: [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL.
2024-10-16 18:53 [PATCH v2 net-next 00/14] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (13 preceding siblings ...)
2024-10-16 18:53 ` [PATCH v2 net-next 14/14] rtnetlink: Protect struct rtnl_af_ops with SRCU Kuniyuki Iwashima
@ 2024-10-22 9:10 ` patchwork-bot+netdevbpf
14 siblings, 0 replies; 24+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-10-22 9:10 UTC (permalink / raw)
To: Kuniyuki Iwashima; +Cc: davem, edumazet, kuba, pabeni, kuni1840, netdev
Hello:
This series was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Wed, 16 Oct 2024 11:53:43 -0700 you wrote:
> This is a prep for the next series where we will push RTNL down to
> rtnl_{new,del,set}link().
>
> That means, for example, __rtnl_newlink() is always under RTNL, but
> rtnl_newlink() has a non-RTNL section.
>
> As a prerequisite for per-netns RTNL, we will move netns validation
> (and RTNL-independent validations if possible) to that section.
>
> [...]
Here is the summary with links:
- [v2,net-next,01/14] rtnetlink: Allocate linkinfo[] as struct rtnl_newlink_tbs.
https://git.kernel.org/netdev/net-next/c/fa8ef258da2b
- [v2,net-next,02/14] rtnetlink: Call validate_linkmsg() in do_setlink().
https://git.kernel.org/netdev/net-next/c/a5838cf9b2ee
- [v2,net-next,03/14] rtnetlink: Factorise do_setlink() path from __rtnl_newlink().
https://git.kernel.org/netdev/net-next/c/cc47bcdf0d2e
- [v2,net-next,04/14] rtnetlink: Move simple validation from __rtnl_newlink() to rtnl_newlink().
https://git.kernel.org/netdev/net-next/c/7fea1a8cb4df
- [v2,net-next,05/14] rtnetlink: Move rtnl_link_ops_get() and retry to rtnl_newlink().
https://git.kernel.org/netdev/net-next/c/331fe31c50ef
- [v2,net-next,06/14] rtnetlink: Move ops->validate to rtnl_newlink().
https://git.kernel.org/netdev/net-next/c/0d3008d1a9ae
- [v2,net-next,07/14] rtnetlink: Protect struct rtnl_link_ops with SRCU.
https://git.kernel.org/netdev/net-next/c/43c7ce69d28e
- [v2,net-next,08/14] rtnetlink: Call rtnl_link_get_net_capable() in rtnl_newlink().
https://git.kernel.org/netdev/net-next/c/0fef2a1212f1
- [v2,net-next,09/14] rtnetlink: Fetch IFLA_LINK_NETNSID in rtnl_newlink().
https://git.kernel.org/netdev/net-next/c/f7774eec20b4
- [v2,net-next,10/14] rtnetlink: Clean up rtnl_dellink().
https://git.kernel.org/netdev/net-next/c/175cfc5cd373
- [v2,net-next,11/14] rtnetlink: Clean up rtnl_setlink().
https://git.kernel.org/netdev/net-next/c/6e495fad88ef
- [v2,net-next,12/14] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink().
https://git.kernel.org/netdev/net-next/c/a0b63c6457e1
- [v2,net-next,13/14] rtnetlink: Return int from rtnl_af_register().
https://git.kernel.org/netdev/net-next/c/26eebdc4b005
- [v2,net-next,14/14] rtnetlink: Protect struct rtnl_af_ops with SRCU.
https://git.kernel.org/netdev/net-next/c/6ab0f8669483
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 24+ messages in thread