* [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL.
@ 2024-10-09 23:16 Kuniyuki Iwashima
2024-10-09 23:16 ` [PATCH v1 net-next 01/13] rtnetlink: Allocate linkinfo[] as struct rtnl_newlink_tbs Kuniyuki Iwashima
` (12 more replies)
0 siblings, 13 replies; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
This is a prep for the next series where we will push RTNL down to
rtnl_{new,del,set}link().
That means, for example, __rtnl_newlink() is always under RTNL, but
rtnl_newlink() has a non-RTNL section.
As a prerequisite for per-netns RTNL, we will move netns validation
(and RTNL-independent validations if possible) to that section.
rtnl_link_ops and rtnl_af_ops will be protected with SRCU not to
depend on RTNL.
Kuniyuki Iwashima (13):
rtnetlink: Allocate linkinfo[] as struct rtnl_newlink_tbs.
rtnetlink: Call validate_linkmsg() in do_setlink().
rtnetlink: Factorise do_setlink() path from __rtnl_newlink().
rtnetlink: Move simple validation from __rtnl_newlink() to
rtnl_newlink().
rtnetlink: Move rtnl_link_ops_get() and retry to rtnl_newlink().
rtnetlink: Move ops->validate to rtnl_newlink().
rtnetlink: Protect struct rtnl_link_ops with SRCU.
rtnetlink: Call rtnl_link_get_net_capable() in rtnl_newlink().
rtnetlink: Fetch IFLA_LINK_NETNSID in rtnl_newlink().
rtnetlink: Clean up rtnl_dellink().
rtnetlink: Clean up rtnl_setlink().
rtnetlink: Call rtnl_link_get_net_capable() in do_setlink().
rtnetlink: Protect struct rtnl_af_ops with SRCU.
include/net/rtnetlink.h | 10 +-
net/core/rtnetlink.c | 552 ++++++++++++++++++++++------------------
2 files changed, 310 insertions(+), 252 deletions(-)
--
2.39.5 (Apple Git-154)
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v1 net-next 01/13] rtnetlink: Allocate linkinfo[] as struct rtnl_newlink_tbs.
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
@ 2024-10-09 23:16 ` Kuniyuki Iwashima
2024-10-10 12:36 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 02/13] rtnetlink: Call validate_linkmsg() in do_setlink() Kuniyuki Iwashima
` (11 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We will move linkinfo to rtnl_newlink() and pass it down to other
functions.
Let's pack it into rtnl_newlink_tbs.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/core/rtnetlink.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 6d68247aea70..abc44ee018a0 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3621,6 +3621,7 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
struct rtnl_newlink_tbs {
struct nlattr *tb[IFLA_MAX + 1];
+ struct nlattr *linkinfo[IFLA_INFO_MAX + 1];
struct nlattr *attr[RTNL_MAX_TYPE + 1];
struct nlattr *slave_attr[RTNL_SLAVE_MAX_TYPE + 1];
};
@@ -3629,7 +3630,7 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct rtnl_newlink_tbs *tbs,
struct netlink_ext_ack *extack)
{
- struct nlattr *linkinfo[IFLA_INFO_MAX + 1];
+ struct nlattr ** const linkinfo = tbs->linkinfo;
struct nlattr ** const tb = tbs->tb;
const struct rtnl_link_ops *m_ops;
struct net_device *master_dev;
@@ -3684,8 +3685,9 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
ifla_info_policy, NULL);
if (err < 0)
return err;
- } else
- memset(linkinfo, 0, sizeof(linkinfo));
+ } else {
+ memset(linkinfo, 0, sizeof(tbs->linkinfo));
+ }
if (linkinfo[IFLA_INFO_KIND]) {
nla_strscpy(kind, linkinfo[IFLA_INFO_KIND], sizeof(kind));
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v1 net-next 02/13] rtnetlink: Call validate_linkmsg() in do_setlink().
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
2024-10-09 23:16 ` [PATCH v1 net-next 01/13] rtnetlink: Allocate linkinfo[] as struct rtnl_newlink_tbs Kuniyuki Iwashima
@ 2024-10-09 23:16 ` Kuniyuki Iwashima
2024-10-10 12:44 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 03/13] rtnetlink: Factorise do_setlink() path from __rtnl_newlink() Kuniyuki Iwashima
` (10 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
There are 3 paths that finally call do_setlink(), and validate_linkmsg()
is called in each path.
1. RTM_NEWLINK
1-1. dev is found in __rtnl_newlink()
1-2. dev isn't found, but IFLA_GROUP is specified in
rtnl_group_changelink()
2. RTM_SETLINK
The next patch factorises 1-1 to a separate function.
As a preparation, let's move validate_linkmsg() calls to do_setlink().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/core/rtnetlink.c | 15 ++++-----------
1 file changed, 4 insertions(+), 11 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index abc44ee018a0..bb14ddf2901e 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2854,6 +2854,10 @@ static int do_setlink(const struct sk_buff *skb,
char ifname[IFNAMSIZ];
int err;
+ err = validate_linkmsg(dev, tb, extack);
+ if (err < 0)
+ goto errout;
+
if (tb[IFLA_IFNAME])
nla_strscpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
else
@@ -3268,10 +3272,6 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
goto errout;
}
- err = validate_linkmsg(dev, tb, extack);
- if (err < 0)
- goto errout;
-
err = do_setlink(skb, dev, ifm, extack, tb, 0);
errout:
return err;
@@ -3515,9 +3515,6 @@ static int rtnl_group_changelink(const struct sk_buff *skb,
for_each_netdev_safe(net, dev, aux) {
if (dev->group == group) {
- err = validate_linkmsg(dev, tb, extack);
- if (err < 0)
- return err;
err = do_setlink(skb, dev, ifm, extack, tb, 0);
if (err < 0)
return err;
@@ -3743,10 +3740,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (nlh->nlmsg_flags & NLM_F_REPLACE)
return -EOPNOTSUPP;
- err = validate_linkmsg(dev, tb, extack);
- if (err < 0)
- return err;
-
if (linkinfo[IFLA_INFO_DATA]) {
if (!ops || ops != dev->rtnl_link_ops ||
!ops->changelink)
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v1 net-next 03/13] rtnetlink: Factorise do_setlink() path from __rtnl_newlink().
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
2024-10-09 23:16 ` [PATCH v1 net-next 01/13] rtnetlink: Allocate linkinfo[] as struct rtnl_newlink_tbs Kuniyuki Iwashima
2024-10-09 23:16 ` [PATCH v1 net-next 02/13] rtnetlink: Call validate_linkmsg() in do_setlink() Kuniyuki Iwashima
@ 2024-10-09 23:16 ` Kuniyuki Iwashima
2024-10-10 12:39 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 04/13] rtnetlink: Move simple validation from __rtnl_newlink() to rtnl_newlink() Kuniyuki Iwashima
` (9 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
__rtnl_newlink() got too long to maintain.
For example, netdev_master_upper_dev_get()->rtnl_link_ops is fetched even
when IFLA_INFO_SLAVE_DATA is not specified.
Let's factorise the single dev do_setlink() path to a separate function.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/core/rtnetlink.c | 142 ++++++++++++++++++++++---------------------
1 file changed, 74 insertions(+), 68 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index bb14ddf2901e..1d214c76011d 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3504,6 +3504,78 @@ struct net_device *rtnl_create_link(struct net *net, const char *ifname,
}
EXPORT_SYMBOL(rtnl_create_link);
+struct rtnl_newlink_tbs {
+ struct nlattr *tb[IFLA_MAX + 1];
+ struct nlattr *linkinfo[IFLA_INFO_MAX + 1];
+ struct nlattr *attr[RTNL_MAX_TYPE + 1];
+ struct nlattr *slave_attr[RTNL_SLAVE_MAX_TYPE + 1];
+};
+
+static int rtnl_changelink(const struct sk_buff *skb, struct nlmsghdr *nlh,
+ const struct rtnl_link_ops *ops,
+ struct net_device *dev,
+ struct rtnl_newlink_tbs *tbs,
+ struct nlattr **data,
+ struct netlink_ext_ack *extack)
+{
+ struct nlattr ** const linkinfo = tbs->linkinfo;
+ struct nlattr ** const tb = tbs->tb;
+ int status = 0;
+ int err;
+
+ if (nlh->nlmsg_flags & NLM_F_EXCL)
+ return -EEXIST;
+
+ if (nlh->nlmsg_flags & NLM_F_REPLACE)
+ return -EOPNOTSUPP;
+
+ if (linkinfo[IFLA_INFO_DATA]) {
+ if (!ops || ops != dev->rtnl_link_ops || !ops->changelink)
+ return -EOPNOTSUPP;
+
+ err = ops->changelink(dev, tb, data, extack);
+ if (err < 0)
+ return err;
+
+ status |= DO_SETLINK_NOTIFY;
+ }
+
+ if (linkinfo[IFLA_INFO_SLAVE_DATA]) {
+ const struct rtnl_link_ops *m_ops = NULL;
+ struct nlattr **slave_data = NULL;
+ struct net_device *master_dev;
+
+ master_dev = netdev_master_upper_dev_get(dev);
+ if (master_dev)
+ m_ops = master_dev->rtnl_link_ops;
+
+ if (!m_ops || !m_ops->slave_changelink)
+ return -EOPNOTSUPP;
+
+ if (m_ops->slave_maxtype > RTNL_SLAVE_MAX_TYPE)
+ return -EINVAL;
+
+ if (m_ops->slave_maxtype) {
+ err = nla_parse_nested_deprecated(tbs->slave_attr,
+ m_ops->slave_maxtype,
+ linkinfo[IFLA_INFO_SLAVE_DATA],
+ m_ops->slave_policy, extack);
+ if (err < 0)
+ return err;
+
+ slave_data = tbs->slave_attr;
+ }
+
+ err = m_ops->slave_changelink(master_dev, dev, tb, slave_data, extack);
+ if (err < 0)
+ return err;
+
+ status |= DO_SETLINK_NOTIFY;
+ }
+
+ return do_setlink(skb, dev, nlmsg_data(nlh), extack, tb, status);
+}
+
static int rtnl_group_changelink(const struct sk_buff *skb,
struct net *net, int group,
struct ifinfomsg *ifm,
@@ -3616,24 +3688,14 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
goto out;
}
-struct rtnl_newlink_tbs {
- struct nlattr *tb[IFLA_MAX + 1];
- struct nlattr *linkinfo[IFLA_INFO_MAX + 1];
- struct nlattr *attr[RTNL_MAX_TYPE + 1];
- struct nlattr *slave_attr[RTNL_SLAVE_MAX_TYPE + 1];
-};
-
static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct rtnl_newlink_tbs *tbs,
struct netlink_ext_ack *extack)
{
struct nlattr ** const linkinfo = tbs->linkinfo;
struct nlattr ** const tb = tbs->tb;
- const struct rtnl_link_ops *m_ops;
- struct net_device *master_dev;
struct net *net = sock_net(skb->sk);
const struct rtnl_link_ops *ops;
- struct nlattr **slave_data;
char kind[MODULE_NAME_LEN];
struct net_device *dev;
struct ifinfomsg *ifm;
@@ -3668,14 +3730,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
dev = NULL;
}
- master_dev = NULL;
- m_ops = NULL;
- if (dev) {
- master_dev = netdev_master_upper_dev_get(dev);
- if (master_dev)
- m_ops = master_dev->rtnl_link_ops;
- }
-
if (tb[IFLA_LINKINFO]) {
err = nla_parse_nested_deprecated(linkinfo, IFLA_INFO_MAX,
tb[IFLA_LINKINFO],
@@ -3714,56 +3768,8 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
}
}
- slave_data = NULL;
- if (m_ops) {
- if (m_ops->slave_maxtype > RTNL_SLAVE_MAX_TYPE)
- return -EINVAL;
-
- if (m_ops->slave_maxtype &&
- linkinfo[IFLA_INFO_SLAVE_DATA]) {
- err = nla_parse_nested_deprecated(tbs->slave_attr,
- m_ops->slave_maxtype,
- linkinfo[IFLA_INFO_SLAVE_DATA],
- m_ops->slave_policy,
- extack);
- if (err < 0)
- return err;
- slave_data = tbs->slave_attr;
- }
- }
-
- if (dev) {
- int status = 0;
-
- if (nlh->nlmsg_flags & NLM_F_EXCL)
- return -EEXIST;
- if (nlh->nlmsg_flags & NLM_F_REPLACE)
- return -EOPNOTSUPP;
-
- if (linkinfo[IFLA_INFO_DATA]) {
- if (!ops || ops != dev->rtnl_link_ops ||
- !ops->changelink)
- return -EOPNOTSUPP;
-
- err = ops->changelink(dev, tb, data, extack);
- if (err < 0)
- return err;
- status |= DO_SETLINK_NOTIFY;
- }
-
- if (linkinfo[IFLA_INFO_SLAVE_DATA]) {
- if (!m_ops || !m_ops->slave_changelink)
- return -EOPNOTSUPP;
-
- err = m_ops->slave_changelink(master_dev, dev, tb,
- slave_data, extack);
- if (err < 0)
- return err;
- status |= DO_SETLINK_NOTIFY;
- }
-
- return do_setlink(skb, dev, ifm, extack, tb, status);
- }
+ if (dev)
+ return rtnl_changelink(skb, nlh, ops, dev, tbs, data, extack);
if (!(nlh->nlmsg_flags & NLM_F_CREATE)) {
/* No dev found and NLM_F_CREATE not set. Requested dev does not exist,
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v1 net-next 04/13] rtnetlink: Move simple validation from __rtnl_newlink() to rtnl_newlink().
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (2 preceding siblings ...)
2024-10-09 23:16 ` [PATCH v1 net-next 03/13] rtnetlink: Factorise do_setlink() path from __rtnl_newlink() Kuniyuki Iwashima
@ 2024-10-09 23:16 ` Kuniyuki Iwashima
2024-10-10 12:40 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 05/13] rtnetlink: Move rtnl_link_ops_get() and retry " Kuniyuki Iwashima
` (8 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We will push RTNL down to rtnl_newlink().
Let's move RTNL-independent validation to rtnl_newlink().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/core/rtnetlink.c | 43 ++++++++++++++++++++++++-------------------
1 file changed, 24 insertions(+), 19 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 1d214c76011d..3416f364db83 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3706,15 +3706,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
#ifdef CONFIG_MODULES
replay:
#endif
- err = nlmsg_parse_deprecated(nlh, sizeof(*ifm), tb, IFLA_MAX,
- ifla_policy, extack);
- if (err < 0)
- return err;
-
- err = rtnl_ensure_unique_netns(tb, extack, false);
- if (err < 0)
- return err;
-
ifm = nlmsg_data(nlh);
if (ifm->ifi_index > 0) {
link_specified = true;
@@ -3730,16 +3721,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
dev = NULL;
}
- if (tb[IFLA_LINKINFO]) {
- err = nla_parse_nested_deprecated(linkinfo, IFLA_INFO_MAX,
- tb[IFLA_LINKINFO],
- ifla_info_policy, NULL);
- if (err < 0)
- return err;
- } else {
- memset(linkinfo, 0, sizeof(tbs->linkinfo));
- }
-
if (linkinfo[IFLA_INFO_KIND]) {
nla_strscpy(kind, linkinfo[IFLA_INFO_KIND], sizeof(kind));
ops = rtnl_link_ops_get(kind);
@@ -3808,6 +3789,7 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
+ struct nlattr **tb, **linkinfo;
struct rtnl_newlink_tbs *tbs;
int ret;
@@ -3815,7 +3797,30 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (!tbs)
return -ENOMEM;
+ tb = tbs->tb;
+ ret = nlmsg_parse_deprecated(nlh, sizeof(struct ifinfomsg), tb,
+ IFLA_MAX, ifla_policy, extack);
+ if (ret < 0)
+ goto free;
+
+ ret = rtnl_ensure_unique_netns(tb, extack, false);
+ if (ret < 0)
+ goto free;
+
+ linkinfo = tbs->linkinfo;
+ if (tb[IFLA_LINKINFO]) {
+ ret = nla_parse_nested_deprecated(linkinfo, IFLA_INFO_MAX,
+ tb[IFLA_LINKINFO],
+ ifla_info_policy, NULL);
+ if (ret < 0)
+ goto free;
+ } else {
+ memset(linkinfo, 0, sizeof(tbs->linkinfo));
+ }
+
ret = __rtnl_newlink(skb, nlh, tbs, extack);
+
+free:
kfree(tbs);
return ret;
}
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v1 net-next 05/13] rtnetlink: Move rtnl_link_ops_get() and retry to rtnl_newlink().
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (3 preceding siblings ...)
2024-10-09 23:16 ` [PATCH v1 net-next 04/13] rtnetlink: Move simple validation from __rtnl_newlink() to rtnl_newlink() Kuniyuki Iwashima
@ 2024-10-09 23:16 ` Kuniyuki Iwashima
2024-10-09 23:16 ` [PATCH v1 net-next 06/13] rtnetlink: Move ops->validate " Kuniyuki Iwashima
` (7 subsequent siblings)
12 siblings, 0 replies; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
Currently, if neither dev nor rtnl_link_ops is found in __rtnl_newlink(),
we release RTNL and redo the whole process after request_module(), which
complicates the logic.
The ops will be RTNL-independent later.
Let's move the ops lookup to rtnl_newlink() and do the retry earlier.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/core/rtnetlink.c | 42 ++++++++++++++++++------------------------
1 file changed, 18 insertions(+), 24 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 3416f364db83..fe36d584136f 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3689,23 +3689,19 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
}
static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
+ const struct rtnl_link_ops *ops,
struct rtnl_newlink_tbs *tbs,
struct netlink_ext_ack *extack)
{
struct nlattr ** const linkinfo = tbs->linkinfo;
struct nlattr ** const tb = tbs->tb;
struct net *net = sock_net(skb->sk);
- const struct rtnl_link_ops *ops;
- char kind[MODULE_NAME_LEN];
struct net_device *dev;
struct ifinfomsg *ifm;
struct nlattr **data;
bool link_specified;
int err;
-#ifdef CONFIG_MODULES
-replay:
-#endif
ifm = nlmsg_data(nlh);
if (ifm->ifi_index > 0) {
link_specified = true;
@@ -3721,14 +3717,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
dev = NULL;
}
- if (linkinfo[IFLA_INFO_KIND]) {
- nla_strscpy(kind, linkinfo[IFLA_INFO_KIND], sizeof(kind));
- ops = rtnl_link_ops_get(kind);
- } else {
- kind[0] = '\0';
- ops = NULL;
- }
-
data = NULL;
if (ops) {
if (ops->maxtype > RTNL_MAX_TYPE)
@@ -3769,16 +3757,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EOPNOTSUPP;
if (!ops) {
-#ifdef CONFIG_MODULES
- if (kind[0]) {
- __rtnl_unlock();
- request_module("rtnl-link-%s", kind);
- rtnl_lock();
- ops = rtnl_link_ops_get(kind);
- if (ops)
- goto replay;
- }
-#endif
NL_SET_ERR_MSG(extack, "Unknown device type");
return -EOPNOTSUPP;
}
@@ -3789,6 +3767,7 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
+ const struct rtnl_link_ops *ops = NULL;
struct nlattr **tb, **linkinfo;
struct rtnl_newlink_tbs *tbs;
int ret;
@@ -3818,7 +3797,22 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
memset(linkinfo, 0, sizeof(tbs->linkinfo));
}
- ret = __rtnl_newlink(skb, nlh, tbs, extack);
+ if (linkinfo[IFLA_INFO_KIND]) {
+ char kind[MODULE_NAME_LEN];
+
+ nla_strscpy(kind, linkinfo[IFLA_INFO_KIND], sizeof(kind));
+ ops = rtnl_link_ops_get(kind);
+#ifdef CONFIG_MODULES
+ if (!ops) {
+ __rtnl_unlock();
+ request_module("rtnl-link-%s", kind);
+ rtnl_lock();
+ ops = rtnl_link_ops_get(kind);
+ }
+#endif
+ }
+
+ ret = __rtnl_newlink(skb, nlh, ops, tbs, extack);
free:
kfree(tbs);
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v1 net-next 06/13] rtnetlink: Move ops->validate to rtnl_newlink().
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (4 preceding siblings ...)
2024-10-09 23:16 ` [PATCH v1 net-next 05/13] rtnetlink: Move rtnl_link_ops_get() and retry " Kuniyuki Iwashima
@ 2024-10-09 23:16 ` Kuniyuki Iwashima
2024-10-10 12:48 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 07/13] rtnetlink: Protect struct rtnl_link_ops with SRCU Kuniyuki Iwashima
` (6 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
ops->validate() does not require RTNL.
Let's move it to rtnl_newlink().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/core/rtnetlink.c | 49 ++++++++++++++++++++++----------------------
1 file changed, 24 insertions(+), 25 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index fe36d584136f..24545c5b7e48 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3691,16 +3691,14 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
const struct rtnl_link_ops *ops,
struct rtnl_newlink_tbs *tbs,
+ struct nlattr **data,
struct netlink_ext_ack *extack)
{
- struct nlattr ** const linkinfo = tbs->linkinfo;
struct nlattr ** const tb = tbs->tb;
struct net *net = sock_net(skb->sk);
struct net_device *dev;
struct ifinfomsg *ifm;
- struct nlattr **data;
bool link_specified;
- int err;
ifm = nlmsg_data(nlh);
if (ifm->ifi_index > 0) {
@@ -3717,26 +3715,6 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
dev = NULL;
}
- data = NULL;
- if (ops) {
- if (ops->maxtype > RTNL_MAX_TYPE)
- return -EINVAL;
-
- if (ops->maxtype && linkinfo[IFLA_INFO_DATA]) {
- err = nla_parse_nested_deprecated(tbs->attr, ops->maxtype,
- linkinfo[IFLA_INFO_DATA],
- ops->policy, extack);
- if (err < 0)
- return err;
- data = tbs->attr;
- }
- if (ops->validate) {
- err = ops->validate(tb, data, extack);
- if (err < 0)
- return err;
- }
- }
-
if (dev)
return rtnl_changelink(skb, nlh, ops, dev, tbs, data, extack);
@@ -3767,8 +3745,8 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
+ struct nlattr **tb, **linkinfo, **data = NULL;
const struct rtnl_link_ops *ops = NULL;
- struct nlattr **tb, **linkinfo;
struct rtnl_newlink_tbs *tbs;
int ret;
@@ -3812,7 +3790,28 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
#endif
}
- ret = __rtnl_newlink(skb, nlh, ops, tbs, extack);
+ if (ops) {
+ if (ops->maxtype > RTNL_MAX_TYPE)
+ return -EINVAL;
+
+ if (ops->maxtype && linkinfo[IFLA_INFO_DATA]) {
+ ret = nla_parse_nested_deprecated(tbs->attr, ops->maxtype,
+ linkinfo[IFLA_INFO_DATA],
+ ops->policy, extack);
+ if (ret < 0)
+ goto free;
+
+ data = tbs->attr;
+ }
+
+ if (ops->validate) {
+ ret = ops->validate(tb, data, extack);
+ if (ret < 0)
+ goto free;
+ }
+ }
+
+ ret = __rtnl_newlink(skb, nlh, ops, tbs, data, extack);
free:
kfree(tbs);
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v1 net-next 07/13] rtnetlink: Protect struct rtnl_link_ops with SRCU.
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (5 preceding siblings ...)
2024-10-09 23:16 ` [PATCH v1 net-next 06/13] rtnetlink: Move ops->validate " Kuniyuki Iwashima
@ 2024-10-09 23:16 ` Kuniyuki Iwashima
2024-10-10 13:02 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 08/13] rtnetlink: Call rtnl_link_get_net_capable() in rtnl_newlink() Kuniyuki Iwashima
` (5 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
Once RTNL is replaced with rtnl_net_lock(), we need a mechanism to
guarantee that rtnl_link_ops is alive during inflight RTM_NEWLINK
even when its module is being unloaded.
Let's use SRCU to protect rtnl_link_ops.
rtnl_link_ops_get() now iterates link_ops under RCU and returns
SRCU-protected ops pointer. The caller must call rtnl_link_ops_put()
to release the pointer after the use.
Also, __rtnl_link_unregister() unlinks the ops first and calls
synchronize_srcu() to wait for inflight RTM_NEWLINK requests to
complete.
Note that link_ops needs to be protected by its dedicated lock
when RTNL is removed.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
include/net/rtnetlink.h | 5 ++-
net/core/rtnetlink.c | 78 +++++++++++++++++++++++++++++------------
2 files changed, 60 insertions(+), 23 deletions(-)
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index b45d57b5968a..c873fd6193ed 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -3,6 +3,7 @@
#define __NET_RTNETLINK_H
#include <linux/rtnetlink.h>
+#include <linux/srcu.h>
#include <net/netlink.h>
typedef int (*rtnl_doit_func)(struct sk_buff *, struct nlmsghdr *,
@@ -47,7 +48,8 @@ static inline int rtnl_msg_family(const struct nlmsghdr *nlh)
/**
* struct rtnl_link_ops - rtnetlink link operations
*
- * @list: Used internally
+ * @list: Used internally, protected by RTNL and SRCU
+ * @srcu: Used internally
* @kind: Identifier
* @netns_refund: Physical device, move to init_net on netns exit
* @maxtype: Highest device specific netlink attribute number
@@ -78,6 +80,7 @@ static inline int rtnl_msg_family(const struct nlmsghdr *nlh)
*/
struct rtnl_link_ops {
struct list_head list;
+ struct srcu_struct srcu;
const char *kind;
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 24545c5b7e48..7f464554d881 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -456,15 +456,29 @@ EXPORT_SYMBOL_GPL(rtnl_unregister_all);
static LIST_HEAD(link_ops);
-static const struct rtnl_link_ops *rtnl_link_ops_get(const char *kind)
+static struct rtnl_link_ops *rtnl_link_ops_get(const char *kind, int *srcu_index)
{
- const struct rtnl_link_ops *ops;
+ struct rtnl_link_ops *ops;
- list_for_each_entry(ops, &link_ops, list) {
- if (!strcmp(ops->kind, kind))
- return ops;
+ rcu_read_lock();
+
+ list_for_each_entry_rcu(ops, &link_ops, list) {
+ if (!strcmp(ops->kind, kind)) {
+ *srcu_index = srcu_read_lock(&ops->srcu);
+ goto unlock;
+ }
}
- return NULL;
+
+ ops = NULL;
+unlock:
+ rcu_read_unlock();
+
+ return ops;
+}
+
+static void rtnl_link_ops_put(struct rtnl_link_ops *ops, int srcu_index)
+{
+ srcu_read_unlock(&ops->srcu, srcu_index);
}
/**
@@ -479,8 +493,15 @@ static const struct rtnl_link_ops *rtnl_link_ops_get(const char *kind)
*/
int __rtnl_link_register(struct rtnl_link_ops *ops)
{
- if (rtnl_link_ops_get(ops->kind))
- return -EEXIST;
+ struct rtnl_link_ops *tmp;
+
+ /* When RTNL is removed, add lock for link_ops. */
+ ASSERT_RTNL();
+
+ list_for_each_entry(tmp, &link_ops, list) {
+ if (!strcmp(ops->kind, tmp->kind))
+ return -EEXIST;
+ }
/* The check for alloc/setup is here because if ops
* does not have that filled up, it is not possible
@@ -490,7 +511,9 @@ int __rtnl_link_register(struct rtnl_link_ops *ops)
if ((ops->alloc || ops->setup) && !ops->dellink)
ops->dellink = unregister_netdevice_queue;
- list_add_tail(&ops->list, &link_ops);
+ init_srcu_struct(&ops->srcu);
+ list_add_tail_rcu(&ops->list, &link_ops);
+
return 0;
}
EXPORT_SYMBOL_GPL(__rtnl_link_register);
@@ -541,10 +564,11 @@ void __rtnl_link_unregister(struct rtnl_link_ops *ops)
{
struct net *net;
- for_each_net(net) {
+ list_del_rcu(&ops->list);
+ synchronize_srcu(&ops->srcu);
+
+ for_each_net(net)
__rtnl_kill_links(net, ops);
- }
- list_del(&ops->list);
}
EXPORT_SYMBOL_GPL(__rtnl_link_unregister);
@@ -2157,10 +2181,11 @@ static const struct nla_policy ifla_xdp_policy[IFLA_XDP_MAX + 1] = {
[IFLA_XDP_PROG_ID] = { .type = NLA_U32 },
};
-static const struct rtnl_link_ops *linkinfo_to_kind_ops(const struct nlattr *nla)
+static struct rtnl_link_ops *linkinfo_to_kind_ops(const struct nlattr *nla,
+ int *ops_srcu_index)
{
- const struct rtnl_link_ops *ops = NULL;
struct nlattr *linfo[IFLA_INFO_MAX + 1];
+ struct rtnl_link_ops *ops = NULL;
if (nla_parse_nested_deprecated(linfo, IFLA_INFO_MAX, nla, ifla_info_policy, NULL) < 0)
return NULL;
@@ -2169,7 +2194,7 @@ static const struct rtnl_link_ops *linkinfo_to_kind_ops(const struct nlattr *nla
char kind[MODULE_NAME_LEN];
nla_strscpy(kind, linfo[IFLA_INFO_KIND], sizeof(kind));
- ops = rtnl_link_ops_get(kind);
+ ops = rtnl_link_ops_get(kind, ops_srcu_index);
}
return ops;
@@ -2289,8 +2314,8 @@ static int rtnl_valid_dump_ifinfo_req(const struct nlmsghdr *nlh,
static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
{
- const struct rtnl_link_ops *kind_ops = NULL;
struct netlink_ext_ack *extack = cb->extack;
+ struct rtnl_link_ops *kind_ops = NULL;
const struct nlmsghdr *nlh = cb->nlh;
struct net *net = sock_net(skb->sk);
unsigned int flags = NLM_F_MULTI;
@@ -2301,6 +2326,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
struct net *tgt_net = net;
u32 ext_filter_mask = 0;
struct net_device *dev;
+ int ops_srcu_index;
int master_idx = 0;
int netnsid = -1;
int err, i;
@@ -2334,7 +2360,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
master_idx = nla_get_u32(tb[i]);
break;
case IFLA_LINKINFO:
- kind_ops = linkinfo_to_kind_ops(tb[i]);
+ kind_ops = linkinfo_to_kind_ops(tb[i], &ops_srcu_index);
break;
default:
if (cb->strict_check) {
@@ -2360,6 +2386,10 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
if (err < 0)
break;
}
+
+ if (kind_ops)
+ rtnl_link_ops_put(kind_ops, ops_srcu_index);
+
cb->seq = tgt_net->dev_base_seq;
nl_dump_check_consistent(cb, nlmsg_hdr(skb));
if (netnsid >= 0)
@@ -3746,8 +3776,9 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct nlattr **tb, **linkinfo, **data = NULL;
- const struct rtnl_link_ops *ops = NULL;
+ struct rtnl_link_ops *ops = NULL;
struct rtnl_newlink_tbs *tbs;
+ int ops_srcu_index;
int ret;
tbs = kmalloc(sizeof(*tbs), GFP_KERNEL);
@@ -3779,13 +3810,13 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
char kind[MODULE_NAME_LEN];
nla_strscpy(kind, linkinfo[IFLA_INFO_KIND], sizeof(kind));
- ops = rtnl_link_ops_get(kind);
+ ops = rtnl_link_ops_get(kind, &ops_srcu_index);
#ifdef CONFIG_MODULES
if (!ops) {
__rtnl_unlock();
request_module("rtnl-link-%s", kind);
rtnl_lock();
- ops = rtnl_link_ops_get(kind);
+ ops = rtnl_link_ops_get(kind, &ops_srcu_index);
}
#endif
}
@@ -3799,7 +3830,7 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
linkinfo[IFLA_INFO_DATA],
ops->policy, extack);
if (ret < 0)
- goto free;
+ goto put_ops;
data = tbs->attr;
}
@@ -3807,12 +3838,15 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (ops->validate) {
ret = ops->validate(tb, data, extack);
if (ret < 0)
- goto free;
+ goto put_ops;
}
}
ret = __rtnl_newlink(skb, nlh, ops, tbs, data, extack);
+put_ops:
+ if (ops)
+ rtnl_link_ops_put(ops, ops_srcu_index);
free:
kfree(tbs);
return ret;
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v1 net-next 08/13] rtnetlink: Call rtnl_link_get_net_capable() in rtnl_newlink().
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (6 preceding siblings ...)
2024-10-09 23:16 ` [PATCH v1 net-next 07/13] rtnetlink: Protect struct rtnl_link_ops with SRCU Kuniyuki Iwashima
@ 2024-10-09 23:16 ` Kuniyuki Iwashima
2024-10-10 13:08 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 09/13] rtnetlink: Fetch IFLA_LINK_NETNSID " Kuniyuki Iwashima
` (4 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
As a prerequisite of per-netns RTNL, we must fetch netns before
looking up dev or moving it to another netns.
rtnl_link_get_net_capable() is called in rtnl_newlink_create() and
do_setlink(), but both of them need to be moved to the RTNL-independent
region, which will be rtnl_newlink().
Let's call rtnl_link_get_net_capable() in rtnl_newlink() and pass the
netns down to where needed.
Note that the latter two have not passed the nets to do_setlink() yet
but will do so after the remaining rtnl_link_get_net_capable() is moved
to rtnl_setlink() later.
While at it, dest_net is renamed to tgt_net in rtnl_newlink_create() to
align with rtnl_{del,set}link().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/core/rtnetlink.c | 51 ++++++++++++++++++++++++--------------------
1 file changed, 28 insertions(+), 23 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 7f464554d881..fc62f23d2647 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3543,7 +3543,7 @@ struct rtnl_newlink_tbs {
static int rtnl_changelink(const struct sk_buff *skb, struct nlmsghdr *nlh,
const struct rtnl_link_ops *ops,
- struct net_device *dev,
+ struct net_device *dev, struct net *tgt_net,
struct rtnl_newlink_tbs *tbs,
struct nlattr **data,
struct netlink_ext_ack *extack)
@@ -3607,10 +3607,10 @@ static int rtnl_changelink(const struct sk_buff *skb, struct nlmsghdr *nlh,
}
static int rtnl_group_changelink(const struct sk_buff *skb,
- struct net *net, int group,
- struct ifinfomsg *ifm,
- struct netlink_ext_ack *extack,
- struct nlattr **tb)
+ struct net *net, struct net *tgt_net,
+ int group, struct ifinfomsg *ifm,
+ struct netlink_ext_ack *extack,
+ struct nlattr **tb)
{
struct net_device *dev, *aux;
int err;
@@ -3628,6 +3628,7 @@ static int rtnl_group_changelink(const struct sk_buff *skb,
static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
const struct rtnl_link_ops *ops,
+ struct net *tgt_net,
const struct nlmsghdr *nlh,
struct nlattr **tb, struct nlattr **data,
struct netlink_ext_ack *extack)
@@ -3635,9 +3636,9 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
unsigned char name_assign_type = NET_NAME_USER;
struct net *net = sock_net(skb->sk);
u32 portid = NETLINK_CB(skb).portid;
- struct net *dest_net, *link_net;
struct net_device *dev;
char ifname[IFNAMSIZ];
+ struct net *link_net;
int err;
if (!ops->alloc && !ops->setup)
@@ -3650,14 +3651,10 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
name_assign_type = NET_NAME_ENUM;
}
- dest_net = rtnl_link_get_net_capable(skb, net, tb, CAP_NET_ADMIN);
- if (IS_ERR(dest_net))
- return PTR_ERR(dest_net);
-
if (tb[IFLA_LINK_NETNSID]) {
int id = nla_get_s32(tb[IFLA_LINK_NETNSID]);
- link_net = get_net_ns_by_id(dest_net, id);
+ link_net = get_net_ns_by_id(tgt_net, id);
if (!link_net) {
NL_SET_ERR_MSG(extack, "Unknown network namespace id");
err = -EINVAL;
@@ -3670,7 +3667,7 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
link_net = NULL;
}
- dev = rtnl_create_link(link_net ? : dest_net, ifname,
+ dev = rtnl_create_link(link_net ? : tgt_net, ifname,
name_assign_type, ops, tb, extack);
if (IS_ERR(dev)) {
err = PTR_ERR(dev);
@@ -3692,7 +3689,7 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
if (err < 0)
goto out_unregister;
if (link_net) {
- err = dev_change_net_namespace(dev, dest_net, ifname);
+ err = dev_change_net_namespace(dev, tgt_net, ifname);
if (err < 0)
goto out_unregister;
}
@@ -3704,7 +3701,7 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
out:
if (link_net)
put_net(link_net);
- put_net(dest_net);
+
return err;
out_unregister:
if (ops->newlink) {
@@ -3720,6 +3717,7 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
const struct rtnl_link_ops *ops,
+ struct net *tgt_net,
struct rtnl_newlink_tbs *tbs,
struct nlattr **data,
struct netlink_ext_ack *extack)
@@ -3746,19 +3744,18 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
}
if (dev)
- return rtnl_changelink(skb, nlh, ops, dev, tbs, data, extack);
+ return rtnl_changelink(skb, nlh, ops, dev, tgt_net, tbs, data, extack);
if (!(nlh->nlmsg_flags & NLM_F_CREATE)) {
/* No dev found and NLM_F_CREATE not set. Requested dev does not exist,
* or it's for a group
*/
- if (link_specified)
+ if (link_specified || !tb[IFLA_GROUP])
return -ENODEV;
- if (tb[IFLA_GROUP])
- return rtnl_group_changelink(skb, net,
- nla_get_u32(tb[IFLA_GROUP]),
- ifm, extack, tb);
- return -ENODEV;
+
+ return rtnl_group_changelink(skb, net, tgt_net,
+ nla_get_u32(tb[IFLA_GROUP]),
+ ifm, extack, tb);
}
if (tb[IFLA_MAP] || tb[IFLA_PROTINFO])
@@ -3769,7 +3766,7 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EOPNOTSUPP;
}
- return rtnl_newlink_create(skb, ifm, ops, nlh, tb, data, extack);
+ return rtnl_newlink_create(skb, ifm, ops, tgt_net, nlh, tb, data, extack);
}
static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
@@ -3778,6 +3775,7 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct nlattr **tb, **linkinfo, **data = NULL;
struct rtnl_link_ops *ops = NULL;
struct rtnl_newlink_tbs *tbs;
+ struct net *tgt_net;
int ops_srcu_index;
int ret;
@@ -3842,8 +3840,15 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
}
}
- ret = __rtnl_newlink(skb, nlh, ops, tbs, data, extack);
+ tgt_net = rtnl_link_get_net_capable(skb, sock_net(skb->sk), tb, CAP_NET_ADMIN);
+ if (IS_ERR(tgt_net)) {
+ ret = PTR_ERR(tgt_net);
+ goto put_ops;
+ }
+
+ ret = __rtnl_newlink(skb, nlh, ops, tgt_net, tbs, data, extack);
+ put_net(tgt_net);
put_ops:
if (ops)
rtnl_link_ops_put(ops, ops_srcu_index);
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v1 net-next 09/13] rtnetlink: Fetch IFLA_LINK_NETNSID in rtnl_newlink().
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (7 preceding siblings ...)
2024-10-09 23:16 ` [PATCH v1 net-next 08/13] rtnetlink: Call rtnl_link_get_net_capable() in rtnl_newlink() Kuniyuki Iwashima
@ 2024-10-09 23:16 ` Kuniyuki Iwashima
2024-10-10 13:09 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 10/13] rtnetlink: Clean up rtnl_dellink() Kuniyuki Iwashima
` (3 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
Another netns option for RTM_NEWLINK is IFLA_LINK_NETNSID and
is fetched in rtnl_newlink_create().
This must be done before holding rtnl_net_lock().
Let's move IFLA_LINK_NETNSID processing to rtnl_newlink().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/core/rtnetlink.c | 49 ++++++++++++++++++++++----------------------
1 file changed, 24 insertions(+), 25 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index fc62f23d2647..70a3a9f411d8 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3628,7 +3628,7 @@ static int rtnl_group_changelink(const struct sk_buff *skb,
static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
const struct rtnl_link_ops *ops,
- struct net *tgt_net,
+ struct net *tgt_net, struct net *link_net,
const struct nlmsghdr *nlh,
struct nlattr **tb, struct nlattr **data,
struct netlink_ext_ack *extack)
@@ -3638,7 +3638,6 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
u32 portid = NETLINK_CB(skb).portid;
struct net_device *dev;
char ifname[IFNAMSIZ];
- struct net *link_net;
int err;
if (!ops->alloc && !ops->setup)
@@ -3651,22 +3650,6 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
name_assign_type = NET_NAME_ENUM;
}
- if (tb[IFLA_LINK_NETNSID]) {
- int id = nla_get_s32(tb[IFLA_LINK_NETNSID]);
-
- link_net = get_net_ns_by_id(tgt_net, id);
- if (!link_net) {
- NL_SET_ERR_MSG(extack, "Unknown network namespace id");
- err = -EINVAL;
- goto out;
- }
- err = -EPERM;
- if (!netlink_ns_capable(skb, link_net->user_ns, CAP_NET_ADMIN))
- goto out;
- } else {
- link_net = NULL;
- }
-
dev = rtnl_create_link(link_net ? : tgt_net, ifname,
name_assign_type, ops, tb, extack);
if (IS_ERR(dev)) {
@@ -3699,9 +3682,6 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
goto out_unregister;
}
out:
- if (link_net)
- put_net(link_net);
-
return err;
out_unregister:
if (ops->newlink) {
@@ -3717,7 +3697,7 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
const struct rtnl_link_ops *ops,
- struct net *tgt_net,
+ struct net *tgt_net, struct net *link_net,
struct rtnl_newlink_tbs *tbs,
struct nlattr **data,
struct netlink_ext_ack *extack)
@@ -3766,16 +3746,16 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EOPNOTSUPP;
}
- return rtnl_newlink_create(skb, ifm, ops, tgt_net, nlh, tb, data, extack);
+ return rtnl_newlink_create(skb, ifm, ops, tgt_net, link_net, nlh, tb, data, extack);
}
static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct nlattr **tb, **linkinfo, **data = NULL;
+ struct net *tgt_net, *link_net = NULL;
struct rtnl_link_ops *ops = NULL;
struct rtnl_newlink_tbs *tbs;
- struct net *tgt_net;
int ops_srcu_index;
int ret;
@@ -3846,8 +3826,27 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
goto put_ops;
}
- ret = __rtnl_newlink(skb, nlh, ops, tgt_net, tbs, data, extack);
+ if (tb[IFLA_LINK_NETNSID]) {
+ int id = nla_get_s32(tb[IFLA_LINK_NETNSID]);
+
+ link_net = get_net_ns_by_id(tgt_net, id);
+ if (!link_net) {
+ NL_SET_ERR_MSG(extack, "Unknown network namespace id");
+ ret = -EINVAL;
+ goto put_net;
+ }
+
+ if (!netlink_ns_capable(skb, link_net->user_ns, CAP_NET_ADMIN)) {
+ ret = -EPERM;
+ goto put_net;
+ }
+ }
+
+ ret = __rtnl_newlink(skb, nlh, ops, tgt_net, link_net, tbs, data, extack);
+put_net:
+ if (link_net)
+ put_net(link_net);
put_net(tgt_net);
put_ops:
if (ops)
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v1 net-next 10/13] rtnetlink: Clean up rtnl_dellink().
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (8 preceding siblings ...)
2024-10-09 23:16 ` [PATCH v1 net-next 09/13] rtnetlink: Fetch IFLA_LINK_NETNSID " Kuniyuki Iwashima
@ 2024-10-09 23:16 ` Kuniyuki Iwashima
2024-10-10 13:10 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 11/13] rtnetlink: Clean up rtnl_setlink() Kuniyuki Iwashima
` (2 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We will push RTNL down to rtnl_delink().
Let's unify the error path to make it easy to place rtnl_net_lock().
While at it, keep the variables in reverse xmas order.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/core/rtnetlink.c | 27 ++++++++++-----------------
1 file changed, 10 insertions(+), 17 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 70a3a9f411d8..59a83fd52a92 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3362,14 +3362,14 @@ EXPORT_SYMBOL_GPL(rtnl_delete_link);
static int rtnl_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
+ struct ifinfomsg *ifm = nlmsg_data(nlh);
struct net *net = sock_net(skb->sk);
u32 portid = NETLINK_CB(skb).portid;
- struct net *tgt_net = net;
- struct net_device *dev = NULL;
- struct ifinfomsg *ifm;
struct nlattr *tb[IFLA_MAX+1];
- int err;
+ struct net_device *dev = NULL;
+ struct net *tgt_net = net;
int netnsid = -1;
+ int err;
err = nlmsg_parse_deprecated(nlh, sizeof(*ifm), tb, IFLA_MAX,
ifla_policy, extack);
@@ -3387,27 +3387,20 @@ static int rtnl_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
return PTR_ERR(tgt_net);
}
- err = -EINVAL;
- ifm = nlmsg_data(nlh);
if (ifm->ifi_index > 0)
dev = __dev_get_by_index(tgt_net, ifm->ifi_index);
else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME])
dev = rtnl_dev_get(tgt_net, tb);
+
+ if (dev)
+ err = rtnl_delete_link(dev, portid, nlh);
+ else if (ifm->ifi_index > 0 || tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME])
+ err = -ENODEV;
else if (tb[IFLA_GROUP])
err = rtnl_group_dellink(tgt_net, nla_get_u32(tb[IFLA_GROUP]));
else
- goto out;
-
- if (!dev) {
- if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME] || ifm->ifi_index > 0)
- err = -ENODEV;
-
- goto out;
- }
-
- err = rtnl_delete_link(dev, portid, nlh);
+ err = -EINVAL;
-out:
if (netnsid >= 0)
put_net(tgt_net);
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v1 net-next 11/13] rtnetlink: Clean up rtnl_setlink().
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (9 preceding siblings ...)
2024-10-09 23:16 ` [PATCH v1 net-next 10/13] rtnetlink: Clean up rtnl_dellink() Kuniyuki Iwashima
@ 2024-10-09 23:16 ` Kuniyuki Iwashima
2024-10-10 13:11 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 12/13] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink() Kuniyuki Iwashima
2024-10-09 23:16 ` [PATCH v1 net-next 13/13] rtnetlink: Protect struct rtnl_af_ops with SRCU Kuniyuki Iwashima
12 siblings, 1 reply; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We will push RTNL down to rtnl_setlink().
Let's unify the error path to make it easy to place rtnl_net_lock().
While at it, keep the variables in reverse xmas order.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/core/rtnetlink.c | 17 +++++++----------
1 file changed, 7 insertions(+), 10 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 59a83fd52a92..de693a88986e 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3273,11 +3273,11 @@ static struct net_device *rtnl_dev_get(struct net *net,
static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
+ struct ifinfomsg *ifm = nlmsg_data(nlh);
struct net *net = sock_net(skb->sk);
- struct ifinfomsg *ifm;
- struct net_device *dev;
- int err;
struct nlattr *tb[IFLA_MAX+1];
+ struct net_device *dev = NULL;
+ int err;
err = nlmsg_parse_deprecated(nlh, sizeof(*ifm), tb, IFLA_MAX,
ifla_policy, extack);
@@ -3288,21 +3288,18 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (err < 0)
goto errout;
- err = -EINVAL;
- ifm = nlmsg_data(nlh);
if (ifm->ifi_index > 0)
dev = __dev_get_by_index(net, ifm->ifi_index);
else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME])
dev = rtnl_dev_get(net, tb);
else
- goto errout;
+ err = -EINVAL;
- if (dev == NULL) {
+ if (dev)
+ err = do_setlink(skb, dev, ifm, extack, tb, 0);
+ else if (!err)
err = -ENODEV;
- goto errout;
- }
- err = do_setlink(skb, dev, ifm, extack, tb, 0);
errout:
return err;
}
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v1 net-next 12/13] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink().
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (10 preceding siblings ...)
2024-10-09 23:16 ` [PATCH v1 net-next 11/13] rtnetlink: Clean up rtnl_setlink() Kuniyuki Iwashima
@ 2024-10-09 23:16 ` Kuniyuki Iwashima
2024-10-10 13:15 ` Eric Dumazet
2024-10-11 7:36 ` kernel test robot
2024-10-09 23:16 ` [PATCH v1 net-next 13/13] rtnetlink: Protect struct rtnl_af_ops with SRCU Kuniyuki Iwashima
12 siblings, 2 replies; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We will push RTNL down to rtnl_setlink().
RTM_SETLINK could call rtnl_link_get_net_capable() in do_setlink()
to move a dev to a new netns, but the netns needs to be fetched before
holding rtnl_net_lock().
Let's move it to rtnl_setlink() and pass the netns to do_setlink().
Now, RTM_NEWLINK paths (rtnl_changelink() and rtnl_group_changelink())
can pass the prefetched netns to do_setlink().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/core/rtnetlink.c | 31 +++++++++++++++----------------
1 file changed, 15 insertions(+), 16 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index de693a88986e..a0702e531331 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2875,8 +2875,8 @@ static int do_set_proto_down(struct net_device *dev,
#define DO_SETLINK_MODIFIED 0x01
/* notify flag means notify + modified. */
#define DO_SETLINK_NOTIFY 0x03
-static int do_setlink(const struct sk_buff *skb,
- struct net_device *dev, struct ifinfomsg *ifm,
+static int do_setlink(const struct sk_buff *skb, struct net_device *dev,
+ struct net *tgt_net, struct ifinfomsg *ifm,
struct netlink_ext_ack *extack,
struct nlattr **tb, int status)
{
@@ -2893,27 +2893,19 @@ static int do_setlink(const struct sk_buff *skb,
else
ifname[0] = '\0';
- if (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD] || tb[IFLA_TARGET_NETNSID]) {
+ if (!net_eq(tgt_net, dev_net(dev))) {
const char *pat = ifname[0] ? ifname : NULL;
- struct net *net;
int new_ifindex;
- net = rtnl_link_get_net_capable(skb, dev_net(dev),
- tb, CAP_NET_ADMIN);
- if (IS_ERR(net)) {
- err = PTR_ERR(net);
- goto errout;
- }
-
if (tb[IFLA_NEW_IFINDEX])
new_ifindex = nla_get_s32(tb[IFLA_NEW_IFINDEX]);
else
new_ifindex = 0;
- err = __dev_change_net_namespace(dev, net, pat, new_ifindex);
- put_net(net);
+ err = __dev_change_net_namespace(dev, tgt_net, pat, new_ifindex);
if (err)
goto errout;
+
status |= DO_SETLINK_MODIFIED;
}
@@ -3277,6 +3269,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct net *net = sock_net(skb->sk);
struct nlattr *tb[IFLA_MAX+1];
struct net_device *dev = NULL;
+ struct net *tgt_net;
int err;
err = nlmsg_parse_deprecated(nlh, sizeof(*ifm), tb, IFLA_MAX,
@@ -3288,6 +3281,10 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (err < 0)
goto errout;
+ tgt_net = rtnl_link_get_net_capable(skb, net, tb, CAP_NET_ADMIN);
+ if (IS_ERR(tgt_net))
+ return PTR_ERR(tgt_net);
+
if (ifm->ifi_index > 0)
dev = __dev_get_by_index(net, ifm->ifi_index);
else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME])
@@ -3296,11 +3293,13 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
err = -EINVAL;
if (dev)
- err = do_setlink(skb, dev, ifm, extack, tb, 0);
+ err = do_setlink(skb, dev, tgt_net, ifm, extack, tb, 0);
else if (!err)
err = -ENODEV;
errout:
+ put_net(tgt_net);
+
return err;
}
@@ -3593,7 +3592,7 @@ static int rtnl_changelink(const struct sk_buff *skb, struct nlmsghdr *nlh,
status |= DO_SETLINK_NOTIFY;
}
- return do_setlink(skb, dev, nlmsg_data(nlh), extack, tb, status);
+ return do_setlink(skb, dev, tgt_net, nlmsg_data(nlh), extack, tb, status);
}
static int rtnl_group_changelink(const struct sk_buff *skb,
@@ -3607,7 +3606,7 @@ static int rtnl_group_changelink(const struct sk_buff *skb,
for_each_netdev_safe(net, dev, aux) {
if (dev->group == group) {
- err = do_setlink(skb, dev, ifm, extack, tb, 0);
+ err = do_setlink(skb, dev, tgt_net, ifm, extack, tb, 0);
if (err < 0)
return err;
}
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v1 net-next 13/13] rtnetlink: Protect struct rtnl_af_ops with SRCU.
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
` (11 preceding siblings ...)
2024-10-09 23:16 ` [PATCH v1 net-next 12/13] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink() Kuniyuki Iwashima
@ 2024-10-09 23:16 ` Kuniyuki Iwashima
2024-10-10 13:16 ` Eric Dumazet
12 siblings, 1 reply; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-09 23:16 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
Once RTNL is replaced with rtnl_net_lock(), we need a mechanism to
guarantee that rtnl_af_ops is alive during inflight RTM_SETLINK
even when its module is being unloaded.
Let's use SRCU to protect rtnl_af_ops.
rtnl_af_lookup() now iterates rtnl_af_ops under RCU and returns
SRCU-protected ops pointer. The caller must call rtnl_af_put()
to release the pointer after the use.
Also, rtnl_af_unregister() unlinks the ops first and calls
synchronize_srcu() to wait for inflight RTM_SETLINK requests to
complete.
Note that rtnl_af_ops needs to be protected by its dedicated lock
when RTNL is removed.
Note also that BUG_ON() in do_setlink() is changed to the normal
error handling as a different af_ops might be found after
validate_linkmsg().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
include/net/rtnetlink.h | 5 +++-
net/core/rtnetlink.c | 58 +++++++++++++++++++++++++++++------------
2 files changed, 46 insertions(+), 17 deletions(-)
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index c873fd6193ed..407a2f56f00a 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -150,7 +150,8 @@ void rtnl_link_unregister(struct rtnl_link_ops *ops);
/**
* struct rtnl_af_ops - rtnetlink address family operations
*
- * @list: Used internally
+ * @list: Used internally, protected by RTNL and SRCU
+ * @srcu: Used internally
* @family: Address family
* @fill_link_af: Function to fill IFLA_AF_SPEC with address family
* specific netlink attributes.
@@ -163,6 +164,8 @@ void rtnl_link_unregister(struct rtnl_link_ops *ops);
*/
struct rtnl_af_ops {
struct list_head list;
+ struct srcu_struct srcu;
+
int family;
int (*fill_link_af)(struct sk_buff *skb,
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a0702e531331..817165f6d5ef 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -660,18 +660,31 @@ static size_t rtnl_link_get_size(const struct net_device *dev)
static LIST_HEAD(rtnl_af_ops);
-static const struct rtnl_af_ops *rtnl_af_lookup(const int family)
+static struct rtnl_af_ops *rtnl_af_lookup(const int family, int *srcu_index)
{
- const struct rtnl_af_ops *ops;
+ struct rtnl_af_ops *ops;
ASSERT_RTNL();
- list_for_each_entry(ops, &rtnl_af_ops, list) {
- if (ops->family == family)
- return ops;
+ rcu_read_lock();
+
+ list_for_each_entry_rcu(ops, &rtnl_af_ops, list) {
+ if (ops->family == family) {
+ *srcu_index = srcu_read_lock(&ops->srcu);
+ goto unlock;
+ }
}
- return NULL;
+ ops = NULL;
+unlock:
+ rcu_read_unlock();
+
+ return ops;
+}
+
+static void rtnl_af_put(struct rtnl_af_ops *ops, int srcu_index)
+{
+ srcu_read_unlock(&ops->srcu, srcu_index);
}
/**
@@ -683,6 +696,7 @@ static const struct rtnl_af_ops *rtnl_af_lookup(const int family)
void rtnl_af_register(struct rtnl_af_ops *ops)
{
rtnl_lock();
+ init_srcu_struct(&ops->srcu);
list_add_tail_rcu(&ops->list, &rtnl_af_ops);
rtnl_unlock();
}
@@ -699,6 +713,7 @@ void rtnl_af_unregister(struct rtnl_af_ops *ops)
rtnl_unlock();
synchronize_rcu();
+ synchronize_srcu(&ops->srcu);
}
EXPORT_SYMBOL_GPL(rtnl_af_unregister);
@@ -2571,20 +2586,24 @@ static int validate_linkmsg(struct net_device *dev, struct nlattr *tb[],
int rem, err;
nla_for_each_nested(af, tb[IFLA_AF_SPEC], rem) {
- const struct rtnl_af_ops *af_ops;
+ struct rtnl_af_ops *af_ops;
+ int srcu_ops_index;
- af_ops = rtnl_af_lookup(nla_type(af));
+ af_ops = rtnl_af_lookup(nla_type(af), &srcu_ops_index);
if (!af_ops)
return -EAFNOSUPPORT;
if (!af_ops->set_link_af)
- return -EOPNOTSUPP;
-
- if (af_ops->validate_link_af) {
+ err = -EOPNOTSUPP;
+ else if (af_ops->validate_link_af)
err = af_ops->validate_link_af(dev, af, extack);
- if (err < 0)
- return err;
- }
+ else
+ err = 0;
+
+ rtnl_af_put(af_ops, srcu_ops_index);
+
+ if (err < 0)
+ return err;
}
}
@@ -3164,11 +3183,18 @@ static int do_setlink(const struct sk_buff *skb, struct net_device *dev,
int rem;
nla_for_each_nested(af, tb[IFLA_AF_SPEC], rem) {
- const struct rtnl_af_ops *af_ops;
+ struct rtnl_af_ops *af_ops;
+ int srcu_ops_index;
- BUG_ON(!(af_ops = rtnl_af_lookup(nla_type(af))));
+ af_ops = rtnl_af_lookup(nla_type(af), &srcu_ops_index);
+ if (!af_ops) {
+ err = -EAFNOSUPPORT;
+ goto errout;
+ }
err = af_ops->set_link_af(dev, af, extack);
+ rtnl_af_put(af_ops, srcu_ops_index);
+
if (err < 0)
goto errout;
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 01/13] rtnetlink: Allocate linkinfo[] as struct rtnl_newlink_tbs.
2024-10-09 23:16 ` [PATCH v1 net-next 01/13] rtnetlink: Allocate linkinfo[] as struct rtnl_newlink_tbs Kuniyuki Iwashima
@ 2024-10-10 12:36 ` Eric Dumazet
0 siblings, 0 replies; 30+ messages in thread
From: Eric Dumazet @ 2024-10-10 12:36 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima,
netdev
On Thu, Oct 10, 2024 at 1:17 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> We will move linkinfo to rtnl_newlink() and pass it down to other
> functions.
>
> Let's pack it into rtnl_newlink_tbs.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 03/13] rtnetlink: Factorise do_setlink() path from __rtnl_newlink().
2024-10-09 23:16 ` [PATCH v1 net-next 03/13] rtnetlink: Factorise do_setlink() path from __rtnl_newlink() Kuniyuki Iwashima
@ 2024-10-10 12:39 ` Eric Dumazet
0 siblings, 0 replies; 30+ messages in thread
From: Eric Dumazet @ 2024-10-10 12:39 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima,
netdev
On Thu, Oct 10, 2024 at 1:18 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> __rtnl_newlink() got too long to maintain.
>
> For example, netdev_master_upper_dev_get()->rtnl_link_ops is fetched even
> when IFLA_INFO_SLAVE_DATA is not specified.
>
> Let's factorise the single dev do_setlink() path to a separate function.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 04/13] rtnetlink: Move simple validation from __rtnl_newlink() to rtnl_newlink().
2024-10-09 23:16 ` [PATCH v1 net-next 04/13] rtnetlink: Move simple validation from __rtnl_newlink() to rtnl_newlink() Kuniyuki Iwashima
@ 2024-10-10 12:40 ` Eric Dumazet
0 siblings, 0 replies; 30+ messages in thread
From: Eric Dumazet @ 2024-10-10 12:40 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima,
netdev
On Thu, Oct 10, 2024 at 1:18 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> We will push RTNL down to rtnl_newlink().
>
> Let's move RTNL-independent validation to rtnl_newlink().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 02/13] rtnetlink: Call validate_linkmsg() in do_setlink().
2024-10-09 23:16 ` [PATCH v1 net-next 02/13] rtnetlink: Call validate_linkmsg() in do_setlink() Kuniyuki Iwashima
@ 2024-10-10 12:44 ` Eric Dumazet
0 siblings, 0 replies; 30+ messages in thread
From: Eric Dumazet @ 2024-10-10 12:44 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima,
netdev
On Thu, Oct 10, 2024 at 1:17 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> There are 3 paths that finally call do_setlink(), and validate_linkmsg()
> is called in each path.
>
> 1. RTM_NEWLINK
> 1-1. dev is found in __rtnl_newlink()
> 1-2. dev isn't found, but IFLA_GROUP is specified in
> rtnl_group_changelink()
> 2. RTM_SETLINK
>
> The next patch factorises 1-1 to a separate function.
>
> As a preparation, let's move validate_linkmsg() calls to do_setlink().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 06/13] rtnetlink: Move ops->validate to rtnl_newlink().
2024-10-09 23:16 ` [PATCH v1 net-next 06/13] rtnetlink: Move ops->validate " Kuniyuki Iwashima
@ 2024-10-10 12:48 ` Eric Dumazet
0 siblings, 0 replies; 30+ messages in thread
From: Eric Dumazet @ 2024-10-10 12:48 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima,
netdev
On Thu, Oct 10, 2024 at 1:19 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> ops->validate() does not require RTNL.
>
> Let's move it to rtnl_newlink().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
>
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 07/13] rtnetlink: Protect struct rtnl_link_ops with SRCU.
2024-10-09 23:16 ` [PATCH v1 net-next 07/13] rtnetlink: Protect struct rtnl_link_ops with SRCU Kuniyuki Iwashima
@ 2024-10-10 13:02 ` Eric Dumazet
2024-10-10 16:20 ` Kuniyuki Iwashima
0 siblings, 1 reply; 30+ messages in thread
From: Eric Dumazet @ 2024-10-10 13:02 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima,
netdev
On Thu, Oct 10, 2024 at 1:19 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> Once RTNL is replaced with rtnl_net_lock(), we need a mechanism to
> guarantee that rtnl_link_ops is alive during inflight RTM_NEWLINK
> even when its module is being unloaded.
>
> Let's use SRCU to protect rtnl_link_ops.
>
> rtnl_link_ops_get() now iterates link_ops under RCU and returns
> SRCU-protected ops pointer. The caller must call rtnl_link_ops_put()
> to release the pointer after the use.
>
> Also, __rtnl_link_unregister() unlinks the ops first and calls
> synchronize_srcu() to wait for inflight RTM_NEWLINK requests to
> complete.
>
> Note that link_ops needs to be protected by its dedicated lock
> when RTNL is removed.
>
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
> include/net/rtnetlink.h | 5 ++-
> net/core/rtnetlink.c | 78 +++++++++++++++++++++++++++++------------
> 2 files changed, 60 insertions(+), 23 deletions(-)
>
> diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
> index b45d57b5968a..c873fd6193ed 100644
> --- a/include/net/rtnetlink.h
> +++ b/include/net/rtnetlink.h
> @@ -3,6 +3,7 @@
> #define __NET_RTNETLINK_H
>
> #include <linux/rtnetlink.h>
> +#include <linux/srcu.h>
> #include <net/netlink.h>
>
> typedef int (*rtnl_doit_func)(struct sk_buff *, struct nlmsghdr *,
> @@ -47,7 +48,8 @@ static inline int rtnl_msg_family(const struct nlmsghdr *nlh)
> /**
> * struct rtnl_link_ops - rtnetlink link operations
> *
> - * @list: Used internally
> + * @list: Used internally, protected by RTNL and SRCU
> + * @srcu: Used internally
> * @kind: Identifier
> * @netns_refund: Physical device, move to init_net on netns exit
> * @maxtype: Highest device specific netlink attribute number
> @@ -78,6 +80,7 @@ static inline int rtnl_msg_family(const struct nlmsghdr *nlh)
> */
> struct rtnl_link_ops {
> struct list_head list;
> + struct srcu_struct srcu;
>
> const char *kind;
>
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 24545c5b7e48..7f464554d881 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -456,15 +456,29 @@ EXPORT_SYMBOL_GPL(rtnl_unregister_all);
>
> static LIST_HEAD(link_ops);
>
> -static const struct rtnl_link_ops *rtnl_link_ops_get(const char *kind)
> +static struct rtnl_link_ops *rtnl_link_ops_get(const char *kind, int *srcu_index)
> {
> - const struct rtnl_link_ops *ops;
> + struct rtnl_link_ops *ops;
>
> - list_for_each_entry(ops, &link_ops, list) {
> - if (!strcmp(ops->kind, kind))
> - return ops;
> + rcu_read_lock();
> +
> + list_for_each_entry_rcu(ops, &link_ops, list) {
> + if (!strcmp(ops->kind, kind)) {
> + *srcu_index = srcu_read_lock(&ops->srcu);
> + goto unlock;
> + }
> }
> - return NULL;
> +
> + ops = NULL;
> +unlock:
> + rcu_read_unlock();
> +
> + return ops;
> +}
> +
> +static void rtnl_link_ops_put(struct rtnl_link_ops *ops, int srcu_index)
> +{
> + srcu_read_unlock(&ops->srcu, srcu_index);
> }
>
> /**
> @@ -479,8 +493,15 @@ static const struct rtnl_link_ops *rtnl_link_ops_get(const char *kind)
> */
> int __rtnl_link_register(struct rtnl_link_ops *ops)
> {
> - if (rtnl_link_ops_get(ops->kind))
> - return -EEXIST;
> + struct rtnl_link_ops *tmp;
> +
> + /* When RTNL is removed, add lock for link_ops. */
> + ASSERT_RTNL();
> +
> + list_for_each_entry(tmp, &link_ops, list) {
> + if (!strcmp(ops->kind, tmp->kind))
> + return -EEXIST;
> + }
>
> /* The check for alloc/setup is here because if ops
> * does not have that filled up, it is not possible
> @@ -490,7 +511,9 @@ int __rtnl_link_register(struct rtnl_link_ops *ops)
> if ((ops->alloc || ops->setup) && !ops->dellink)
> ops->dellink = unregister_netdevice_queue;
>
> - list_add_tail(&ops->list, &link_ops);
> + init_srcu_struct(&ops->srcu);
init_srcu_struct() could fail.
> + list_add_tail_rcu(&ops->list, &link_ops);
> +
> return 0;
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 08/13] rtnetlink: Call rtnl_link_get_net_capable() in rtnl_newlink().
2024-10-09 23:16 ` [PATCH v1 net-next 08/13] rtnetlink: Call rtnl_link_get_net_capable() in rtnl_newlink() Kuniyuki Iwashima
@ 2024-10-10 13:08 ` Eric Dumazet
0 siblings, 0 replies; 30+ messages in thread
From: Eric Dumazet @ 2024-10-10 13:08 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima,
netdev
On Thu, Oct 10, 2024 at 1:19 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> As a prerequisite of per-netns RTNL, we must fetch netns before
> looking up dev or moving it to another netns.
>
> rtnl_link_get_net_capable() is called in rtnl_newlink_create() and
> do_setlink(), but both of them need to be moved to the RTNL-independent
> region, which will be rtnl_newlink().
>
> Let's call rtnl_link_get_net_capable() in rtnl_newlink() and pass the
> netns down to where needed.
>
> Note that the latter two have not passed the nets to do_setlink() yet
> but will do so after the remaining rtnl_link_get_net_capable() is moved
> to rtnl_setlink() later.
>
> While at it, dest_net is renamed to tgt_net in rtnl_newlink_create() to
> align with rtnl_{del,set}link().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 09/13] rtnetlink: Fetch IFLA_LINK_NETNSID in rtnl_newlink().
2024-10-09 23:16 ` [PATCH v1 net-next 09/13] rtnetlink: Fetch IFLA_LINK_NETNSID " Kuniyuki Iwashima
@ 2024-10-10 13:09 ` Eric Dumazet
0 siblings, 0 replies; 30+ messages in thread
From: Eric Dumazet @ 2024-10-10 13:09 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima,
netdev
On Thu, Oct 10, 2024 at 1:20 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> Another netns option for RTM_NEWLINK is IFLA_LINK_NETNSID and
> is fetched in rtnl_newlink_create().
>
> This must be done before holding rtnl_net_lock().
>
> Let's move IFLA_LINK_NETNSID processing to rtnl_newlink().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 10/13] rtnetlink: Clean up rtnl_dellink().
2024-10-09 23:16 ` [PATCH v1 net-next 10/13] rtnetlink: Clean up rtnl_dellink() Kuniyuki Iwashima
@ 2024-10-10 13:10 ` Eric Dumazet
0 siblings, 0 replies; 30+ messages in thread
From: Eric Dumazet @ 2024-10-10 13:10 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima,
netdev
On Thu, Oct 10, 2024 at 1:20 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> We will push RTNL down to rtnl_delink().
>
> Let's unify the error path to make it easy to place rtnl_net_lock().
>
> While at it, keep the variables in reverse xmas order.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 11/13] rtnetlink: Clean up rtnl_setlink().
2024-10-09 23:16 ` [PATCH v1 net-next 11/13] rtnetlink: Clean up rtnl_setlink() Kuniyuki Iwashima
@ 2024-10-10 13:11 ` Eric Dumazet
0 siblings, 0 replies; 30+ messages in thread
From: Eric Dumazet @ 2024-10-10 13:11 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima,
netdev
On Thu, Oct 10, 2024 at 1:20 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> We will push RTNL down to rtnl_setlink().
>
> Let's unify the error path to make it easy to place rtnl_net_lock().
>
> While at it, keep the variables in reverse xmas order.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 12/13] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink().
2024-10-09 23:16 ` [PATCH v1 net-next 12/13] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink() Kuniyuki Iwashima
@ 2024-10-10 13:15 ` Eric Dumazet
2024-10-11 7:36 ` kernel test robot
1 sibling, 0 replies; 30+ messages in thread
From: Eric Dumazet @ 2024-10-10 13:15 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima,
netdev
On Thu, Oct 10, 2024 at 1:21 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> We will push RTNL down to rtnl_setlink().
>
> RTM_SETLINK could call rtnl_link_get_net_capable() in do_setlink()
> to move a dev to a new netns, but the netns needs to be fetched before
> holding rtnl_net_lock().
>
> Let's move it to rtnl_setlink() and pass the netns to do_setlink().
>
> Now, RTM_NEWLINK paths (rtnl_changelink() and rtnl_group_changelink())
> can pass the prefetched netns to do_setlink().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 13/13] rtnetlink: Protect struct rtnl_af_ops with SRCU.
2024-10-09 23:16 ` [PATCH v1 net-next 13/13] rtnetlink: Protect struct rtnl_af_ops with SRCU Kuniyuki Iwashima
@ 2024-10-10 13:16 ` Eric Dumazet
2024-10-10 16:27 ` Kuniyuki Iwashima
0 siblings, 1 reply; 30+ messages in thread
From: Eric Dumazet @ 2024-10-10 13:16 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima,
netdev
On Thu, Oct 10, 2024 at 1:21 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> Once RTNL is replaced with rtnl_net_lock(), we need a mechanism to
> guarantee that rtnl_af_ops is alive during inflight RTM_SETLINK
> even when its module is being unloaded.
>
> Let's use SRCU to protect rtnl_af_ops.
>
> rtnl_af_lookup() now iterates rtnl_af_ops under RCU and returns
> SRCU-protected ops pointer. The caller must call rtnl_af_put()
> to release the pointer after the use.
>
> Also, rtnl_af_unregister() unlinks the ops first and calls
> synchronize_srcu() to wait for inflight RTM_SETLINK requests to
> complete.
>
> Note that rtnl_af_ops needs to be protected by its dedicated lock
> when RTNL is removed.
>
> Note also that BUG_ON() in do_setlink() is changed to the normal
> error handling as a different af_ops might be found after
> validate_linkmsg().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
> include/net/rtnetlink.h | 5 +++-
> net/core/rtnetlink.c | 58 +++++++++++++++++++++++++++++------------
> 2 files changed, 46 insertions(+), 17 deletions(-)
>
> diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
> index c873fd6193ed..407a2f56f00a 100644
> --- a/include/net/rtnetlink.h
> +++ b/include/net/rtnetlink.h
> @@ -150,7 +150,8 @@ void rtnl_link_unregister(struct rtnl_link_ops *ops);
> /**
> * struct rtnl_af_ops - rtnetlink address family operations
> *
> - * @list: Used internally
> + * @list: Used internally, protected by RTNL and SRCU
> + * @srcu: Used internally
> * @family: Address family
> * @fill_link_af: Function to fill IFLA_AF_SPEC with address family
> * specific netlink attributes.
> @@ -163,6 +164,8 @@ void rtnl_link_unregister(struct rtnl_link_ops *ops);
> */
> struct rtnl_af_ops {
> struct list_head list;
> + struct srcu_struct srcu;
> +
> int family;
>
> int (*fill_link_af)(struct sk_buff *skb,
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index a0702e531331..817165f6d5ef 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -660,18 +660,31 @@ static size_t rtnl_link_get_size(const struct net_device *dev)
>
> static LIST_HEAD(rtnl_af_ops);
>
> -static const struct rtnl_af_ops *rtnl_af_lookup(const int family)
> +static struct rtnl_af_ops *rtnl_af_lookup(const int family, int *srcu_index)
> {
> - const struct rtnl_af_ops *ops;
> + struct rtnl_af_ops *ops;
>
> ASSERT_RTNL();
>
> - list_for_each_entry(ops, &rtnl_af_ops, list) {
> - if (ops->family == family)
> - return ops;
> + rcu_read_lock();
> +
> + list_for_each_entry_rcu(ops, &rtnl_af_ops, list) {
> + if (ops->family == family) {
> + *srcu_index = srcu_read_lock(&ops->srcu);
> + goto unlock;
> + }
> }
>
> - return NULL;
> + ops = NULL;
> +unlock:
> + rcu_read_unlock();
> +
> + return ops;
> +}
> +
> +static void rtnl_af_put(struct rtnl_af_ops *ops, int srcu_index)
> +{
> + srcu_read_unlock(&ops->srcu, srcu_index);
> }
>
> /**
> @@ -683,6 +696,7 @@ static const struct rtnl_af_ops *rtnl_af_lookup(const int family)
> void rtnl_af_register(struct rtnl_af_ops *ops)
> {
> rtnl_lock();
> + init_srcu_struct(&ops->srcu);
Same remark, this could fail.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 07/13] rtnetlink: Protect struct rtnl_link_ops with SRCU.
2024-10-10 13:02 ` Eric Dumazet
@ 2024-10-10 16:20 ` Kuniyuki Iwashima
0 siblings, 0 replies; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-10 16:20 UTC (permalink / raw)
To: edumazet; +Cc: davem, kuba, kuni1840, kuniyu, netdev, pabeni
From: Eric Dumazet <edumazet@google.com>
Date: Thu, 10 Oct 2024 15:02:39 +0200
> > @@ -490,7 +511,9 @@ int __rtnl_link_register(struct rtnl_link_ops *ops)
> > if ((ops->alloc || ops->setup) && !ops->dellink)
> > ops->dellink = unregister_netdevice_queue;
> >
> > - list_add_tail(&ops->list, &link_ops);
> > + init_srcu_struct(&ops->srcu);
>
> init_srcu_struct() could fail.
Oh, I somehow assumed init wouldn't fail.
Will fix in v2.
Thanks!
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 13/13] rtnetlink: Protect struct rtnl_af_ops with SRCU.
2024-10-10 13:16 ` Eric Dumazet
@ 2024-10-10 16:27 ` Kuniyuki Iwashima
0 siblings, 0 replies; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-10 16:27 UTC (permalink / raw)
To: edumazet; +Cc: davem, kuba, kuni1840, kuniyu, netdev, pabeni
From: Eric Dumazet <edumazet@google.com>
Date: Thu, 10 Oct 2024 15:16:20 +0200
> > @@ -683,6 +696,7 @@ static const struct rtnl_af_ops *rtnl_af_lookup(const int family)
> > void rtnl_af_register(struct rtnl_af_ops *ops)
> > {
> > rtnl_lock();
> > + init_srcu_struct(&ops->srcu);
>
> Same remark, this could fail.
This requries rtnl_af_register() to return int, and that needs
few more changes conflicting with rtnl_register_many() changes.
I'll see if it'd be a bit noisy, and if so, I'll split this patch
to another series.
Thanks!
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 12/13] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink().
2024-10-09 23:16 ` [PATCH v1 net-next 12/13] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink() Kuniyuki Iwashima
2024-10-10 13:15 ` Eric Dumazet
@ 2024-10-11 7:36 ` kernel test robot
2024-10-11 16:17 ` Kuniyuki Iwashima
1 sibling, 1 reply; 30+ messages in thread
From: kernel test robot @ 2024-10-11 7:36 UTC (permalink / raw)
To: Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: llvm, oe-kbuild-all, netdev, Kuniyuki Iwashima
Hi Kuniyuki,
kernel test robot noticed the following build warnings:
[auto build test WARNING on net-next/main]
url: https://github.com/intel-lab-lkp/linux/commits/Kuniyuki-Iwashima/rtnetlink-Allocate-linkinfo-as-struct-rtnl_newlink_tbs/20241010-072158
base: net-next/main
patch link: https://lore.kernel.org/r/20241009231656.57830-13-kuniyu%40amazon.com
patch subject: [PATCH v1 net-next 12/13] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink().
config: x86_64-buildonly-randconfig-003-20241011 (https://download.01.org/0day-ci/archive/20241011/202410111515.TbOH4hSS-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241011/202410111515.TbOH4hSS-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410111515.TbOH4hSS-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> net/core/rtnetlink.c:3281:6: warning: variable 'tgt_net' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
3281 | if (err < 0)
| ^~~~~~~
net/core/rtnetlink.c:3301:10: note: uninitialized use occurs here
3301 | put_net(tgt_net);
| ^~~~~~~
net/core/rtnetlink.c:3281:2: note: remove the 'if' if its condition is always false
3281 | if (err < 0)
| ^~~~~~~~~~~~
3282 | goto errout;
| ~~~~~~~~~~~
net/core/rtnetlink.c:3277:6: warning: variable 'tgt_net' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
3277 | if (err < 0)
| ^~~~~~~
net/core/rtnetlink.c:3301:10: note: uninitialized use occurs here
3301 | put_net(tgt_net);
| ^~~~~~~
net/core/rtnetlink.c:3277:2: note: remove the 'if' if its condition is always false
3277 | if (err < 0)
| ^~~~~~~~~~~~
3278 | goto errout;
| ~~~~~~~~~~~
net/core/rtnetlink.c:3272:21: note: initialize the variable 'tgt_net' to silence this warning
3272 | struct net *tgt_net;
| ^
| = NULL
2 warnings generated.
vim +3281 net/core/rtnetlink.c
cc6090e985d7d6 Jiri Pirko 2019-09-30 3264
c21ef3e343ae91 David Ahern 2017-04-16 3265 static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
c21ef3e343ae91 David Ahern 2017-04-16 3266 struct netlink_ext_ack *extack)
0157f60c0caea2 Patrick McHardy 2007-06-13 3267 {
3a6cb17da69fbf Kuniyuki Iwashima 2024-10-09 3268 struct ifinfomsg *ifm = nlmsg_data(nlh);
3b1e0a655f8eba YOSHIFUJI Hideaki 2008-03-26 3269 struct net *net = sock_net(skb->sk);
0157f60c0caea2 Patrick McHardy 2007-06-13 3270 struct nlattr *tb[IFLA_MAX+1];
3a6cb17da69fbf Kuniyuki Iwashima 2024-10-09 3271 struct net_device *dev = NULL;
b27f78e2575aa2 Kuniyuki Iwashima 2024-10-09 3272 struct net *tgt_net;
3a6cb17da69fbf Kuniyuki Iwashima 2024-10-09 3273 int err;
0157f60c0caea2 Patrick McHardy 2007-06-13 3274
8cb081746c031f Johannes Berg 2019-04-26 3275 err = nlmsg_parse_deprecated(nlh, sizeof(*ifm), tb, IFLA_MAX,
8cb081746c031f Johannes Berg 2019-04-26 3276 ifla_policy, extack);
0157f60c0caea2 Patrick McHardy 2007-06-13 3277 if (err < 0)
0157f60c0caea2 Patrick McHardy 2007-06-13 3278 goto errout;
0157f60c0caea2 Patrick McHardy 2007-06-13 3279
4ff66cae7f10b6 Christian Brauner 2018-02-07 3280 err = rtnl_ensure_unique_netns(tb, extack, false);
4ff66cae7f10b6 Christian Brauner 2018-02-07 @3281 if (err < 0)
4ff66cae7f10b6 Christian Brauner 2018-02-07 3282 goto errout;
4ff66cae7f10b6 Christian Brauner 2018-02-07 3283
b27f78e2575aa2 Kuniyuki Iwashima 2024-10-09 3284 tgt_net = rtnl_link_get_net_capable(skb, net, tb, CAP_NET_ADMIN);
b27f78e2575aa2 Kuniyuki Iwashima 2024-10-09 3285 if (IS_ERR(tgt_net))
b27f78e2575aa2 Kuniyuki Iwashima 2024-10-09 3286 return PTR_ERR(tgt_net);
b27f78e2575aa2 Kuniyuki Iwashima 2024-10-09 3287
0157f60c0caea2 Patrick McHardy 2007-06-13 3288 if (ifm->ifi_index > 0)
a3d1289126e7b1 Eric Dumazet 2009-10-21 3289 dev = __dev_get_by_index(net, ifm->ifi_index);
76c9ac0ee878f6 Jiri Pirko 2019-09-30 3290 else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME])
5ea08b5286f66e Florent Fourcot 2022-04-15 3291 dev = rtnl_dev_get(net, tb);
0157f60c0caea2 Patrick McHardy 2007-06-13 3292 else
3a6cb17da69fbf Kuniyuki Iwashima 2024-10-09 3293 err = -EINVAL;
0157f60c0caea2 Patrick McHardy 2007-06-13 3294
3a6cb17da69fbf Kuniyuki Iwashima 2024-10-09 3295 if (dev)
b27f78e2575aa2 Kuniyuki Iwashima 2024-10-09 3296 err = do_setlink(skb, dev, tgt_net, ifm, extack, tb, 0);
3a6cb17da69fbf Kuniyuki Iwashima 2024-10-09 3297 else if (!err)
0157f60c0caea2 Patrick McHardy 2007-06-13 3298 err = -ENODEV;
0157f60c0caea2 Patrick McHardy 2007-06-13 3299
da5e0494c542dd Thomas Graf 2006-08-10 3300 errout:
b27f78e2575aa2 Kuniyuki Iwashima 2024-10-09 3301 put_net(tgt_net);
b27f78e2575aa2 Kuniyuki Iwashima 2024-10-09 3302
^1da177e4c3f41 Linus Torvalds 2005-04-16 3303 return err;
^1da177e4c3f41 Linus Torvalds 2005-04-16 3304 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 3305
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 net-next 12/13] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink().
2024-10-11 7:36 ` kernel test robot
@ 2024-10-11 16:17 ` Kuniyuki Iwashima
0 siblings, 0 replies; 30+ messages in thread
From: Kuniyuki Iwashima @ 2024-10-11 16:17 UTC (permalink / raw)
To: lkp; +Cc: davem, edumazet, kuba, kuniyu, llvm, netdev, oe-kbuild-all,
pabeni
From: kernel test robot <lkp@intel.com>
Date: Fri, 11 Oct 2024 15:36:55 +0800
> cc6090e985d7d6 Jiri Pirko 2019-09-30 3264
> c21ef3e343ae91 David Ahern 2017-04-16 3265 static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
> c21ef3e343ae91 David Ahern 2017-04-16 3266 struct netlink_ext_ack *extack)
> 0157f60c0caea2 Patrick McHardy 2007-06-13 3267 {
> 3a6cb17da69fbf Kuniyuki Iwashima 2024-10-09 3268 struct ifinfomsg *ifm = nlmsg_data(nlh);
> 3b1e0a655f8eba YOSHIFUJI Hideaki 2008-03-26 3269 struct net *net = sock_net(skb->sk);
> 0157f60c0caea2 Patrick McHardy 2007-06-13 3270 struct nlattr *tb[IFLA_MAX+1];
> 3a6cb17da69fbf Kuniyuki Iwashima 2024-10-09 3271 struct net_device *dev = NULL;
> b27f78e2575aa2 Kuniyuki Iwashima 2024-10-09 3272 struct net *tgt_net;
> 3a6cb17da69fbf Kuniyuki Iwashima 2024-10-09 3273 int err;
> 0157f60c0caea2 Patrick McHardy 2007-06-13 3274
> 8cb081746c031f Johannes Berg 2019-04-26 3275 err = nlmsg_parse_deprecated(nlh, sizeof(*ifm), tb, IFLA_MAX,
> 8cb081746c031f Johannes Berg 2019-04-26 3276 ifla_policy, extack);
> 0157f60c0caea2 Patrick McHardy 2007-06-13 3277 if (err < 0)
> 0157f60c0caea2 Patrick McHardy 2007-06-13 3278 goto errout;
> 0157f60c0caea2 Patrick McHardy 2007-06-13 3279
> 4ff66cae7f10b6 Christian Brauner 2018-02-07 3280 err = rtnl_ensure_unique_netns(tb, extack, false);
> 4ff66cae7f10b6 Christian Brauner 2018-02-07 @3281 if (err < 0)
> 4ff66cae7f10b6 Christian Brauner 2018-02-07 3282 goto errout;
Oops, I'll simply remove the errout label in v2.
^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2024-10-11 16:18 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-09 23:16 [PATCH v1 net-next 00/13] rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL Kuniyuki Iwashima
2024-10-09 23:16 ` [PATCH v1 net-next 01/13] rtnetlink: Allocate linkinfo[] as struct rtnl_newlink_tbs Kuniyuki Iwashima
2024-10-10 12:36 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 02/13] rtnetlink: Call validate_linkmsg() in do_setlink() Kuniyuki Iwashima
2024-10-10 12:44 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 03/13] rtnetlink: Factorise do_setlink() path from __rtnl_newlink() Kuniyuki Iwashima
2024-10-10 12:39 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 04/13] rtnetlink: Move simple validation from __rtnl_newlink() to rtnl_newlink() Kuniyuki Iwashima
2024-10-10 12:40 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 05/13] rtnetlink: Move rtnl_link_ops_get() and retry " Kuniyuki Iwashima
2024-10-09 23:16 ` [PATCH v1 net-next 06/13] rtnetlink: Move ops->validate " Kuniyuki Iwashima
2024-10-10 12:48 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 07/13] rtnetlink: Protect struct rtnl_link_ops with SRCU Kuniyuki Iwashima
2024-10-10 13:02 ` Eric Dumazet
2024-10-10 16:20 ` Kuniyuki Iwashima
2024-10-09 23:16 ` [PATCH v1 net-next 08/13] rtnetlink: Call rtnl_link_get_net_capable() in rtnl_newlink() Kuniyuki Iwashima
2024-10-10 13:08 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 09/13] rtnetlink: Fetch IFLA_LINK_NETNSID " Kuniyuki Iwashima
2024-10-10 13:09 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 10/13] rtnetlink: Clean up rtnl_dellink() Kuniyuki Iwashima
2024-10-10 13:10 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 11/13] rtnetlink: Clean up rtnl_setlink() Kuniyuki Iwashima
2024-10-10 13:11 ` Eric Dumazet
2024-10-09 23:16 ` [PATCH v1 net-next 12/13] rtnetlink: Call rtnl_link_get_net_capable() in do_setlink() Kuniyuki Iwashima
2024-10-10 13:15 ` Eric Dumazet
2024-10-11 7:36 ` kernel test robot
2024-10-11 16:17 ` Kuniyuki Iwashima
2024-10-09 23:16 ` [PATCH v1 net-next 13/13] rtnetlink: Protect struct rtnl_af_ops with SRCU Kuniyuki Iwashima
2024-10-10 13:16 ` Eric Dumazet
2024-10-10 16:27 ` Kuniyuki Iwashima
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).