* Re: [PATCH net] tcp: ulp: fix possible crash in tcp_diag_get_aux_size()
From: David Miller @ 2019-09-07 15:34 UTC (permalink / raw)
To: edumazet; +Cc: netdev, eric.dumazet, lukehsiao, ncardwell, dcaratti
In-Reply-To: <20190905202041.138085-1-edumazet@google.com>
From: Eric Dumazet <edumazet@google.com>
Date: Thu, 5 Sep 2019 13:20:41 -0700
> tcp_diag_get_aux_size() can be called with sockets in any state.
>
> icsk_ulp_ops is only present for full sockets.
>
> For SYN_RECV or TIME_WAIT ones we would access garbage.
>
> Fixes: 61723b393292 ("tcp: ulp: add functions to dump ulp-specific information")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Luke Hsiao <lukehsiao@google.com>
> Reported-by: Neal Cardwell <ncardwell@google.com>
Applied to net-next, thanks Eric.
^ permalink raw reply
* Re: [patch net-next] net: fib_notifier: move fib_notifier_ops from struct net into per-net struct
From: David Miller @ 2019-09-07 15:28 UTC (permalink / raw)
To: jiri; +Cc: netdev, idosch, dsahern, mlxsw
In-Reply-To: <20190905180656.4756-1-jiri@resnulli.us>
From: Jiri Pirko <jiri@resnulli.us>
Date: Thu, 5 Sep 2019 20:06:56 +0200
> From: Jiri Pirko <jiri@mellanox.com>
>
> No need for fib_notifier_ops to be in struct net. It is used only by
> fib_notifier as private data. Use net_generic to introduce a per-net
> fib_notifier struct and move fib_notifier_ops there.
>
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Applied.
^ permalink raw reply
* Re: [PATCH] net: phylink: Fix flow control resolution
From: David Miller @ 2019-09-07 15:27 UTC (permalink / raw)
To: stefanc
Cc: linux, andrew, netdev, linux-kernel, shaulb, nadavh, ymarkman,
marcin
In-Reply-To: <1567701978-16056-1-git-send-email-stefanc@marvell.com>
From: <stefanc@marvell.com>
Date: Thu, 5 Sep 2019 19:46:18 +0300
> From: Stefan Chulski <stefanc@marvell.com>
>
> According to IEEE 802.3-2015 standard section 2
> 28B.3 Priority resolution - Table 28-3 - Pause resolution
>
> In case of Local device Pause=1 AsymDir=0, Link partner
> Pause=1 AsymDir=1, Local device resolution should be enable PAUSE
> transmit, disable PAUSE receive.
> And in case of Local device Pause=1 AsymDir=1, Link partner
> Pause=1 AsymDir=0, Local device resolution should be enable PAUSE
> receive, disable PAUSE transmit.
>
> Signed-off-by: Stefan Chulski <stefanc@marvell.com>
> Reported-by: Shaul Ben-Mayor <shaulb@marvell.com>
Applied and queued up for -stable, thanks.
^ permalink raw reply
* Re: [PATCH 1/2] net: phy: dp83867: Add documentation for SGMII mode type
From: David Miller @ 2019-09-07 15:24 UTC (permalink / raw)
To: vitaly.gaiduk
Cc: robh+dt, f.fainelli, mark.rutland, tpiepho, andrew, netdev,
devicetree, linux-kernel
In-Reply-To: <1567700761-14195-2-git-send-email-vitaly.gaiduk@cloudbear.ru>
From: Vitaly Gaiduk <vitaly.gaiduk@cloudbear.ru>
Date: Thu, 5 Sep 2019 19:26:00 +0300
> + - ti,sgmii-type - This denotes the fact which SGMII mode is used (4 or 6-wire).
You need to document this more sufficiently as per Andrew's feedback.
^ permalink raw reply
* [PATCH] libertas: use mesh_wdev->ssid instead of priv->mesh_ssid
From: Lubomir Rintel @ 2019-09-07 15:18 UTC (permalink / raw)
To: Kalle Valo
Cc: libertas-dev, linux-wireless, netdev, linux-kernel,
Lubomir Rintel
With the commit e86dc1ca4676 ("Libertas: cfg80211 support") we've lost
the ability to actually set the Mesh SSID from userspace.
NL80211_CMD_SET_INTERFACE with NL80211_ATTR_MESH_ID sets the mesh point
interface's ssid field. Let's use that one for the Libertas Mesh
operation.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
---
drivers/net/wireless/marvell/libertas/dev.h | 2 --
drivers/net/wireless/marvell/libertas/mesh.c | 31 +++++++++++++-------
drivers/net/wireless/marvell/libertas/mesh.h | 3 +-
3 files changed, 21 insertions(+), 15 deletions(-)
diff --git a/drivers/net/wireless/marvell/libertas/dev.h b/drivers/net/wireless/marvell/libertas/dev.h
index 4691349300265..4b6e05a8e5d54 100644
--- a/drivers/net/wireless/marvell/libertas/dev.h
+++ b/drivers/net/wireless/marvell/libertas/dev.h
@@ -58,8 +58,6 @@ struct lbs_private {
#ifdef CONFIG_LIBERTAS_MESH
struct lbs_mesh_stats mstats;
uint16_t mesh_tlv;
- u8 mesh_ssid[IEEE80211_MAX_SSID_LEN + 1];
- u8 mesh_ssid_len;
u8 mesh_channel;
#endif
diff --git a/drivers/net/wireless/marvell/libertas/mesh.c b/drivers/net/wireless/marvell/libertas/mesh.c
index 2315fdff56c2f..2747c957d18c9 100644
--- a/drivers/net/wireless/marvell/libertas/mesh.c
+++ b/drivers/net/wireless/marvell/libertas/mesh.c
@@ -86,6 +86,7 @@ static int lbs_mesh_config_send(struct lbs_private *priv,
static int lbs_mesh_config(struct lbs_private *priv, uint16_t action,
uint16_t chan)
{
+ struct wireless_dev *mesh_wdev;
struct cmd_ds_mesh_config cmd;
struct mrvl_meshie *ie;
@@ -105,10 +106,17 @@ static int lbs_mesh_config(struct lbs_private *priv, uint16_t action,
ie->val.active_protocol_id = MARVELL_MESH_PROTO_ID_HWMP;
ie->val.active_metric_id = MARVELL_MESH_METRIC_ID;
ie->val.mesh_capability = MARVELL_MESH_CAPABILITY;
- ie->val.mesh_id_len = priv->mesh_ssid_len;
- memcpy(ie->val.mesh_id, priv->mesh_ssid, priv->mesh_ssid_len);
+
+ if (priv->mesh_dev) {
+ mesh_wdev = priv->mesh_dev->ieee80211_ptr;
+ ie->val.mesh_id_len = mesh_wdev->mesh_id_up_len;
+ memcpy(ie->val.mesh_id, mesh_wdev->ssid,
+ mesh_wdev->mesh_id_up_len);
+ }
+
ie->len = sizeof(struct mrvl_meshie_val) -
- IEEE80211_MAX_SSID_LEN + priv->mesh_ssid_len;
+ IEEE80211_MAX_SSID_LEN + ie->val.mesh_id_len;
+
cmd.length = cpu_to_le16(sizeof(struct mrvl_meshie_val));
break;
case CMD_ACT_MESH_CONFIG_STOP:
@@ -117,8 +125,8 @@ static int lbs_mesh_config(struct lbs_private *priv, uint16_t action,
return -1;
}
lbs_deb_cmd("mesh config action %d type %x channel %d SSID %*pE\n",
- action, priv->mesh_tlv, chan, priv->mesh_ssid_len,
- priv->mesh_ssid);
+ action, priv->mesh_tlv, chan, ie->val.mesh_id_len,
+ ie->val.mesh_id);
return __lbs_mesh_config_send(priv, &cmd, action, priv->mesh_tlv);
}
@@ -863,12 +871,6 @@ int lbs_init_mesh(struct lbs_private *priv)
/* Stop meshing until interface is brought up */
lbs_mesh_config(priv, CMD_ACT_MESH_CONFIG_STOP, 1);
- if (priv->mesh_tlv) {
- sprintf(priv->mesh_ssid, "mesh");
- priv->mesh_ssid_len = 4;
- ret = 1;
- }
-
return ret;
}
@@ -997,6 +999,13 @@ static int lbs_add_mesh(struct lbs_private *priv)
mesh_wdev->iftype = NL80211_IFTYPE_MESH_POINT;
mesh_wdev->wiphy = priv->wdev->wiphy;
+
+ if (priv->mesh_tlv) {
+ sprintf(mesh_wdev->ssid, "mesh");
+ mesh_wdev->mesh_id_up_len = 4;
+ ret = 1;
+ }
+
mesh_wdev->netdev = mesh_dev;
mesh_dev->ml_priv = priv;
diff --git a/drivers/net/wireless/marvell/libertas/mesh.h b/drivers/net/wireless/marvell/libertas/mesh.h
index dfe22c91aade0..1561018f226fd 100644
--- a/drivers/net/wireless/marvell/libertas/mesh.h
+++ b/drivers/net/wireless/marvell/libertas/mesh.h
@@ -24,8 +24,7 @@ void lbs_remove_mesh(struct lbs_private *priv);
static inline bool lbs_mesh_activated(struct lbs_private *priv)
{
- /* Mesh SSID is only programmed after successful init */
- return priv->mesh_ssid_len != 0;
+ return !!priv->mesh_tlv;
}
int lbs_mesh_set_channel(struct lbs_private *priv, u8 channel);
--
2.21.0
^ permalink raw reply related
* Re: [PATCH] ethernet: micrel: Use DIV_ROUND_CLOSEST directly to make it readable
From: David Miller @ 2019-09-07 15:17 UTC (permalink / raw)
To: zhongjiang; +Cc: kstewart, gregkh, netdev, linux-kernel
In-Reply-To: <1567698828-26825-1-git-send-email-zhongjiang@huawei.com>
From: zhong jiang <zhongjiang@huawei.com>
Date: Thu, 5 Sep 2019 23:53:48 +0800
> The kernel.h macro DIV_ROUND_CLOSEST performs the computation (x + d/2)/d
> but is perhaps more readable.
>
> Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Applied to net-next.
^ permalink raw reply
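A minimal sketch of the rounding identity quoted in this thread, (x + d/2)/d, in plain C. The helper name div_round_closest_u is hypothetical and covers unsigned operands only; it is not the kernel macro, which also deals with signed types.

```c
#include <assert.h>

/*
 * Hypothetical helper (not the kernel macro): divide x by d and round
 * to the nearest integer, ties rounding up, using (x + d/2) / d.
 * Unsigned operands only.
 */
static unsigned int div_round_closest_u(unsigned int x, unsigned int d)
{
	assert(d != 0);          /* division by zero is undefined */
	return (x + d / 2) / d;  /* x + d/2 can wrap if x is near UINT_MAX */
}
```

For example, 10/4 is 2.5, so the helper yields 3, while 9/4 is 2.25 and yields 2.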
* Re: [patch net-next 3/3] net: devlink: move reload fail indication to devlink core and expose to user
From: David Ahern @ 2019-09-07 15:08 UTC (permalink / raw)
To: Jiri Pirko, netdev; +Cc: davem, idosch, jakub.kicinski, tariqt, mlxsw
In-Reply-To: <20190906184419.5101-4-jiri@resnulli.us>
On 9/6/19 7:44 PM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
>
> Currently the fact that devlink failed is stored in drivers. Move this
> flag into devlink core. Also, expose it to the user.
you mean 'reload failed', not 'devlink failed'?
^ permalink raw reply
* Re: [PATCH 0/2] Revert and rework on the metadata accelreation
From: Jason Gunthorpe @ 2019-09-07 15:03 UTC (permalink / raw)
To: Jason Wang
Cc: mst@redhat.com, kvm@vger.kernel.org,
virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, aarcange@redhat.com,
jglisse@redhat.com, linux-mm@kvack.org
In-Reply-To: <7785d39b-b4e7-8165-516c-ee6a08ac9c4e@redhat.com>
On Fri, Sep 06, 2019 at 06:02:35PM +0800, Jason Wang wrote:
>
> On 2019/9/5 9:59 PM, Jason Gunthorpe wrote:
> > On Thu, Sep 05, 2019 at 08:27:34PM +0800, Jason Wang wrote:
> > > Hi:
> > >
> > > Per request from Michael and Jason, the metadata acceleration is
> > > reverted in this version and will be reworked in the next version.
> > >
> > > Please review.
> > >
> > > Thanks
> > >
> > > Jason Wang (2):
> > > Revert "vhost: access vq metadata through kernel virtual address"
> > > vhost: re-introducing metadata acceleration through kernel virtual
> > > address
> > There are a bunch of patches in the queue already that will help
> > vhost, and I am working on one for next cycle that will help a lot more
> > too.
>
>
> I will check those patches, but if you can give me some pointers or keywords
> it would be much appreciated.
You can look here:
https://github.com/jgunthorpe/linux/commits/mmu_notifier
The first parts, the get/put, are in the hmm tree, and the last part,
the interval tree in the last commit, is still a WIP, but it would
remove a lot of that code from vhost as well.
Jason
^ permalink raw reply
* Re: [PATCH v1 net-next 00/15] tc-taprio offload for SJA1105 DSA
From: Andrew Lunn @ 2019-09-07 14:45 UTC (permalink / raw)
To: David Miller, olteanv
Cc: olteanv, f.fainelli, vivien.didelot, vinicius.gomes, vedang.patel,
richardcochran, weifeng.voon, jiri, m-karicheri2, Jose.Abreu,
ilias.apalodimas, jhs, xiyou.wangcong, kurt.kanzenbach, netdev
In-Reply-To: <20190906.145403.657322945046640538.davem@davemloft.net>
On Fri, Sep 06, 2019 at 02:54:03PM +0200, David Miller wrote:
> From: Vladimir Oltean <olteanv@gmail.com>
> Date: Mon, 2 Sep 2019 19:25:29 +0300
>
> > This is the first attempt to submit the tc-taprio offload model for
> > inclusion in the net tree.
>
> Someone really needs to review this.
Hi Vladimir
You might have more chance getting this reviewed if you split it up
into a number of smaller series. Richard could probably review the
plain PTP changes. Who else has worked on tc-taprio recently? A series
purely about tc-taprio might be more likely reviewed by a tc-taprio
person, if it does not contain PTP changes.
Andrew
^ permalink raw reply
* Re: [PATCH 0/8] Netfilter updates for net-next
From: David Miller @ 2019-09-07 14:34 UTC (permalink / raw)
To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <20190905160400.25399-1-pablo@netfilter.org>
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Thu, 5 Sep 2019 18:03:52 +0200
> The following patchset contains Netfilter updates for net-next:
>
> 1) Add nft_reg_store64() and nft_reg_load64() helpers, from Ander Juaristi.
>
> 2) Time matching support, also from Ander Juaristi.
>
> 3) VLAN support for nfnetlink_log, from Michael Braun.
>
> 4) Support for set element deletions from the packet path, also from Ander.
>
> 5) Remove __read_mostly from conntrack spinlock, from Li RongQing.
>
> 6) Support for updating stateful objects, this also includes the initial
> client for this infrastructure: the quota extension. A follow up fix
> for the control plane also comes in this batch. Patches from
> Fernando Fernandez Mancera.
>
> You can pull these changes from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git
Pulled, thanks.
^ permalink raw reply
* Re: iproute2: tc: potential buffer overflow
From: tomaspaukrt @ 2019-09-07 13:43 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20190831083751.3814ee37@hermes.lan>
[-- Attachment #1: Type: text/plain, Size: 619 bytes --]
The updated patch is in the attachment.
---------- Original e-mail ----------
From: Stephen Hemminger <stephen@networkplumber.org>
To: tomaspaukrt@email.cz
Date: 31. 8. 2019 17:38:01
Subject: Re: iproute2: tc: potential buffer overflow
On Sat, 31 Aug 2019 15:13:27 +0200 (CEST)
<tomaspaukrt@email.cz> wrote:
> Hi,
>
> there are two potentially dangerous calls of strcpy function in the program "tc". In the attachment is a patch that fixes this issue.
>
> Tomas
This looks correct.
Please fix with strlcpy() instead; that is clearer.
Plus you can use XT_EXTENSION_MAX_NAMELEN here (optional).
[-- Attachment #2: iproute2-overflow-fix.patch --]
[-- Type: text/x-diff, Size: 1109 bytes --]
commit 46be35fbded24c75786ce178c516d7fba991a90a
Author: Tomas Paukrt <tomaspaukrt@email.cz>
Date: Sat Sep 7 15:34:30 2019 +0200
tc: fix potential buffer overflow
diff --git a/tc/m_ipt.c b/tc/m_ipt.c
index cc95eab..e47ae6b 100644
--- a/tc/m_ipt.c
+++ b/tc/m_ipt.c
@@ -269,7 +269,8 @@ static int build_st(struct xtables_target *target, struct ipt_entry_target *t)
} else {
target->t = t;
}
- strcpy(target->t->u.user.name, target->name);
+ strlcpy(target->t->u.user.name, target->name,
+ sizeof(target->t->u.user.name));
return 0;
}
diff --git a/tc/m_xt_old.c b/tc/m_xt_old.c
index 6a4509a..dd27adf 100644
--- a/tc/m_xt_old.c
+++ b/tc/m_xt_old.c
@@ -177,7 +177,8 @@ build_st(struct xtables_target *target, struct xt_entry_target *t)
if (t == NULL) {
target->t = fw_calloc(1, size);
target->t->u.target_size = size;
- strcpy(target->t->u.user.name, target->name);
+ strlcpy(target->t->u.user.name, target->name,
+ sizeof(target->t->u.user.name));
set_revision(target->t->u.user.name, target->revision);
if (target->init != NULL)
^ permalink raw reply related
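A minimal sketch of the bounded-copy behaviour the review asks for, assuming strlcpy()-like semantics: the destination is never overrun and is always NUL-terminated. The name bounded_copy is hypothetical and stands in for strlcpy(), which is not available in every C library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for strlcpy(): copy at most size - 1 bytes of
 * src into dst and NUL-terminate whenever size > 0. Returns strlen(src),
 * so a return value >= size tells the caller the copy was truncated.
 */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len;

	assert(dst != NULL && src != NULL);
	len = strlen(src);
	if (size > 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

Passing sizeof() of the destination array, as the patch does with sizeof(target->t->u.user.name), keeps the bound tied to the actual buffer rather than to a separately maintained constant.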
* Re: [PATCH v1 net-next 00/15] tc-taprio offload for SJA1105 DSA
From: David Miller @ 2019-09-07 13:55 UTC (permalink / raw)
To: olteanv
Cc: f.fainelli, vivien.didelot, andrew, vinicius.gomes, vedang.patel,
richardcochran, weifeng.voon, jiri, m-karicheri2, Jose.Abreu,
ilias.apalodimas, jhs, xiyou.wangcong, kurt.kanzenbach, netdev
In-Reply-To: <20190902162544.24613-1-olteanv@gmail.com>
This is a warning that I will toss this patch series if it receives no serious
review in the next couple of days.
Thank you.
^ permalink raw reply
* Re: [PATCH v3] ipv6: Not to probe neighbourless routes
From: David Miller @ 2019-09-07 13:54 UTC (permalink / raw)
To: wang.yi59
Cc: kuznet, yoshfuji, netdev, linux-kernel, xue.zhihong, wang.liang82,
cheng.lin130
In-Reply-To: <1567145476-33802-1-git-send-email-wang.yi59@zte.com.cn>
From: Cheng Lin <wang.yi59@zte.com.cn>
Date: Fri, 30 Aug 2019 14:11:16 +0800
> Originally, Router Reachability Probing required a neighbour entry to
> exist. Commit 2152caea7196 ("ipv6: Do not depend on rt->n in
> rt6_probe().") removed the requirement for a neighbour entry, and
> commit f547fac624be ("ipv6: rate-limit probes for neighbourless
> routes") added rate-limiting for neighbourless routes.
I am not going to apply this patch.
The reason we handle neighbourless routes is that, due to the
disconnect between routes and neighbour entries, we would lose
information with your suggested change.
Originally, all routes held a reference to a neighbour entry.
Therefore we'd always have a neigh entry for any neigh message
matching a route.
But these two object pools (routes and neigh entries) are completely
disconnected. We only look up a neigh entry when sending a packet
on behalf of a route.
Therefore, neigh entries can be purged arbitrarily even if hundreds of
routes refer to them. And this means it is very important to accept
and process probes even for neighbourless routes.
I would also not recommend, in the future, reading RFC requirements
literally without taking into consideration the details of Linux's
specific implementation of ipv6 routing and neighbours.
Thank you.
^ permalink raw reply
* [PATCH net v2 11/11] net: remove unnecessary variables and callback
From: Taehee Yoo @ 2019-09-07 13:48 UTC (permalink / raw)
To: davem, netdev, j.vosburgh, vfalico, andy, jiri, sd, roopa, saeedm,
manishc, rahulv, kys, haiyangz, sthemmin, sashal, hare, varun,
ubraun, kgraul, jay.vosburgh
Cc: ap420073
This patch removes the variables and the callback that are related to the
nested device structure.
Devices that can be nested have their own nest_level variable that
represents the depth of nested devices.
In the previous patch, new {lower/upper}_level variables were added and
they replace the old private nest_level variable.
So, this patch removes all 'nest_level' variables.
In order to avoid a lockdep warning, ->ndo_get_lock_subclass() was added
to get the lockdep subclass value, which is actually the lower nested depth
value.
But now, they use the dynamic lockdep key to avoid the lockdep warning
instead of the subclass.
So, this patch removes the ->ndo_get_lock_subclass() callback.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v1 -> v2 : this patch isn't changed
drivers/net/bonding/bond_alb.c | 2 +-
drivers/net/bonding/bond_main.c | 14 -------------
.../net/ethernet/mellanox/mlx5/core/en_tc.c | 2 +-
drivers/net/macsec.c | 9 ---------
drivers/net/macvlan.c | 7 -------
include/linux/if_macvlan.h | 1 -
include/linux/if_vlan.h | 12 -----------
include/linux/netdevice.h | 12 -----------
include/net/bonding.h | 1 -
net/8021q/vlan.c | 1 -
net/8021q/vlan_dev.c | 6 ------
net/core/dev.c | 20 -------------------
net/core/dev_addr_lists.c | 12 +++++------
net/smc/smc_core.c | 2 +-
net/smc/smc_pnet.c | 2 +-
15 files changed, 10 insertions(+), 93 deletions(-)
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 8c79bad2a9a5..4f2e6910c623 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -952,7 +952,7 @@ static int alb_upper_dev_walk(struct net_device *upper, void *_data)
struct bond_vlan_tag *tags;
if (is_vlan_dev(upper) &&
- bond->nest_level == vlan_get_encap_level(upper) - 1) {
+ bond->dev->lower_level == upper->lower_level - 1) {
if (upper->addr_assign_type == NET_ADDR_STOLEN) {
alb_send_lp_vid(slave, mac_addr,
vlan_dev_vlan_proto(upper),
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 7f574e74ed78..69eb61466fbe 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1733,8 +1733,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
goto err_upper_unlink;
}
- bond->nest_level = dev_get_nest_level(bond_dev) + 1;
-
/* If the mode uses primary, then the following is handled by
* bond_change_active_slave().
*/
@@ -1983,9 +1981,6 @@ static int __bond_release_one(struct net_device *bond_dev,
if (!bond_has_slaves(bond)) {
bond_set_carrier(bond);
eth_hw_addr_random(bond_dev);
- bond->nest_level = SINGLE_DEPTH_NESTING;
- } else {
- bond->nest_level = dev_get_nest_level(bond_dev) + 1;
}
unblock_netpoll_tx();
@@ -3472,13 +3467,6 @@ static void bond_fold_stats(struct rtnl_link_stats64 *_res,
}
}
-static int bond_get_nest_level(struct net_device *bond_dev)
-{
- struct bonding *bond = netdev_priv(bond_dev);
-
- return bond->nest_level;
-}
-
static void bond_get_stats(struct net_device *bond_dev,
struct rtnl_link_stats64 *stats)
{
@@ -4298,7 +4286,6 @@ static const struct net_device_ops bond_netdev_ops = {
.ndo_neigh_setup = bond_neigh_setup,
.ndo_vlan_rx_add_vid = bond_vlan_rx_add_vid,
.ndo_vlan_rx_kill_vid = bond_vlan_rx_kill_vid,
- .ndo_get_lock_subclass = bond_get_nest_level,
#ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_netpoll_setup = bond_netpoll_setup,
.ndo_netpoll_cleanup = bond_netpoll_cleanup,
@@ -4822,7 +4809,6 @@ static int bond_init(struct net_device *bond_dev)
if (!bond->wq)
return -ENOMEM;
- bond->nest_level = SINGLE_DEPTH_NESTING;
bond_dev_set_lockdep_class(bond_dev);
list_add_tail(&bond->bond_list, &bn->dev_list);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 00b2d4a86159..e056f9aad8df 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -2797,7 +2797,7 @@ static int add_vlan_pop_action(struct mlx5e_priv *priv,
struct mlx5_esw_flow_attr *attr,
u32 *action)
{
- int nest_level = vlan_get_encap_level(attr->parse_attr->filter_dev);
+ int nest_level = attr->parse_attr->filter_dev->lower_level;
struct flow_action_entry vlan_act = {
.id = FLOW_ACTION_VLAN_POP,
};
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index 41ec1ed0d545..c0cb595f2bba 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -269,7 +269,6 @@ struct macsec_dev {
struct gro_cells gro_cells;
struct lock_class_key xmit_lock_key;
struct lock_class_key addr_lock_key;
- unsigned int nest_level;
};
/**
@@ -2988,11 +2987,6 @@ static int macsec_get_iflink(const struct net_device *dev)
return macsec_priv(dev)->real_dev->ifindex;
}
-static int macsec_get_nest_level(struct net_device *dev)
-{
- return macsec_priv(dev)->nest_level;
-}
-
static const struct net_device_ops macsec_netdev_ops = {
.ndo_init = macsec_dev_init,
.ndo_uninit = macsec_dev_uninit,
@@ -3006,7 +3000,6 @@ static const struct net_device_ops macsec_netdev_ops = {
.ndo_start_xmit = macsec_start_xmit,
.ndo_get_stats64 = macsec_get_stats64,
.ndo_get_iflink = macsec_get_iflink,
- .ndo_get_lock_subclass = macsec_get_nest_level,
};
static const struct device_type macsec_type = {
@@ -3289,8 +3282,6 @@ static int macsec_newlink(struct net *net, struct net_device *dev,
if (err < 0)
return err;
- macsec->nest_level = dev_get_nest_level(real_dev) + 1;
-
err = netdev_upper_dev_link(real_dev, dev, extack);
if (err < 0)
goto unregister;
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index dae368a2e8d1..2c14bc606514 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -867,11 +867,6 @@ static int macvlan_do_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
#define MACVLAN_STATE_MASK \
((1<<__LINK_STATE_NOCARRIER) | (1<<__LINK_STATE_DORMANT))
-static int macvlan_get_nest_level(struct net_device *dev)
-{
- return ((struct macvlan_dev *)netdev_priv(dev))->nest_level;
-}
-
static void macvlan_dev_set_lockdep_one(struct net_device *dev,
struct netdev_queue *txq,
void *_unused)
@@ -1180,7 +1175,6 @@ static const struct net_device_ops macvlan_netdev_ops = {
.ndo_fdb_add = macvlan_fdb_add,
.ndo_fdb_del = macvlan_fdb_del,
.ndo_fdb_dump = ndo_dflt_fdb_dump,
- .ndo_get_lock_subclass = macvlan_get_nest_level,
#ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller = macvlan_dev_poll_controller,
.ndo_netpoll_setup = macvlan_dev_netpoll_setup,
@@ -1464,7 +1458,6 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
vlan->dev = dev;
vlan->port = port;
vlan->set_features = MACVLAN_FEATURES;
- vlan->nest_level = dev_get_nest_level(lowerdev) + 1;
vlan->mode = MACVLAN_MODE_VEPA;
if (data && data[IFLA_MACVLAN_MODE])
diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h
index ea5b41823287..e9202edcf101 100644
--- a/include/linux/if_macvlan.h
+++ b/include/linux/if_macvlan.h
@@ -29,7 +29,6 @@ struct macvlan_dev {
netdev_features_t set_features;
enum macvlan_mode mode;
u16 flags;
- int nest_level;
unsigned int macaddr_count;
struct lock_class_key xmit_lock_key;
struct lock_class_key addr_lock_key;
diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 1aed9f613e90..6f30284a58e5 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -182,8 +182,6 @@ struct vlan_dev_priv {
#ifdef CONFIG_NET_POLL_CONTROLLER
struct netpoll *netpoll;
#endif
- unsigned int nest_level;
-
struct lock_class_key xmit_lock_key;
struct lock_class_key addr_lock_key;
};
@@ -224,11 +222,6 @@ extern void vlan_vids_del_by_dev(struct net_device *dev,
extern bool vlan_uses_dev(const struct net_device *dev);
-static inline int vlan_get_encap_level(struct net_device *dev)
-{
- BUG_ON(!is_vlan_dev(dev));
- return vlan_dev_priv(dev)->nest_level;
-}
#else
static inline struct net_device *
__vlan_find_dev_deep_rcu(struct net_device *real_dev,
@@ -298,11 +291,6 @@ static inline bool vlan_uses_dev(const struct net_device *dev)
{
return false;
}
-static inline int vlan_get_encap_level(struct net_device *dev)
-{
- BUG();
- return 0;
-}
#endif
/**
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 309ae000bae7..e13db714ee85 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1408,7 +1408,6 @@ struct net_device_ops {
void (*ndo_dfwd_del_station)(struct net_device *pdev,
void *priv);
- int (*ndo_get_lock_subclass)(struct net_device *dev);
int (*ndo_set_tx_maxrate)(struct net_device *dev,
int queue_index,
u32 maxrate);
@@ -4047,16 +4046,6 @@ static inline void netif_addr_lock(struct net_device *dev)
spin_lock(&dev->addr_list_lock);
}
-static inline void netif_addr_lock_nested(struct net_device *dev)
-{
- int subclass = SINGLE_DEPTH_NESTING;
-
- if (dev->netdev_ops->ndo_get_lock_subclass)
- subclass = dev->netdev_ops->ndo_get_lock_subclass(dev);
-
- spin_lock_nested(&dev->addr_list_lock, subclass);
-}
-
static inline void netif_addr_lock_bh(struct net_device *dev)
{
spin_lock_bh(&dev->addr_list_lock);
@@ -4334,7 +4323,6 @@ void netdev_lower_state_changed(struct net_device *lower_dev,
extern u8 netdev_rss_key[NETDEV_RSS_KEY_LEN] __read_mostly;
void netdev_rss_key_fill(void *buffer, size_t len);
-int dev_get_nest_level(struct net_device *dev);
int skb_checksum_help(struct sk_buff *skb);
int skb_crc32c_csum_help(struct sk_buff *skb);
int skb_csum_hwoffload_help(struct sk_buff *skb,
diff --git a/include/net/bonding.h b/include/net/bonding.h
index c39ac7061e41..74f41dd73866 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -203,7 +203,6 @@ struct bonding {
struct slave __rcu *primary_slave;
struct bond_up_slave __rcu *slave_arr; /* Array of usable slaves */
bool force_primary;
- u32 nest_level;
s32 slave_cnt; /* never change this value outside the attach/detach wrappers */
int (*recv_probe)(const struct sk_buff *, struct bonding *,
struct slave *);
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 54728d2eda18..d4bcfd8f95bf 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -172,7 +172,6 @@ int register_vlan_dev(struct net_device *dev, struct netlink_ext_ack *extack)
if (err < 0)
goto out_uninit_mvrp;
- vlan->nest_level = dev_get_nest_level(real_dev) + 1;
err = register_netdevice(dev);
if (err < 0)
goto out_uninit_mvrp;
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 12bc80650087..e8707827540c 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -514,11 +514,6 @@ static void vlan_dev_set_lockdep_class(struct net_device *dev)
netdev_for_each_tx_queue(dev, vlan_dev_set_lockdep_one, NULL);
}
-static int vlan_dev_get_lock_subclass(struct net_device *dev)
-{
- return vlan_dev_priv(dev)->nest_level;
-}
-
static const struct header_ops vlan_header_ops = {
.create = vlan_dev_hard_header,
.parse = eth_header_parse,
@@ -814,7 +809,6 @@ static const struct net_device_ops vlan_netdev_ops = {
.ndo_netpoll_cleanup = vlan_dev_netpoll_cleanup,
#endif
.ndo_fix_features = vlan_dev_fix_features,
- .ndo_get_lock_subclass = vlan_dev_get_lock_subclass,
.ndo_get_iflink = vlan_dev_get_iflink,
};
diff --git a/net/core/dev.c b/net/core/dev.c
index ac055b531c96..73a69a7a3553 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7510,26 +7510,6 @@ void *netdev_lower_dev_get_private(struct net_device *dev,
}
EXPORT_SYMBOL(netdev_lower_dev_get_private);
-
-int dev_get_nest_level(struct net_device *dev)
-{
- struct net_device *lower = NULL;
- struct list_head *iter;
- int max_nest = -1;
- int nest;
-
- ASSERT_RTNL();
-
- netdev_for_each_lower_dev(dev, lower, iter) {
- nest = dev_get_nest_level(lower);
- if (max_nest < nest)
- max_nest = nest;
- }
-
- return max_nest + 1;
-}
-EXPORT_SYMBOL(dev_get_nest_level);
-
/**
* netdev_lower_change - Dispatch event about lower device state change
* @lower_dev: device
diff --git a/net/core/dev_addr_lists.c b/net/core/dev_addr_lists.c
index 6393ba930097..2f949b5a1eb9 100644
--- a/net/core/dev_addr_lists.c
+++ b/net/core/dev_addr_lists.c
@@ -637,7 +637,7 @@ int dev_uc_sync(struct net_device *to, struct net_device *from)
if (to->addr_len != from->addr_len)
return -EINVAL;
- netif_addr_lock_nested(to);
+ netif_addr_lock(to);
err = __hw_addr_sync(&to->uc, &from->uc, to->addr_len);
if (!err)
__dev_set_rx_mode(to);
@@ -667,7 +667,7 @@ int dev_uc_sync_multiple(struct net_device *to, struct net_device *from)
if (to->addr_len != from->addr_len)
return -EINVAL;
- netif_addr_lock_nested(to);
+ netif_addr_lock(to);
err = __hw_addr_sync_multiple(&to->uc, &from->uc, to->addr_len);
if (!err)
__dev_set_rx_mode(to);
@@ -691,7 +691,7 @@ void dev_uc_unsync(struct net_device *to, struct net_device *from)
return;
netif_addr_lock_bh(from);
- netif_addr_lock_nested(to);
+ netif_addr_lock(to);
__hw_addr_unsync(&to->uc, &from->uc, to->addr_len);
__dev_set_rx_mode(to);
netif_addr_unlock(to);
@@ -858,7 +858,7 @@ int dev_mc_sync(struct net_device *to, struct net_device *from)
if (to->addr_len != from->addr_len)
return -EINVAL;
- netif_addr_lock_nested(to);
+ netif_addr_lock(to);
err = __hw_addr_sync(&to->mc, &from->mc, to->addr_len);
if (!err)
__dev_set_rx_mode(to);
@@ -888,7 +888,7 @@ int dev_mc_sync_multiple(struct net_device *to, struct net_device *from)
if (to->addr_len != from->addr_len)
return -EINVAL;
- netif_addr_lock_nested(to);
+ netif_addr_lock(to);
err = __hw_addr_sync_multiple(&to->mc, &from->mc, to->addr_len);
if (!err)
__dev_set_rx_mode(to);
@@ -912,7 +912,7 @@ void dev_mc_unsync(struct net_device *to, struct net_device *from)
return;
netif_addr_lock_bh(from);
- netif_addr_lock_nested(to);
+ netif_addr_lock(to);
__hw_addr_unsync(&to->mc, &from->mc, to->addr_len);
__dev_set_rx_mode(to);
netif_addr_unlock(to);
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 4ca50ddf8d16..a2e91b8d04b3 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -558,7 +558,7 @@ int smc_vlan_by_tcpsk(struct socket *clcsock, struct smc_init_info *ini)
}
rtnl_lock();
- nest_lvl = dev_get_nest_level(ndev);
+ nest_lvl = ndev->lower_level;
for (i = 0; i < nest_lvl; i++) {
struct list_head *lower = &ndev->adj_list.lower;
diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
index bab2da8cf17a..2920b006f65c 100644
--- a/net/smc/smc_pnet.c
+++ b/net/smc/smc_pnet.c
@@ -718,7 +718,7 @@ static struct net_device *pnet_find_base_ndev(struct net_device *ndev)
int i, nest_lvl;
rtnl_lock();
- nest_lvl = dev_get_nest_level(ndev);
+ nest_lvl = ndev->lower_level;
for (i = 0; i < nest_lvl; i++) {
struct list_head *lower = &ndev->adj_list.lower;
--
2.17.1
^ permalink raw reply related
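A toy sketch of what the removed dev_get_nest_level() walk computed, assuming a simplified device model. The struct toy_dev type, MAX_LOWER and toy_nest_level() are hypothetical and exist only for illustration. The lower_level field that replaces the walk comes from an earlier patch in the series and is not modelled here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOWER 4  /* arbitrary bound for this toy model */

/* Toy stand-in for a net_device and its directly lower devices. */
struct toy_dev {
	struct toy_dev *lower[MAX_LOWER];
	size_t n_lower;
};

/*
 * Same shape as the removed helper: 0 for a device with no lower
 * devices, otherwise 1 + the nest level of its deepest lower device.
 * Every call re-walks the whole lower hierarchy recursively.
 */
static int toy_nest_level(const struct toy_dev *dev)
{
	int max_nest = -1;
	size_t i;

	assert(dev != NULL);
	for (i = 0; i < dev->n_lower; i++) {
		int nest = toy_nest_level(dev->lower[i]);

		if (max_nest < nest)
			max_nest = nest;
	}
	return max_nest + 1;
}
```

With a base device, a bond on top of it and a VLAN on top of the bond, the walk returns 0, 1 and 2 respectively. Reading a level stored on the device avoids repeating this recursion on every query.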
* Re: [PATCH] net/hamradio/6pack: Fix the size of a sk_buff used in 'sp_bump()'
From: David Miller @ 2019-09-07 13:48 UTC (permalink / raw)
To: christophe.jaillet; +Cc: ajk, linux-hams, netdev, linux-kernel, kernel-janitors
In-Reply-To: <20190826190209.16795-1-christophe.jaillet@wanadoo.fr>
From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date: Mon, 26 Aug 2019 21:02:09 +0200
> We 'allocate' 'count' bytes here. In fact, 'dev_alloc_skb' already adds some
> extra space for padding, so a bit more is allocated.
>
> However, we use 1 byte for the KISS command, then copy 'count' bytes, so
> count+1 bytes.
>
> Explicitly allocate and use 1 more byte to be safe.
>
> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
I applied your patch as-is, as it is correct and doesn't change the contents
of the data put into the SKB at all.
->rcount is the cooked count minus two, but then we copy effectively
cooked count minus one bytes from one byte past the beginning of the
cooked buffer and so all the accesses are in range on the input buffer
side.
^ permalink raw reply
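A minimal userspace sketch of the sizing argument in this thread: one command byte plus 'count' payload bytes needs count + 1 bytes of storage. The function build_frame() is hypothetical and uses malloc() rather than an sk_buff; it only illustrates the arithmetic, not the 6pack driver.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical frame builder: a leading command byte followed by
 * 'count' payload bytes, so the buffer must hold count + 1 bytes.
 * Returns a malloc()ed buffer that the caller frees, or NULL.
 */
static unsigned char *build_frame(unsigned char cmd,
				  const unsigned char *payload,
				  size_t count, size_t *out_len)
{
	unsigned char *buf;

	assert(payload != NULL && out_len != NULL);
	buf = malloc(count + 1);         /* +1 for the command byte */
	if (!buf)
		return NULL;
	buf[0] = cmd;                    /* command byte comes first */
	memcpy(buf + 1, payload, count); /* then the payload */
	*out_len = count + 1;
	return buf;
}
```

Allocating only 'count' bytes here would leave the last payload byte outside the buffer, which is the situation the patch guards against by asking for the extra byte explicitly.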
* [PATCH net v2 10/11] vxlan: add adjacent link to limit depth level
From: Taehee Yoo @ 2019-09-07 13:47 UTC (permalink / raw)
To: davem, netdev, j.vosburgh, vfalico, andy, jiri, sd, roopa, saeedm,
manishc, rahulv, kys, haiyangz, sthemmin, sashal, hare, varun,
ubraun, kgraul, jay.vosburgh
Cc: ap420073
The current vxlan code doesn't limit the number of nested devices.
Nested devices are handled recursively, and this recursion needs a
large amount of stack memory. So an unlimited number of nested devices
can overflow the stack.
To fix this issue, this patch adds adjacent links.
The adjacent link APIs internally check the depth level.
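The depth check can be sketched with a toy model. This is a simplified user-space illustration, not the kernel implementation: struct toy_dev and the toy_* helpers are hypothetical, the limit value of 8 is an assumption made for the sketch, and only a fresh device stacked on top of a single chain is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_NEST_DEV 8 /* assumed limit, chosen for this sketch */

/* Hypothetical stand-in for a net_device in a single chain. */
struct toy_dev {
    struct toy_dev *lower;    /* device below, NULL for the bottom */
    unsigned int lower_level; /* devices at or below, itself included */
};

static void toy_dev_init(struct toy_dev *dev)
{
    dev->lower = NULL;
    dev->lower_level = 1;
}

/* Stack a fresh 'upper' on 'lower'; refuse when the chain gets too deep. */
static int toy_upper_dev_link(struct toy_dev *lower, struct toy_dev *upper)
{
    assert(lower != NULL && upper != NULL);

    /* a fresh upper device contributes one level of its own */
    if (lower->lower_level + 1 > MAX_NEST_DEV)
        return -EMLINK;

    upper->lower = lower;
    upper->lower_level = lower->lower_level + 1;
    return 0;
}

/* Try to stack n devices; return how many ended up in the chain. */
static unsigned int toy_build_chain(struct toy_dev *devs, unsigned int n)
{
    unsigned int i;

    if (n == 0)
        return 0;

    toy_dev_init(&devs[0]);
    for (i = 1; i < n; i++) {
        toy_dev_init(&devs[i]);
        if (toy_upper_dev_link(&devs[i - 1], &devs[i]) < 0)
            break;
    }
    return i;
}
```

Because the link is refused once the limit is reached, any recursive teardown of the chain is bounded by the same depth.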
Test commands:
ip link add dummy0 type dummy
ip link add vxlan0 type vxlan id 0 group 239.1.1.1 dev dummy0 \
dstport 4789
for i in {1..100}
do
let A=$i-1
ip link add vxlan$i type vxlan id $i group 239.1.1.1 \
dev vxlan$A dstport 4789
done
ip link del dummy0
The top upper link is vxlan100 and the lowest link is vxlan0.
When vxlan0 is being deleted, the upper devices are deleted recursively.
This needs a large amount of stack memory, so it overflows the stack.
Splat looks like:
[ 229.628477] =============================================================================
[ 229.629785] BUG page->ptl (Not tainted): Padding overwritten. 0x0000000026abf214-0x0000000091f6abb2
[ 229.629785] -----------------------------------------------------------------------------
[ 229.629785]
[ 229.655439] ==================================================================
[ 229.629785] INFO: Slab 0x00000000ff7cfda8 objects=19 used=19 fp=0x00000000fe33776c flags=0x200000000010200
[ 229.655688] BUG: KASAN: stack-out-of-bounds in unmap_single_vma+0x25a/0x2e0
[ 229.655688] Read of size 8 at addr ffff888113076928 by task vlan-network-in/2334
[ 229.655688]
[ 229.629785] Padding 0000000026abf214: 00 80 14 0d 81 88 ff ff 68 91 81 14 81 88 ff ff ........h.......
[ 229.629785] Padding 0000000001e24790: 38 91 81 14 81 88 ff ff 68 91 81 14 81 88 ff ff 8.......h.......
[ 229.629785] Padding 00000000b39397c8: 33 30 62 a7 ff ff ff ff ff eb 60 22 10 f1 ff 1f 30b.......`"....
[ 229.629785] Padding 00000000bc98f53a: 80 60 07 13 81 88 ff ff 00 80 14 0d 81 88 ff ff .`..............
[ 229.629785] Padding 000000002aa8123d: 68 91 81 14 81 88 ff ff f7 21 17 a7 ff ff ff ff h........!......
[ 229.629785] Padding 000000001c8c2369: 08 81 14 0d 81 88 ff ff 03 02 00 00 00 00 00 00 ................
[ 229.629785] Padding 000000004e290c5d: 21 90 a2 21 10 ed ff ff 00 00 00 00 00 fc ff df !..!............
[ 229.629785] Padding 000000000e25d731: 18 60 07 13 81 88 ff ff c0 8b 13 05 81 88 ff ff .`..............
[ 229.629785] Padding 000000007adc7ab3: b3 8a b5 41 00 00 00 00 ...A....
[ 229.629785] FIX page->ptl: Restoring 0x0000000026abf214-0x0000000091f6abb2=0x5a
[ ... ]
Fixes: acaf4e70997f ("net: vxlan: when lower dev unregisters remove vxlan dev as well")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v1 -> v2 : this patch isn't changed
drivers/net/vxlan.c | 71 ++++++++++++++++++++++++++++++++++++++-------
include/net/vxlan.h | 1 +
2 files changed, 62 insertions(+), 10 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 3d9bcc957f7d..0d5c8d22d8a4 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -3567,6 +3567,8 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev,
struct vxlan_net *vn = net_generic(net, vxlan_net_id);
struct vxlan_dev *vxlan = netdev_priv(dev);
struct vxlan_fdb *f = NULL;
+ struct net_device *remote_dev = NULL;
+ struct vxlan_rdst *dst = &vxlan->default_dst;
bool unregister = false;
int err;
@@ -3577,14 +3579,14 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev,
dev->ethtool_ops = &vxlan_ethtool_ops;
/* create an fdb entry for a valid default destination */
- if (!vxlan_addr_any(&vxlan->default_dst.remote_ip)) {
+ if (!vxlan_addr_any(&dst->remote_ip)) {
err = vxlan_fdb_create(vxlan, all_zeros_mac,
- &vxlan->default_dst.remote_ip,
+ &dst->remote_ip,
NUD_REACHABLE | NUD_PERMANENT,
vxlan->cfg.dst_port,
- vxlan->default_dst.remote_vni,
- vxlan->default_dst.remote_vni,
- vxlan->default_dst.remote_ifindex,
+ dst->remote_vni,
+ dst->remote_vni,
+ dst->remote_ifindex,
NTF_SELF, &f);
if (err)
return err;
@@ -3595,26 +3597,43 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev,
goto errout;
unregister = true;
+ if (dst->remote_ifindex) {
+ remote_dev = __dev_get_by_index(net, dst->remote_ifindex);
+ if (!remote_dev)
+ goto errout;
+
+ err = netdev_upper_dev_link(remote_dev, dev, extack);
+ if (err)
+ goto errout;
+ }
+
err = rtnl_configure_link(dev, NULL);
if (err)
- goto errout;
+ goto unlink;
if (f) {
- vxlan_fdb_insert(vxlan, all_zeros_mac,
- vxlan->default_dst.remote_vni, f);
+ vxlan_fdb_insert(vxlan, all_zeros_mac, dst->remote_vni, f);
/* notify default fdb entry */
err = vxlan_fdb_notify(vxlan, f, first_remote_rtnl(f),
RTM_NEWNEIGH, true, extack);
if (err) {
vxlan_fdb_destroy(vxlan, f, false, false);
+ if (remote_dev)
+ netdev_upper_dev_unlink(remote_dev, dev);
goto unregister;
}
}
list_add(&vxlan->next, &vn->vxlan_list);
+ if (remote_dev) {
+ dst->remote_dev = remote_dev;
+ dev_hold(remote_dev);
+ }
return 0;
-
+unlink:
+ if (remote_dev)
+ netdev_upper_dev_unlink(remote_dev, dev);
errout:
/* unregister_netdevice() destroys the default FDB entry with deletion
* notification. But the addition notification was not sent yet, so
@@ -3936,6 +3955,8 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[],
struct net_device *lowerdev;
struct vxlan_config conf;
int err;
+ bool linked = false;
+ bool disabled = false;
err = vxlan_nl2conf(tb, data, dev, &conf, true, extack);
if (err)
@@ -3946,6 +3967,16 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[],
if (err)
return err;
+ if (lowerdev) {
+ if (dst->remote_dev && lowerdev != dst->remote_dev) {
+ netdev_adjacent_dev_disable(dst->remote_dev, dev);
+ disabled = true;
+ }
+ err = netdev_upper_dev_link(lowerdev, dev, extack);
+ if (err)
+ goto err;
+ linked = true;
+ }
/* handle default dst entry */
if (!vxlan_addr_equal(&conf.remote_ip, &dst->remote_ip)) {
u32 hash_index = fdb_head_index(vxlan, all_zeros_mac, conf.vni);
@@ -3962,7 +3993,7 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[],
NTF_SELF, true, extack);
if (err) {
spin_unlock_bh(&vxlan->hash_lock[hash_index]);
- return err;
+ goto err;
}
}
if (!vxlan_addr_any(&dst->remote_ip))
@@ -3979,8 +4010,24 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[],
if (conf.age_interval != vxlan->cfg.age_interval)
mod_timer(&vxlan->age_timer, jiffies);
+ if (disabled) {
+ netdev_adjacent_dev_enable(dst->remote_dev, dev);
+ netdev_upper_dev_unlink(dst->remote_dev, dev);
+ dev_put(dst->remote_dev);
+ }
+ if (linked) {
+ dst->remote_dev = lowerdev;
+ dev_hold(dst->remote_dev);
+ }
+
vxlan_config_apply(dev, &conf, lowerdev, vxlan->net, true);
return 0;
+err:
+ if (linked)
+ netdev_upper_dev_unlink(lowerdev, dev);
+ if (disabled)
+ netdev_adjacent_dev_enable(dst->remote_dev, dev);
+ return err;
}
static void vxlan_dellink(struct net_device *dev, struct list_head *head)
@@ -3991,6 +4038,10 @@ static void vxlan_dellink(struct net_device *dev, struct list_head *head)
list_del(&vxlan->next);
unregister_netdevice_queue(dev, head);
+ if (vxlan->default_dst.remote_dev) {
+ netdev_upper_dev_unlink(vxlan->default_dst.remote_dev, dev);
+ dev_put(vxlan->default_dst.remote_dev);
+ }
}
static size_t vxlan_get_size(const struct net_device *dev)
diff --git a/include/net/vxlan.h b/include/net/vxlan.h
index dc1583a1fb8a..08e237d7aa73 100644
--- a/include/net/vxlan.h
+++ b/include/net/vxlan.h
@@ -197,6 +197,7 @@ struct vxlan_rdst {
u8 offloaded:1;
__be32 remote_vni;
u32 remote_ifindex;
+ struct net_device *remote_dev;
struct list_head list;
struct rcu_head rcu;
struct dst_cache dst_cache;
--
2.17.1
^ permalink raw reply related
* [PATCH net v2 09/11] net: core: add ignore flag to netdev_adjacent structure
From: Taehee Yoo @ 2019-09-07 13:47 UTC (permalink / raw)
To: davem, netdev, j.vosburgh, vfalico, andy, jiri, sd, roopa, saeedm,
manishc, rahulv, kys, haiyangz, sthemmin, sashal, hare, varun,
ubraun, kgraul, jay.vosburgh
Cc: ap420073
netdev_upper_dev_link() is used to link an adjacent node, and
netdev_upper_dev_unlink() is used to unlink one.
The unlink operation does not fail, but the link operation can fail.
To exchange adjacent nodes, we should unlink the old adjacent node
first, then link the new adjacent node.
If the link operation fails, we should link the old adjacent node again.
But this link operation can fail too,
which eventually breaks the adjacent link relationship.
This patch adds an ignore flag to the netdev_adjacent structure.
If this flag is set, netdev_upper_dev_link() ignores the old adjacent
node for a moment.
So we can skip the unlink operation before the link operation.
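The exchange order can be sketched with a toy model. This is a user-space illustration only: struct toy_upper, the toy_* helpers, the injected failure and the "one visible lower link" rule are all hypothetical and exist only to make the link step fail on demand.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ADJ 2

struct toy_adj {
    int lower_id; /* 0 means the slot is free */
    bool ignore;  /* link exists but is skipped by checks */
};

struct toy_upper {
    struct toy_adj adj[TOY_MAX_ADJ];
};

static void toy_init(struct toy_upper *up)
{
    size_t i;

    for (i = 0; i < TOY_MAX_ADJ; i++) {
        up->adj[i].lower_id = 0;
        up->adj[i].ignore = false;
    }
}

static struct toy_adj *toy_find(struct toy_upper *up, int lower_id)
{
    size_t i;

    assert(lower_id != 0);
    for (i = 0; i < TOY_MAX_ADJ; i++)
        if (up->adj[i].lower_id == lower_id)
            return &up->adj[i];
    return NULL;
}

/* Toy rule: only one visible (non-ignored) lower link is allowed. */
static int toy_link(struct toy_upper *up, int lower_id, bool inject_failure)
{
    struct toy_adj *free_slot = NULL;
    size_t i;

    for (i = 0; i < TOY_MAX_ADJ; i++) {
        if (up->adj[i].lower_id == 0) {
            if (!free_slot)
                free_slot = &up->adj[i];
        } else if (!up->adj[i].ignore) {
            return -EBUSY;
        }
    }
    if (inject_failure)
        return -ENOMEM;
    if (!free_slot)
        return -ENOSPC;

    free_slot->lower_id = lower_id;
    free_slot->ignore = false;
    return 0;
}

/* Unlinking and flag changes never fail in this model. */
static void toy_unlink(struct toy_upper *up, int lower_id)
{
    struct toy_adj *adj = toy_find(up, lower_id);

    if (adj) {
        adj->lower_id = 0;
        adj->ignore = false;
    }
}

static void toy_set_ignore(struct toy_upper *up, int lower_id, bool val)
{
    struct toy_adj *adj = toy_find(up, lower_id);

    if (adj)
        adj->ignore = val;
}

/* Replace old_id by new_id: disable old, link new, then drop old. */
static int toy_exchange(struct toy_upper *up, int old_id, int new_id,
                        bool inject_failure)
{
    int err;

    toy_set_ignore(up, old_id, true);
    err = toy_link(up, new_id, inject_failure);
    if (err) {
        /* rollback is a flag flip, so it cannot fail */
        toy_set_ignore(up, old_id, false);
        return err;
    }
    toy_set_ignore(up, old_id, false);
    toy_unlink(up, old_id);
    return 0;
}
```

The point of the ordering is that the old link is never removed until the new one is in place, so the error path only has to clear a flag.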
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v1 -> v2 : this patch isn't changed
include/linux/netdevice.h | 4 +
net/core/dev.c | 160 +++++++++++++++++++++++++++++++++-----
2 files changed, 144 insertions(+), 20 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 5bb5756129af..309ae000bae7 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4319,6 +4319,10 @@ int netdev_master_upper_dev_link(struct net_device *dev,
struct netlink_ext_ack *extack);
void netdev_upper_dev_unlink(struct net_device *dev,
struct net_device *upper_dev);
+void netdev_adjacent_dev_disable(struct net_device *upper_dev,
+ struct net_device *lower_dev);
+void netdev_adjacent_dev_enable(struct net_device *upper_dev,
+ struct net_device *lower_dev);
void netdev_adjacent_rename_links(struct net_device *dev, char *oldname);
void *netdev_lower_dev_get_private(struct net_device *dev,
struct net_device *lower_dev);
diff --git a/net/core/dev.c b/net/core/dev.c
index 6a4b4ce62204..ac055b531c96 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6448,6 +6448,9 @@ struct netdev_adjacent {
/* upper master flag, there can only be one master device per list */
bool master;
+ /* lookup ignore flag */
+ bool ignore;
+
/* counter for the number of times this device was added to us */
u16 ref_nr;
@@ -6553,6 +6556,22 @@ struct net_device *netdev_master_upper_dev_get(struct net_device *dev)
}
EXPORT_SYMBOL(netdev_master_upper_dev_get);
+struct net_device *netdev_master_upper_dev_get_ignore(struct net_device *dev)
+{
+ struct netdev_adjacent *upper;
+
+ ASSERT_RTNL();
+
+ if (list_empty(&dev->adj_list.upper))
+ return NULL;
+
+ upper = list_first_entry(&dev->adj_list.upper,
+ struct netdev_adjacent, list);
+ if (likely(upper->master) && !upper->ignore)
+ return upper->dev;
+ return NULL;
+}
+
/**
* netdev_has_any_lower_dev - Check if device is linked to some device
* @dev: device
@@ -6603,8 +6622,9 @@ struct net_device *netdev_upper_get_next_dev_rcu(struct net_device *dev,
}
EXPORT_SYMBOL(netdev_upper_get_next_dev_rcu);
-static struct net_device *netdev_next_upper_dev(struct net_device *dev,
- struct list_head **iter)
+static struct net_device *netdev_next_upper_dev_ignore(struct net_device *dev,
+ struct list_head **iter,
+ bool *ignore)
{
struct netdev_adjacent *upper;
@@ -6614,6 +6634,7 @@ static struct net_device *netdev_next_upper_dev(struct net_device *dev,
return NULL;
*iter = &upper->list;
+ *ignore = upper->ignore;
return upper->dev;
}
@@ -6635,26 +6656,29 @@ static struct net_device *netdev_next_upper_dev_rcu(struct net_device *dev,
return upper->dev;
}
-int netdev_walk_all_upper_dev(struct net_device *dev,
- int (*fn)(struct net_device *dev,
- void *data),
- void *data)
+int netdev_walk_all_upper_dev_ignore(struct net_device *dev,
+ int (*fn)(struct net_device *dev,
+ void *data),
+ void *data)
{
struct net_device *udev;
struct list_head *iter;
int ret;
+ bool ignore;
for (iter = &dev->adj_list.upper,
- udev = netdev_next_upper_dev(dev, &iter);
+ udev = netdev_next_upper_dev_ignore(dev, &iter, &ignore);
udev;
- udev = netdev_next_upper_dev(dev, &iter)) {
+ udev = netdev_next_upper_dev_ignore(dev, &iter, &ignore)) {
+ if (ignore)
+ continue;
/* first is the upper device itself */
ret = fn(udev, data);
if (ret)
return ret;
/* then look at all of its upper devices */
- ret = netdev_walk_all_upper_dev(udev, fn, data);
+ ret = netdev_walk_all_upper_dev_ignore(udev, fn, data);
if (ret)
return ret;
}
@@ -6690,6 +6714,15 @@ int netdev_walk_all_upper_dev_rcu(struct net_device *dev,
}
EXPORT_SYMBOL_GPL(netdev_walk_all_upper_dev_rcu);
+bool netdev_has_upper_dev_ignore(struct net_device *dev,
+ struct net_device *upper_dev)
+{
+ ASSERT_RTNL();
+
+ return netdev_walk_all_upper_dev_ignore(dev, __netdev_has_upper_dev,
+ upper_dev);
+}
+
/**
* netdev_lower_get_next_private - Get the next ->private from the
* lower neighbour list
@@ -6786,6 +6819,23 @@ static struct net_device *netdev_next_lower_dev(struct net_device *dev,
return lower->dev;
}
+static struct net_device *netdev_next_lower_dev_ignore(struct net_device *dev,
+ struct list_head **iter,
+ bool *ignore)
+{
+ struct netdev_adjacent *lower;
+
+ lower = list_entry((*iter)->next, struct netdev_adjacent, list);
+
+ if (&lower->list == &dev->adj_list.lower)
+ return NULL;
+
+ *iter = &lower->list;
+ *ignore = lower->ignore;
+
+ return lower->dev;
+}
+
int netdev_walk_all_lower_dev(struct net_device *dev,
int (*fn)(struct net_device *dev,
void *data),
@@ -6814,6 +6864,36 @@ int netdev_walk_all_lower_dev(struct net_device *dev,
}
EXPORT_SYMBOL_GPL(netdev_walk_all_lower_dev);
+int netdev_walk_all_lower_dev_ignore(struct net_device *dev,
+ int (*fn)(struct net_device *dev,
+ void *data),
+ void *data)
+{
+ struct net_device *ldev;
+ struct list_head *iter;
+ int ret;
+ bool ignore;
+
+ for (iter = &dev->adj_list.lower,
+ ldev = netdev_next_lower_dev_ignore(dev, &iter, &ignore);
+ ldev;
+ ldev = netdev_next_lower_dev_ignore(dev, &iter, &ignore)) {
+ if (ignore)
+ continue;
+ /* first is the lower device itself */
+ ret = fn(ldev, data);
+ if (ret)
+ return ret;
+
+ /* then look at all of its lower devices */
+ ret = netdev_walk_all_lower_dev_ignore(ldev, fn, data);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
static struct net_device *netdev_next_lower_dev_rcu(struct net_device *dev,
struct list_head **iter)
{
@@ -6833,11 +6913,14 @@ static u8 __netdev_upper_depth(struct net_device *dev)
struct net_device *udev;
struct list_head *iter;
u8 max_depth = 0;
+ bool ignore;
for (iter = &dev->adj_list.upper,
- udev = netdev_next_upper_dev(dev, &iter);
+ udev = netdev_next_upper_dev_ignore(dev, &iter, &ignore);
udev;
- udev = netdev_next_upper_dev(dev, &iter)) {
+ udev = netdev_next_upper_dev_ignore(dev, &iter, &ignore)) {
+ if (ignore)
+ continue;
if (max_depth < udev->upper_level)
max_depth = udev->upper_level;
}
@@ -6850,11 +6933,14 @@ static u8 __netdev_lower_depth(struct net_device *dev)
struct net_device *ldev;
struct list_head *iter;
u8 max_depth = 0;
+ bool ignore;
for (iter = &dev->adj_list.lower,
- ldev = netdev_next_lower_dev(dev, &iter);
+ ldev = netdev_next_lower_dev_ignore(dev, &iter, &ignore);
ldev;
- ldev = netdev_next_lower_dev(dev, &iter)) {
+ ldev = netdev_next_lower_dev_ignore(dev, &iter, &ignore)) {
+ if (ignore)
+ continue;
if (max_depth < ldev->lower_level)
max_depth = ldev->lower_level;
}
@@ -6999,6 +7085,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
adj->master = master;
adj->ref_nr = 1;
adj->private = private;
+ adj->ignore = false;
dev_hold(adj_dev);
pr_debug("Insert adjacency: dev %s adj_dev %s adj->ref_nr %d; dev_hold on %s\n",
@@ -7149,17 +7236,17 @@ static int __netdev_upper_dev_link(struct net_device *dev,
return -EBUSY;
/* To prevent loops, check if dev is not upper device to upper_dev. */
- if (netdev_has_upper_dev(upper_dev, dev))
+ if (netdev_has_upper_dev_ignore(upper_dev, dev))
return -EBUSY;
if ((dev->lower_level + upper_dev->upper_level) > MAX_NEST_DEV)
return -EMLINK;
if (!master) {
- if (netdev_has_upper_dev(dev, upper_dev))
+ if (netdev_has_upper_dev_ignore(dev, upper_dev))
return -EEXIST;
} else {
- master_dev = netdev_master_upper_dev_get(dev);
+ master_dev = netdev_master_upper_dev_get_ignore(dev);
if (master_dev)
return master_dev == upper_dev ? -EEXIST : -EBUSY;
}
@@ -7182,10 +7269,12 @@ static int __netdev_upper_dev_link(struct net_device *dev,
goto rollback;
__netdev_update_upper_level(dev, NULL);
- netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL);
+ netdev_walk_all_lower_dev_ignore(dev, __netdev_update_upper_level,
+ NULL);
__netdev_update_lower_level(upper_dev, NULL);
- netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, NULL);
+ netdev_walk_all_upper_dev_ignore(upper_dev,
+ __netdev_update_lower_level, NULL);
return 0;
@@ -7271,13 +7360,44 @@ void netdev_upper_dev_unlink(struct net_device *dev,
&changeupper_info.info);
__netdev_update_upper_level(dev, NULL);
- netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL);
+ netdev_walk_all_lower_dev_ignore(dev, __netdev_update_upper_level,
+ NULL);
__netdev_update_lower_level(upper_dev, NULL);
- netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, NULL);
+ netdev_walk_all_upper_dev_ignore(upper_dev,
+ __netdev_update_lower_level, NULL);
}
EXPORT_SYMBOL(netdev_upper_dev_unlink);
+void __netdev_adjacent_dev_set(struct net_device *upper_dev,
+ struct net_device *lower_dev,
+ bool val)
+{
+ struct netdev_adjacent *adj;
+
+ adj = __netdev_find_adj(lower_dev, &upper_dev->adj_list.lower);
+ if (adj)
+ adj->ignore = val;
+
+ adj = __netdev_find_adj(upper_dev, &lower_dev->adj_list.upper);
+ if (adj)
+ adj->ignore = val;
+}
+
+void netdev_adjacent_dev_disable(struct net_device *upper_dev,
+ struct net_device *lower_dev)
+{
+ __netdev_adjacent_dev_set(upper_dev, lower_dev, true);
+}
+EXPORT_SYMBOL(netdev_adjacent_dev_disable);
+
+void netdev_adjacent_dev_enable(struct net_device *upper_dev,
+ struct net_device *lower_dev)
+{
+ __netdev_adjacent_dev_set(upper_dev, lower_dev, false);
+}
+EXPORT_SYMBOL(netdev_adjacent_dev_enable);
+
/**
* netdev_bonding_info_change - Dispatch event about slave change
* @dev: device
--
2.17.1
^ permalink raw reply related
* [PATCH net v2 08/11] macsec: fix refcnt leak in module exit routine
From: Taehee Yoo @ 2019-09-07 13:47 UTC (permalink / raw)
To: davem, netdev, j.vosburgh, vfalico, andy, jiri, sd, roopa, saeedm,
manishc, rahulv, kys, haiyangz, sthemmin, sashal, hare, varun,
ubraun, kgraul, jay.vosburgh
Cc: ap420073
When a macsec interface is created, it takes a refcnt on its lower
device (the real device). When the macsec interface is deleted, the
refcnt is dropped in macsec_free_netdev(), which is the
->priv_destructor() of the macsec interface.
The problem scenario is as follows.
When nested macsec interfaces are removed, the exit routine of the
macsec module leaks refcnts.
Test commands:
ip link add dummy0 type dummy
ip link add macsec0 link dummy0 type macsec
ip link add macsec1 link macsec0 type macsec
modprobe -rv macsec
[ 208.629433] unregister_netdevice: waiting for macsec0 to become free. Usage count = 1
The steps of the macsec module's exit routine are as follows.
1. Calls ->dellink() in __rtnl_link_unregister().
2. Checks the refcnt in netdev_run_todo() and waits until it drops
to 0.
3. Calls ->priv_destructor() in netdev_run_todo().
Step 2 checks the refcnt, but the refcnt is only dropped in step 3.
So step 2 waits forever.
With this patch, the macsec module no longer takes its own refcnt on
the lower device, because netdev_upper_dev_link() already holds a
refcnt on the lower device.
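The ordering problem in the steps above can be sketched with a toy model. This is a user-space illustration only: struct toy_netdev and the toy_* helpers are hypothetical, only the references taken on the lower device are modelled, and the todo order (macsec0 before macsec1) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_netdev {
    int refcnt;
    struct toy_netdev *real_dev; /* lower device, NULL if none */
    bool extra_hold;             /* driver took its own reference */
};

/* Creation: the adjacency link holds one reference on the lower device;
 * the driver optionally holds a second one. */
static void toy_newlink(struct toy_netdev *dev, struct toy_netdev *real_dev,
                        bool extra_hold)
{
    dev->real_dev = real_dev;
    dev->extra_hold = extra_hold;
    real_dev->refcnt++;          /* reference held by the link */
    if (extra_hold)
        real_dev->refcnt++;      /* reference held by the driver */
}

/* Step 1: ->dellink() removes the link and drops its reference. */
static void toy_dellink(struct toy_netdev *dev)
{
    assert(dev->real_dev != NULL);
    dev->real_dev->refcnt--;
}

/* Steps 2 and 3 for each queued device, in order. Returns the index of
 * the first device whose wait would never finish, or -1 if all finish. */
static int toy_run_todo(struct toy_netdev **todo, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (todo[i]->refcnt != 0)
            return (int)i;                 /* step 2 blocks here */
        if (todo[i]->extra_hold)
            todo[i]->real_dev->refcnt--;   /* step 3: destructor */
    }
    return -1;
}

/* dummy0 <- macsec0 <- macsec1, then module exit. */
static int toy_module_exit_scenario(bool extra_hold)
{
    struct toy_netdev dummy0 = { 0, NULL, false };
    struct toy_netdev macsec0 = { 0, NULL, false };
    struct toy_netdev macsec1 = { 0, NULL, false };
    struct toy_netdev *todo[2] = { &macsec0, &macsec1 };

    toy_newlink(&macsec0, &dummy0, extra_hold);
    toy_newlink(&macsec1, &macsec0, extra_hold);

    toy_dellink(&macsec0);
    toy_dellink(&macsec1);

    return toy_run_todo(todo, 2);
}
```

In this model, the extra reference that macsec1 holds on macsec0 would only be dropped by macsec1's destructor, which runs after the wait on macsec0; without the extra reference, both waits finish.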
Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v1 -> v2 : this patch isn't changed
drivers/net/macsec.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index 25a4fc88145d..41ec1ed0d545 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -3031,12 +3031,10 @@ static const struct nla_policy macsec_rtnl_policy[IFLA_MACSEC_MAX + 1] = {
static void macsec_free_netdev(struct net_device *dev)
{
struct macsec_dev *macsec = macsec_priv(dev);
- struct net_device *real_dev = macsec->real_dev;
free_percpu(macsec->stats);
free_percpu(macsec->secy.tx_sc.stats);
- dev_put(real_dev);
}
static void macsec_setup(struct net_device *dev)
@@ -3291,8 +3289,6 @@ static int macsec_newlink(struct net *net, struct net_device *dev,
if (err < 0)
return err;
- dev_hold(real_dev);
-
macsec->nest_level = dev_get_nest_level(real_dev) + 1;
err = netdev_upper_dev_link(real_dev, dev, extack);
--
2.17.1
^ permalink raw reply related
* [PATCH net v2 07/11] macvlan: use dynamic lockdep key instead of subclass
From: Taehee Yoo @ 2019-09-07 13:46 UTC (permalink / raw)
To: davem, netdev, j.vosburgh, vfalico, andy, jiri, sd, roopa, saeedm,
manishc, rahulv, kys, haiyangz, sthemmin, sashal, hare, varun,
ubraun, kgraul, jay.vosburgh
Cc: ap420073
All macvlan devices share the same lockdep key, and the subclass is
initialized with nest_level.
But the actual nest_level value can change when a lower device is
attached. At that point the subclass should be updated, but doing so
seems to be unsafe.
So this patch makes macvlan use a dynamic lockdep key instead of the
subclass.
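Why a shared key with a stale subclass triggers the warning can be sketched with a toy validator. This is a heavily simplified user-space illustration, not how lockdep is implemented: struct toy_key, struct toy_lock and the toy_* helpers are hypothetical, and a lock class is reduced to the pair (key address, subclass).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_HELD 8

/* A key is identified by its address only. */
struct toy_key {
    char unused;
};

struct toy_lock {
    const struct toy_key *key;
    unsigned int subclass;
};

struct toy_task {
    const struct toy_lock *held[TOY_MAX_HELD];
    size_t nheld;
};

/* Returns false when a lock of the same class is already held, which
 * this toy validator reports as possible recursive locking. */
static bool toy_acquire(struct toy_task *task, const struct toy_lock *lock)
{
    size_t i;

    for (i = 0; i < task->nheld; i++) {
        if (task->held[i]->key == lock->key &&
            task->held[i]->subclass == lock->subclass)
            return false;
    }
    assert(task->nheld < TOY_MAX_HELD);
    task->held[task->nheld++] = lock;
    return true;
}

/* Take 'outer', then 'inner', as happens when one device's lock is held
 * while the lock of a device below it is taken. */
static bool toy_nested_ok(const struct toy_lock *outer,
                          const struct toy_lock *inner)
{
    struct toy_task task;

    task.nheld = 0;
    if (!toy_acquire(&task, outer))
        return false;
    return toy_acquire(&task, inner);
}
```

Two devices that share one static key and happen to carry the same subclass look like the same lock to the validator, while two devices with their own keys never collide, whatever the nesting depth is.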
Test commands:
ip link add bond0 type bond
ip link add dummy0 type dummy
ip link add macvlan0 link bond0 type macvlan mode bridge
ip link add macvlan1 link dummy0 type macvlan mode bridge
ip link set bond0 mtu 1000
ip link set macvlan1 master bond0
ip link set bond0 up
ip link set macvlan0 up
ip link set dummy0 up
ip link set macvlan1 up
Splat looks like:
[ 165.677603] ============================================
[ 165.679642] WARNING: possible recursive locking detected
[ 165.679642] 5.3.0-rc7+ #322 Not tainted
[ 165.679642] --------------------------------------------
[ 165.679642] ip/1812 is trying to acquire lock:
[ 165.679642] 00000000ae6a8a03 (&macvlan_netdev_addr_lock_key/1){+...}, at: dev_uc_sync_multiple+0xfa/0x1a0
[ 165.679642]
[ 165.679642] but task is already holding lock:
[ 165.679642] 00000000cec5da0b (&macvlan_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30
[ 165.679642]
[ 165.679642] other info that might help us debug this:
[ 165.679642] Possible unsafe locking scenario:
[ 165.679642]
[ 165.679642] CPU0
[ 165.679642] ----
[ 165.679642] lock(&macvlan_netdev_addr_lock_key/1);
[ 165.679642] lock(&macvlan_netdev_addr_lock_key/1);
[ 165.679642]
[ 165.679642] *** DEADLOCK ***
[ 165.679642]
[ 165.679642] May be due to missing lock nesting notation
[ 165.679642]
[ 165.679642] 4 locks held by ip/1812:
[ 165.679642] #0: 0000000088d10bd8 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[ 165.679642] #1: 00000000cec5da0b (&macvlan_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30
[ 165.679642] #2: 000000000ca6fdb5 (&dev_addr_list_lock_key/3){+...}, at: dev_uc_sync+0xfa/0x1a0
[ 165.679642] #3: 00000000dc1495a2 (rcu_read_lock){....}, at: bond_set_rx_mode+0x5/0x3c0 [bonding]
[ 165.679642]
[ 165.679642] stack backtrace:
[ 165.679642] CPU: 1 PID: 1812 Comm: ip Not tainted 5.3.0-rc7+ #322
[ 165.679642] Call Trace:
[ 165.679642] dump_stack+0x7c/0xbb
[ 165.679642] __lock_acquire+0x26a9/0x3de0
[ 165.679642] ? register_lock_class+0x14d0/0x14d0
[ 165.679642] ? mark_held_locks+0xa5/0xe0
[ 165.679642] ? trace_hardirqs_on_thunk+0x1a/0x20
[ 165.679642] ? register_lock_class+0x14d0/0x14d0
[ 165.679642] lock_acquire+0x164/0x3b0
[ 165.679642] ? dev_uc_sync_multiple+0xfa/0x1a0
[ 165.679642] _raw_spin_lock_nested+0x2e/0x60
[ 165.679642] ? dev_uc_sync_multiple+0xfa/0x1a0
[ 165.679642] dev_uc_sync_multiple+0xfa/0x1a0
[ 165.679642] bond_set_rx_mode+0x269/0x3c0 [bonding]
[ 165.679642] ? bond_init+0x6f0/0x6f0 [bonding]
[ 165.679642] dev_uc_sync+0x15a/0x1a0
[ 165.679642] macvlan_set_mac_lists+0x55/0x110 [macvlan]
[ 165.679642] dev_set_rx_mode+0x21/0x30
[ 165.679642] __dev_open+0x202/0x310
[ 165.679642] ? dev_set_rx_mode+0x30/0x30
[ 165.679642] ? mark_held_locks+0xa5/0xe0
[ 165.679642] ? __local_bh_enable_ip+0xe9/0x1b0
[ 165.679642] __dev_change_flags+0x3c3/0x500
[ 165.679642] ? dev_set_allmulti+0x10/0x10
[ 165.679642] dev_change_flags+0x7a/0x160
[ ...]
Fixes: c674ac30c549 ("macvlan: Fix lockdep warnings with stacked macvlan devices")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v1 -> v2 : this patch isn't changed
drivers/net/macvlan.c | 35 +++++++++++++++++++++++++++--------
include/linux/if_macvlan.h | 2 ++
2 files changed, 29 insertions(+), 8 deletions(-)
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 940192c057b6..dae368a2e8d1 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -852,8 +852,6 @@ static int macvlan_do_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
* "super class" of normal network devices; split their locks off into a
* separate class since they always nest.
*/
-static struct lock_class_key macvlan_netdev_addr_lock_key;
-
#define ALWAYS_ON_OFFLOADS \
(NETIF_F_SG | NETIF_F_HW_CSUM | NETIF_F_GSO_SOFTWARE | \
NETIF_F_GSO_ROBUST | NETIF_F_GSO_ENCAP_ALL)
@@ -874,12 +872,30 @@ static int macvlan_get_nest_level(struct net_device *dev)
return ((struct macvlan_dev *)netdev_priv(dev))->nest_level;
}
-static void macvlan_set_lockdep_class(struct net_device *dev)
+static void macvlan_dev_set_lockdep_one(struct net_device *dev,
+ struct netdev_queue *txq,
+ void *_unused)
+{
+ struct macvlan_dev *macvlan = netdev_priv(dev);
+
+ lockdep_set_class(&txq->_xmit_lock, &macvlan->xmit_lock_key);
+}
+
+static struct lock_class_key qdisc_tx_busylock_key;
+static struct lock_class_key qdisc_running_key;
+
+static void macvlan_dev_set_lockdep_class(struct net_device *dev)
{
- netdev_lockdep_set_classes(dev);
- lockdep_set_class_and_subclass(&dev->addr_list_lock,
- &macvlan_netdev_addr_lock_key,
- macvlan_get_nest_level(dev));
+ struct macvlan_dev *macvlan = netdev_priv(dev);
+
+ dev->qdisc_tx_busylock = &qdisc_tx_busylock_key;
+ dev->qdisc_running_key = &qdisc_running_key;
+
+ lockdep_register_key(&macvlan->addr_lock_key);
+ lockdep_set_class(&dev->addr_list_lock, &macvlan->addr_lock_key);
+
+ lockdep_register_key(&macvlan->xmit_lock_key);
+ netdev_for_each_tx_queue(dev, macvlan_dev_set_lockdep_one, NULL);
}
static int macvlan_init(struct net_device *dev)
@@ -900,7 +916,7 @@ static int macvlan_init(struct net_device *dev)
dev->gso_max_segs = lowerdev->gso_max_segs;
dev->hard_header_len = lowerdev->hard_header_len;
- macvlan_set_lockdep_class(dev);
+ macvlan_dev_set_lockdep_class(dev);
vlan->pcpu_stats = netdev_alloc_pcpu_stats(struct vlan_pcpu_stats);
if (!vlan->pcpu_stats)
@@ -922,6 +938,9 @@ static void macvlan_uninit(struct net_device *dev)
port->count -= 1;
if (!port->count)
macvlan_port_destroy(port->dev);
+
+ lockdep_unregister_key(&vlan->addr_lock_key);
+ lockdep_unregister_key(&vlan->xmit_lock_key);
}
static void macvlan_dev_get_stats64(struct net_device *dev,
diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h
index 2e55e4cdbd8a..ea5b41823287 100644
--- a/include/linux/if_macvlan.h
+++ b/include/linux/if_macvlan.h
@@ -31,6 +31,8 @@ struct macvlan_dev {
u16 flags;
int nest_level;
unsigned int macaddr_count;
+ struct lock_class_key xmit_lock_key;
+ struct lock_class_key addr_lock_key;
#ifdef CONFIG_NET_POLL_CONTROLLER
struct netpoll *netpoll;
#endif
--
2.17.1
^ permalink raw reply related
* [PATCH net v2 06/11] macsec: use dynamic lockdep key instead of subclass
From: Taehee Yoo @ 2019-09-07 13:46 UTC (permalink / raw)
To: davem, netdev, j.vosburgh, vfalico, andy, jiri, sd, roopa, saeedm,
manishc, rahulv, kys, haiyangz, sthemmin, sashal, hare, varun,
ubraun, kgraul, jay.vosburgh
Cc: ap420073
All macsec devices share the same lockdep key, and the subclass is
initialized with nest_level.
But the actual nest_level value can change when a lower device is
attached. At that point the subclass should be updated, but doing so
seems to be unsafe.
So this patch makes macsec use a dynamic lockdep key instead of the
subclass.
Test commands:
ip link add bond0 type bond
ip link add dummy0 type dummy
ip link add macsec0 link bond0 type macsec
ip link add macsec1 link dummy0 type macsec
ip link set bond0 mtu 1000
ip link set macsec1 master bond0
ip link set bond0 up
ip link set macsec0 up
ip link set dummy0 up
ip link set macsec1 up
Splat looks like:
[ 146.540123] ============================================
[ 146.540123] WARNING: possible recursive locking detected
[ 146.540123] 5.3.0-rc7+ #322 Not tainted
[ 146.540123] --------------------------------------------
[ 146.540123] ip/1340 is trying to acquire lock:
[ 146.540123] 00000000446fd8bd (&macsec_netdev_addr_lock_key/1){+...}, at: dev_uc_sync_multiple+0xfa/0x1a0
[ 146.540123]
[ 146.540123] but task is already holding lock:
[ 146.540123] 00000000a9ab6378 (&macsec_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30
[ 146.540123]
[ 146.540123] other info that might help us debug this:
[ 146.540123] Possible unsafe locking scenario:
[ 146.540123]
[ 146.540123] CPU0
[ 146.540123] ----
[ 146.540123] lock(&macsec_netdev_addr_lock_key/1);
[ 146.540123] lock(&macsec_netdev_addr_lock_key/1);
[ 146.623155]
[ 146.623155] *** DEADLOCK ***
[ 146.623155]
[ 146.623155] May be due to missing lock nesting notation
[ 146.623155]
[ 146.623155] 4 locks held by ip/1340:
[ 146.623155] #0: 0000000026436ef0 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[ 146.623155] #1: 00000000a9ab6378 (&macsec_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30
[ 146.623155] #2: 00000000a8947dd0 (&dev_addr_list_lock_key/3){+...}, at: dev_mc_sync+0xfa/0x1a0
[ 146.623155] #3: 00000000b62011e9 (rcu_read_lock){....}, at: bond_set_rx_mode+0x5/0x3c0 [bonding]
[ 146.674970]
[ 146.674970] stack backtrace:
[ 146.687145] CPU: 0 PID: 1340 Comm: ip Not tainted 5.3.0-rc7+ #322
[ 146.693024] Call Trace:
[ 146.693024] dump_stack+0x7c/0xbb
[ 146.693024] __lock_acquire+0x26a9/0x3de0
[ 146.693024] ? register_lock_class+0x14d0/0x14d0
[ 146.693024] ? register_lock_class+0x14d0/0x14d0
[ 146.693024] lock_acquire+0x164/0x3b0
[ 146.693024] ? dev_uc_sync_multiple+0xfa/0x1a0
[ 146.693024] _raw_spin_lock_nested+0x2e/0x60
[ 146.693024] ? dev_uc_sync_multiple+0xfa/0x1a0
[ 146.693024] dev_uc_sync_multiple+0xfa/0x1a0
[ 146.693024] bond_set_rx_mode+0x269/0x3c0 [bonding]
[ 146.751163] ? bond_init+0x6f0/0x6f0 [bonding]
[ 146.757006] ? do_raw_spin_trylock+0xa9/0x170
[ 146.757006] dev_mc_sync+0x15a/0x1a0
[ 146.757006] macsec_dev_set_rx_mode+0x3a/0x50 [macsec]
[ 146.757006] dev_set_rx_mode+0x21/0x30
[ 146.757006] __dev_open+0x202/0x310
[ 146.757006] ? dev_set_rx_mode+0x30/0x30
[ 146.757006] ? mark_held_locks+0xa5/0xe0
[ 146.757006] ? __local_bh_enable_ip+0xe9/0x1b0
[ 146.757006] __dev_change_flags+0x3c3/0x500
[ 146.757006] ? dev_set_allmulti+0x10/0x10
[ 146.757006] ? sched_clock_local+0xd4/0x140
[ 146.757006] ? check_chain_key+0x236/0x5d0
[ 146.757006] dev_change_flags+0x7a/0x160
[ 146.757006] do_setlink+0xa26/0x2f20
[ 146.757006] ? sched_clock_local+0xd4/0x140
[ ... ]
Fixes: e20038724552 ("macsec: fix lockdep splats when nesting devices")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v1 -> v2 : this patch isn't changed
drivers/net/macsec.c | 37 ++++++++++++++++++++++++++++++++-----
1 file changed, 32 insertions(+), 5 deletions(-)
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index 8f46aa1ddec0..25a4fc88145d 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -267,6 +267,8 @@ struct macsec_dev {
struct pcpu_secy_stats __percpu *stats;
struct list_head secys;
struct gro_cells gro_cells;
+ struct lock_class_key xmit_lock_key;
+ struct lock_class_key addr_lock_key;
unsigned int nest_level;
};
@@ -2749,7 +2751,32 @@ static netdev_tx_t macsec_start_xmit(struct sk_buff *skb,
#define MACSEC_FEATURES \
(NETIF_F_SG | NETIF_F_HIGHDMA | NETIF_F_FRAGLIST)
-static struct lock_class_key macsec_netdev_addr_lock_key;
+
+static void macsec_dev_set_lockdep_one(struct net_device *dev,
+ struct netdev_queue *txq,
+ void *_unused)
+{
+ struct macsec_dev *macsec = macsec_priv(dev);
+
+ lockdep_set_class(&txq->_xmit_lock, &macsec->xmit_lock_key);
+}
+
+static struct lock_class_key qdisc_tx_busylock_key;
+static struct lock_class_key qdisc_running_key;
+
+static void macsec_dev_set_lockdep_class(struct net_device *dev)
+{
+ struct macsec_dev *macsec = macsec_priv(dev);
+
+ dev->qdisc_tx_busylock = &qdisc_tx_busylock_key;
+ dev->qdisc_running_key = &qdisc_running_key;
+
+ lockdep_register_key(&macsec->addr_lock_key);
+ lockdep_set_class(&dev->addr_list_lock, &macsec->addr_lock_key);
+
+ lockdep_register_key(&macsec->xmit_lock_key);
+ netdev_for_each_tx_queue(dev, macsec_dev_set_lockdep_one, NULL);
+}
static int macsec_dev_init(struct net_device *dev)
{
@@ -2780,6 +2807,7 @@ static int macsec_dev_init(struct net_device *dev)
if (is_zero_ether_addr(dev->broadcast))
memcpy(dev->broadcast, real_dev->broadcast, dev->addr_len);
+ macsec_dev_set_lockdep_class(dev);
return 0;
}
@@ -2789,6 +2817,9 @@ static void macsec_dev_uninit(struct net_device *dev)
gro_cells_destroy(&macsec->gro_cells);
free_percpu(dev->tstats);
+
+ lockdep_unregister_key(&macsec->addr_lock_key);
+ lockdep_unregister_key(&macsec->xmit_lock_key);
}
static netdev_features_t macsec_fix_features(struct net_device *dev,
@@ -3263,10 +3294,6 @@ static int macsec_newlink(struct net *net, struct net_device *dev,
dev_hold(real_dev);
macsec->nest_level = dev_get_nest_level(real_dev) + 1;
- netdev_lockdep_set_classes(dev);
- lockdep_set_class_and_subclass(&dev->addr_list_lock,
- &macsec_netdev_addr_lock_key,
- macsec_get_nest_level(dev));
err = netdev_upper_dev_link(real_dev, dev, extack);
if (err < 0)
--
2.17.1
* [PATCH net v2 05/11] team: use dynamic lockdep key instead of static key
From: Taehee Yoo @ 2019-09-07 13:46 UTC (permalink / raw)
To: davem, netdev, j.vosburgh, vfalico, andy, jiri, sd, roopa, saeedm,
manishc, rahulv, kys, haiyangz, sthemmin, sashal, hare, varun,
ubraun, kgraul, jay.vosburgh
Cc: ap420073
In the current code, all team devices share the same static lockdep key.
Because team devices can be nested, this triggers an unnecessary
lockdep warning.
Test commands:
ip link add team0 type team
for i in {1..7}
do
let A=$i-1
ip link add team$i type team
ip link set team$i master team$A
done
ip link del team0
Splat looks like:
[ 137.406730] ============================================
[ 137.412685] WARNING: possible recursive locking detected
[ 137.418642] 5.3.0-rc7+ #322 Not tainted
[ 137.422941] --------------------------------------------
[ 137.428886] ip/1383 is trying to acquire lock:
[ 137.433869] 0000000089571080 (&dev_addr_list_lock_key/1){+...}, at: dev_uc_sync_multiple+0xfa/0x1a0
[ 137.444034]
[ 137.444034] but task is already holding lock:
[ 137.450572] 00000000d9597252 (&dev_addr_list_lock_key/1){+...}, at: dev_uc_unsync+0x10c/0x1b0
[ 137.460142]
[ 137.460142] other info that might help us debug this:
[ 137.467458] Possible unsafe locking scenario:
[ 137.467458]
[ 137.474096] CPU0
[ 137.476828] ----
[ 137.479569] lock(&dev_addr_list_lock_key/1);
[ 137.484554] lock(&dev_addr_list_lock_key/1);
[ 137.489539]
[ 137.489539] *** DEADLOCK ***
[ 137.489539]
[ 137.496178] May be due to missing lock nesting notation
[ 137.496178]
[ 137.503789] 5 locks held by ip/1383:
[ 137.507797] #0: 00000000d497f415 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[ 137.516786] #1: 000000008e4b4656 (&team->lock){+.+.}, at: team_uninit+0x3a/0x1a0 [team]
[ 137.525882] #2: 000000005cf248d1 (&dev_addr_list_lock_key){+...}, at: dev_uc_unsync+0x98/0x1b0
[ 137.535649] #3: 00000000d9597252 (&dev_addr_list_lock_key/1){+...}, at: dev_uc_unsync+0x10c/0x1b0
[ 137.545709] #4: 00000000bec134c3 (rcu_read_lock){....}, at: team_set_rx_mode+0x5/0x1d0 [team]
[ 137.555384]
[ 137.555384] stack backtrace:
[ 137.560277] CPU: 0 PID: 1383 Comm: ip Not tainted 5.3.0-rc7+ #322
[ 137.577826] Call Trace:
[ 137.580586] dump_stack+0x7c/0xbb
[ 137.584307] __lock_acquire+0x26a9/0x3de0
[ 137.588820] ? register_lock_class+0x14d0/0x14d0
[ 137.594008] ? register_lock_class+0x14d0/0x14d0
[ 137.599194] lock_acquire+0x164/0x3b0
[ 137.603310] ? dev_uc_sync_multiple+0xfa/0x1a0
[ 137.608307] _raw_spin_lock_nested+0x2e/0x60
[ 137.613105] ? dev_uc_sync_multiple+0xfa/0x1a0
[ 137.618095] dev_uc_sync_multiple+0xfa/0x1a0
[ 137.622900] team_set_rx_mode+0xa9/0x1d0 [team]
[ 137.627993] dev_uc_unsync+0x151/0x1b0
[ 137.632205] team_port_del+0x304/0x790 [team]
[ 137.637110] team_uninit+0xb0/0x1a0 [team]
[ 137.641717] rollback_registered_many+0x728/0xda0
[ 137.647005] ? generic_xdp_install+0x310/0x310
[ 137.651994] ? __set_pages_p+0xf4/0x150
[ 137.656306] ? check_chain_key+0x236/0x5d0
[ 137.660914] ? __nla_validate_parse+0x98/0x1ad0
[ 137.666006] unregister_netdevice_many.part.120+0x13/0x1b0
[ 137.672167] rtnl_delete_link+0xbc/0x100
[ 137.676575] ? rtnl_af_register+0xc0/0xc0
[ 137.681084] rtnl_dellink+0x2e7/0x870
[ 137.685204] ? find_held_lock+0x39/0x1d0
[ ... ]
Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v1 -> v2 : this patch isn't changed
drivers/net/team/team.c | 61 ++++++++++++++++++++++++++++++++++++++---
include/linux/if_team.h | 5 ++++
2 files changed, 62 insertions(+), 4 deletions(-)
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index e8089def5a46..bfcd6ed57493 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1607,6 +1607,34 @@ static const struct team_option team_options[] = {
},
};
+static void team_dev_set_lockdep_one(struct net_device *dev,
+ struct netdev_queue *txq,
+ void *_unused)
+{
+ struct team *team = netdev_priv(dev);
+
+ lockdep_set_class(&txq->_xmit_lock, &team->xmit_lock_key);
+}
+
+static struct lock_class_key qdisc_tx_busylock_key;
+static struct lock_class_key qdisc_running_key;
+
+static void team_dev_set_lockdep_class(struct net_device *dev)
+{
+ struct team *team = netdev_priv(dev);
+
+ dev->qdisc_tx_busylock = &qdisc_tx_busylock_key;
+ dev->qdisc_running_key = &qdisc_running_key;
+
+ lockdep_register_key(&team->team_lock_key);
+ __mutex_init(&team->lock, "team->team_lock_key", &team->team_lock_key);
+
+ lockdep_register_key(&team->addr_lock_key);
+ lockdep_set_class(&dev->addr_list_lock, &team->addr_lock_key);
+
+ lockdep_register_key(&team->xmit_lock_key);
+ netdev_for_each_tx_queue(dev, team_dev_set_lockdep_one, NULL);
+}
static int team_init(struct net_device *dev)
{
@@ -1615,7 +1643,6 @@ static int team_init(struct net_device *dev)
int err;
team->dev = dev;
- mutex_init(&team->lock);
team_set_no_mode(team);
team->pcpu_stats = netdev_alloc_pcpu_stats(struct team_pcpu_stats);
@@ -1642,7 +1669,7 @@ static int team_init(struct net_device *dev)
goto err_options_register;
netif_carrier_off(dev);
- netdev_lockdep_set_classes(dev);
+ team_dev_set_lockdep_class(dev);
return 0;
@@ -1673,6 +1700,11 @@ static void team_uninit(struct net_device *dev)
team_queue_override_fini(team);
mutex_unlock(&team->lock);
netdev_change_features(dev);
+
+ lockdep_unregister_key(&team->team_lock_key);
+ lockdep_unregister_key(&team->addr_lock_key);
+ lockdep_unregister_key(&team->xmit_lock_key);
+
}
static void team_destructor(struct net_device *dev)
@@ -1967,6 +1999,23 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
return err;
}
+static void team_update_lock_key(struct net_device *dev)
+{
+ struct team *team = netdev_priv(dev);
+
+ lockdep_unregister_key(&team->team_lock_key);
+ lockdep_unregister_key(&team->addr_lock_key);
+ lockdep_unregister_key(&team->xmit_lock_key);
+
+ lockdep_register_key(&team->team_lock_key);
+ lockdep_register_key(&team->addr_lock_key);
+ lockdep_register_key(&team->xmit_lock_key);
+
+ lockdep_set_class(&team->lock, &team->team_lock_key);
+ lockdep_set_class(&dev->addr_list_lock, &team->addr_lock_key);
+ netdev_for_each_tx_queue(dev, team_dev_set_lockdep_one, NULL);
+}
+
static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
{
struct team *team = netdev_priv(dev);
@@ -1976,8 +2025,12 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
err = team_port_del(team, port_dev);
mutex_unlock(&team->lock);
- if (!err)
- netdev_change_features(dev);
+ if (err)
+ return err;
+
+ if (netif_is_team_master(port_dev))
+ team_update_lock_key(port_dev);
+ netdev_change_features(dev);
return err;
}
diff --git a/include/linux/if_team.h b/include/linux/if_team.h
index 06faa066496f..9c97bb19ed34 100644
--- a/include/linux/if_team.h
+++ b/include/linux/if_team.h
@@ -223,6 +223,11 @@ struct team {
atomic_t count_pending;
struct delayed_work dw;
} mcast_rejoin;
+
+ struct lock_class_key team_lock_key;
+ struct lock_class_key xmit_lock_key;
+ struct lock_class_key addr_lock_key;
+
long mode_priv[TEAM_MODE_PRIV_LONGS];
};
--
2.17.1
* [PATCH net v2 04/11] bonding: use dynamic lockdep key instead of subclass
From: Taehee Yoo @ 2019-09-07 13:46 UTC (permalink / raw)
To: davem, netdev, j.vosburgh, vfalico, andy, jiri, sd, roopa, saeedm,
manishc, rahulv, kys, haiyangz, sthemmin, sashal, hare, varun,
ubraun, kgraul, jay.vosburgh
Cc: ap420073
All bonding devices share the same lockdep key, and the subclass is
initialized with nest_level.
However, the actual nest_level value can change when a lower device is
attached. At that point the subclass should be updated, but updating it
seems to be unsafe.
So this patch makes bonding use a dynamic lockdep key instead of the
subclass.
Test commands:
ip link add bond0 type bond
for i in {1..5}
do
let A=$i-1
ip link add bond$i type bond
ip link set bond$i master bond$A
done
ip link set bond5 master bond0
Splat looks like:
[ 327.477830] ============================================
[ 327.477830] WARNING: possible recursive locking detected
[ 327.477830] 5.3.0-rc7+ #322 Not tainted
[ 327.477830] --------------------------------------------
[ 327.477830] ip/1399 is trying to acquire lock:
[ 327.477830] 00000000f604be63 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding]
[ 327.477830]
[ 327.477830] but task is already holding lock:
[ 327.477830] 00000000e9d31238 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding]
[ 327.477830]
[ 327.477830] other info that might help us debug this:
[ 327.477830] Possible unsafe locking scenario:
[ 327.477830]
[ 327.477830] CPU0
[ 327.477830] ----
[ 327.477830] lock(&(&bond->stats_lock)->rlock#2/2);
[ 327.477830] lock(&(&bond->stats_lock)->rlock#2/2);
[ 327.477830]
[ 327.477830] *** DEADLOCK ***
[ 327.477830]
[ 327.477830] May be due to missing lock nesting notation
[ 327.477830]
[ 327.477830] 3 locks held by ip/1399:
[ 327.477830] #0: 00000000a762c4e3 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[ 327.477830] #1: 00000000e9d31238 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding]
[ 327.477830] #2: 000000008f7ebff4 (rcu_read_lock){....}, at: bond_get_stats+0x9f/0x500 [bonding]
[ 327.477830]
[ 327.477830] stack backtrace:
[ 327.477830] CPU: 0 PID: 1399 Comm: ip Not tainted 5.3.0-rc7+ #322
[ 327.477830] Call Trace:
[ 327.477830] dump_stack+0x7c/0xbb
[ 327.477830] __lock_acquire+0x26a9/0x3de0
[ 327.477830] ? __change_page_attr_set_clr+0x133b/0x1d20
[ 327.477830] ? register_lock_class+0x14d0/0x14d0
[ 327.477830] lock_acquire+0x164/0x3b0
[ 327.477830] ? bond_get_stats+0xb8/0x500 [bonding]
[ 327.666914] _raw_spin_lock_nested+0x2e/0x60
[ 327.666914] ? bond_get_stats+0xb8/0x500 [bonding]
[ 327.678302] bond_get_stats+0xb8/0x500 [bonding]
[ 327.678302] ? bond_arp_rcv+0xf10/0xf10 [bonding]
[ 327.678302] ? register_lock_class+0x14d0/0x14d0
[ 327.678302] ? bond_get_stats+0xb8/0x500 [bonding]
[ 327.678302] dev_get_stats+0x1ec/0x270
[ 327.678302] bond_get_stats+0x1d1/0x500 [bonding]
[ 327.678302] ? lock_acquire+0x164/0x3b0
[ 327.678302] ? bond_arp_rcv+0xf10/0xf10 [bonding]
[ 327.678302] ? rtnl_is_locked+0x16/0x30
[ 327.678302] ? devlink_compat_switch_id_get+0x18/0x140
[ 327.678302] ? dev_get_alias+0xe2/0x190
[ 327.731145] ? dev_get_port_parent_id+0x12a/0x340
[ 327.731145] ? rtnl_phys_switch_id_fill+0x88/0xe0
[ 327.731145] dev_get_stats+0x1ec/0x270
[ 327.731145] rtnl_fill_stats+0x44/0xbe0
[ 327.731145] ? nla_put+0xc2/0x140
[ ... ]
Fixes: d3fff6c443fe ("net: add netdev_lockdep_set_classes() helper")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v1 -> v2 : this patch isn't changed
drivers/net/bonding/bond_main.c | 61 ++++++++++++++++++++++++++++++---
include/net/bonding.h | 3 ++
2 files changed, 59 insertions(+), 5 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 0db12fcfc953..7f574e74ed78 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1857,6 +1857,32 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
return res;
}
+static void bond_dev_set_lockdep_one(struct net_device *dev,
+ struct netdev_queue *txq,
+ void *_unused)
+{
+ struct bonding *bond = netdev_priv(dev);
+
+ lockdep_set_class(&txq->_xmit_lock, &bond->xmit_lock_key);
+}
+
+static void bond_update_lock_key(struct net_device *dev)
+{
+ struct bonding *bond = netdev_priv(dev);
+
+ lockdep_unregister_key(&bond->stats_lock_key);
+ lockdep_unregister_key(&bond->addr_lock_key);
+ lockdep_unregister_key(&bond->xmit_lock_key);
+
+ lockdep_register_key(&bond->stats_lock_key);
+ lockdep_register_key(&bond->addr_lock_key);
+ lockdep_register_key(&bond->xmit_lock_key);
+
+ lockdep_set_class(&bond->stats_lock, &bond->stats_lock_key);
+ lockdep_set_class(&dev->addr_list_lock, &bond->addr_lock_key);
+ netdev_for_each_tx_queue(dev, bond_dev_set_lockdep_one, NULL);
+}
+
/* Try to release the slave device <slave> from the bond device <master>
* It is legal to access curr_active_slave without a lock because all the function
* is RTNL-locked. If "all" is true it means that the function is being called
@@ -2022,6 +2048,8 @@ static int __bond_release_one(struct net_device *bond_dev,
slave_dev->priv_flags &= ~IFF_BONDING;
bond_free_slave(slave);
+ if (netif_is_bond_master(slave_dev))
+ bond_update_lock_key(slave_dev);
return 0;
}
@@ -3459,7 +3487,7 @@ static void bond_get_stats(struct net_device *bond_dev,
struct list_head *iter;
struct slave *slave;
- spin_lock_nested(&bond->stats_lock, bond_get_nest_level(bond_dev));
+ spin_lock(&bond->stats_lock);
memcpy(stats, &bond->bond_stats, sizeof(*stats));
rcu_read_lock();
@@ -4297,8 +4325,6 @@ void bond_setup(struct net_device *bond_dev)
{
struct bonding *bond = netdev_priv(bond_dev);
- spin_lock_init(&bond->mode_lock);
- spin_lock_init(&bond->stats_lock);
bond->params = bonding_defaults;
/* Initialize pointers */
@@ -4367,6 +4393,9 @@ static void bond_uninit(struct net_device *bond_dev)
list_del(&bond->bond_list);
+ lockdep_unregister_key(&bond->stats_lock_key);
+ lockdep_unregister_key(&bond->addr_lock_key);
+ lockdep_unregister_key(&bond->xmit_lock_key);
bond_debug_unregister(bond);
}
@@ -4758,6 +4787,29 @@ static int bond_check_params(struct bond_params *params)
return 0;
}
+static struct lock_class_key qdisc_tx_busylock_key;
+static struct lock_class_key qdisc_running_key;
+
+static void bond_dev_set_lockdep_class(struct net_device *dev)
+{
+ struct bonding *bond = netdev_priv(dev);
+
+ dev->qdisc_tx_busylock = &qdisc_tx_busylock_key;
+ dev->qdisc_running_key = &qdisc_running_key;
+
+ spin_lock_init(&bond->mode_lock);
+
+ spin_lock_init(&bond->stats_lock);
+ lockdep_register_key(&bond->stats_lock_key);
+ lockdep_set_class(&bond->stats_lock, &bond->stats_lock_key);
+
+ lockdep_register_key(&bond->addr_lock_key);
+ lockdep_set_class(&dev->addr_list_lock, &bond->addr_lock_key);
+
+ lockdep_register_key(&bond->xmit_lock_key);
+ netdev_for_each_tx_queue(dev, bond_dev_set_lockdep_one, NULL);
+}
+
/* Called from registration process */
static int bond_init(struct net_device *bond_dev)
{
@@ -4771,8 +4823,7 @@ static int bond_init(struct net_device *bond_dev)
return -ENOMEM;
bond->nest_level = SINGLE_DEPTH_NESTING;
- netdev_lockdep_set_classes(bond_dev);
-
+ bond_dev_set_lockdep_class(bond_dev);
list_add_tail(&bond->bond_list, &bn->dev_list);
bond_prepare_sysfs_group(bond);
diff --git a/include/net/bonding.h b/include/net/bonding.h
index f7fe45689142..c39ac7061e41 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -239,6 +239,9 @@ struct bonding {
struct dentry *debug_dir;
#endif /* CONFIG_DEBUG_FS */
struct rtnl_link_stats64 bond_stats;
+ struct lock_class_key stats_lock_key;
+ struct lock_class_key xmit_lock_key;
+ struct lock_class_key addr_lock_key;
};
#define bond_slave_get_rcu(dev) \
--
2.17.1
* [PATCH net v2 03/11] bonding: fix unexpected IFF_BONDING bit unset
From: Taehee Yoo @ 2019-09-07 13:46 UTC (permalink / raw)
To: davem, netdev, j.vosburgh, vfalico, andy, jiri, sd, roopa, saeedm,
manishc, rahulv, kys, haiyangz, sthemmin, sashal, hare, varun,
ubraun, kgraul, jay.vosburgh
Cc: ap420073
IFF_BONDING means that the device is a bonding master or a bonding slave.
->ndo_add_slave() sets the IFF_BONDING flag and ->ndo_del_slave() unsets
it.
bond0<--bond1
Both bond0 and bond1 are bonding devices, and they should keep the
IFF_BONDING flag until they are removed.
But bond1 would lose IFF_BONDING at ->ndo_del_slave() because that routine
does not check whether the slave device is itself a bonding device.
This patch adds an interface type check before removing the
IFF_BONDING flag.
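The flag handling described above can be illustrated with a small userspace
sketch. It is not kernel code: struct flag_dev, the TOY_* bit values and the
toy_* helpers are hypothetical stand-ins, and toy_is_bond_master() only
mirrors the idea behind netif_is_bond_master() (a device that is both a
master and carries the bonding flag).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values for illustration; not the kernel's constants. */
#define TOY_IFF_MASTER  0x1u
#define TOY_IFF_BONDING 0x2u

/* Toy stand-in for the two flag words of a network device. */
struct flag_dev {
	unsigned int flags;      /* holds TOY_IFF_MASTER for a bond device */
	unsigned int priv_flags; /* holds TOY_IFF_BONDING for bond or slave */
};

/* A bond master is a master device that also carries the bonding flag. */
static bool toy_is_bond_master(const struct flag_dev *dev)
{
	return (dev->flags & TOY_IFF_MASTER) &&
	       (dev->priv_flags & TOY_IFF_BONDING);
}

/* Enslaving marks the slave with the bonding flag. */
static void toy_enslave(struct flag_dev *slave)
{
	assert(slave != NULL);
	slave->priv_flags |= TOY_IFF_BONDING;
}

/* Old behaviour: the flag is cleared unconditionally on release. */
static void toy_release_old(struct flag_dev *slave)
{
	assert(slave != NULL);
	slave->priv_flags &= ~TOY_IFF_BONDING;
}

/* Patched behaviour: a slave that is itself a bond keeps the flag. */
static void toy_release_fixed(struct flag_dev *slave)
{
	assert(slave != NULL);
	if (!toy_is_bond_master(slave))
		slave->priv_flags &= ~TOY_IFF_BONDING;
}
```

With a device initialized as { TOY_IFF_MASTER, TOY_IFF_BONDING } (a nested
bond such as bond1), toy_release_old() drops the bonding flag while
toy_release_fixed() keeps it. A plain device initialized as { 0, 0 } loses
the flag on release in both variants.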
Test commands:
ip link add bond0 type bond
ip link add bond1 type bond
ip link set bond1 master bond0
ip link set bond1 nomaster
ip link del bond1 type bond
ip link add bond1 type bond
Splat looks like:
[ 149.201107] proc_dir_entry 'bonding/bond1' already registered
[ 149.208013] WARNING: CPU: 1 PID: 1308 at fs/proc/generic.c:361 proc_register+0x2a9/0x3e0
[ 149.208866] Modules linked in: bonding veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv4 ip_tables6
[ 149.208866] CPU: 1 PID: 1308 Comm: ip Not tainted 5.3.0-rc7+ #322
[ 149.208866] RIP: 0010:proc_register+0x2a9/0x3e0
[ 149.208866] Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 39 01 00 00 48 8b 04 24 48 89 ea 48 c7 c7 a0 a0 13 89 48 8b b0 0
[ 149.208866] RSP: 0018:ffff88810df9f098 EFLAGS: 00010286
[ 149.208866] RAX: dffffc0000000008 RBX: ffff8880b5d3aa50 RCX: ffffffff87cdec92
[ 149.208866] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff888116bf6a8c
[ 149.208866] RBP: ffff8880b5d3acd3 R08: ffffed1022d7ff71 R09: ffffed1022d7ff71
[ 149.208866] R10: 0000000000000001 R11: ffffed1022d7ff70 R12: ffff8880b5d3abe8
[ 149.208866] R13: ffff8880b5d3acd2 R14: dffffc0000000000 R15: ffffed1016ba759a
[ 149.208866] FS: 00007f4bd1f650c0(0000) GS:ffff888116a00000(0000) knlGS:0000000000000000
[ 149.208866] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 149.208866] CR2: 000055e7ca686118 CR3: 0000000106fd4000 CR4: 00000000001006e0
[ 149.208866] Call Trace:
[ 149.208866] proc_create_seq_private+0xb3/0xf0
[ 149.208866] bond_create_proc_entry+0x1b3/0x3f0 [bonding]
[ 149.208866] bond_netdev_event+0x433/0x970 [bonding]
[ 149.208866] ? __module_text_address+0x13/0x140
[ 149.208866] notifier_call_chain+0x90/0x160
[ 149.208866] register_netdevice+0x9b3/0xd70
[ 149.208866] ? alloc_netdev_mqs+0x854/0xc10
[ 149.208866] ? netdev_change_features+0xa0/0xa0
[ 149.208866] ? rtnl_create_link+0x2ed/0xad0
[ 149.208866] bond_newlink+0x2a/0x60 [bonding]
[ 149.208866] __rtnl_newlink+0xb75/0x1180
[ ... ]
Fixes: 0b680e753724 ("[PATCH] bonding: Add priv_flag to avoid event mishandling")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v1 -> v2: do not add a new priv_flag.
drivers/net/bonding/bond_main.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 931d9d935686..0db12fcfc953 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1816,7 +1816,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
slave_disable_netpoll(new_slave);
err_close:
- slave_dev->priv_flags &= ~IFF_BONDING;
+ if (!netif_is_bond_master(slave_dev))
+ slave_dev->priv_flags &= ~IFF_BONDING;
dev_close(slave_dev);
err_restore_mac:
@@ -2017,7 +2018,8 @@ static int __bond_release_one(struct net_device *bond_dev,
else
dev_set_mtu(slave_dev, slave->original_mtu);
- slave_dev->priv_flags &= ~IFF_BONDING;
+ if (!netif_is_bond_master(slave_dev))
+ slave_dev->priv_flags &= ~IFF_BONDING;
bond_free_slave(slave);
--
2.17.1
* [PATCH net v2 02/11] vlan: use dynamic lockdep key instead of subclass
From: Taehee Yoo @ 2019-09-07 13:45 UTC (permalink / raw)
To: davem, netdev, j.vosburgh, vfalico, andy, jiri, sd, roopa, saeedm,
manishc, rahulv, kys, haiyangz, sthemmin, sashal, hare, varun,
ubraun, kgraul, jay.vosburgh
Cc: ap420073
All VLAN devices share the same lockdep key, and the subclass is
initialized with nest_level.
However, the actual nest_level value can change when a lower device is
attached. At that point the subclass should be updated, but updating it
seems to be unsafe.
So this patch makes VLAN use a dynamic lockdep key instead of the subclass.
Test commands:
ip link add dummy0 type dummy
ip link set dummy0 up
ip link add bond0 type bond
ip link add vlan_dummy1 link dummy0 type vlan id 1
ip link add vlan_bond1 link bond0 type vlan id 2
ip link set vlan_dummy1 master bond0
ip link set bond0 up
ip link set vlan_dummy1 up
ip link set vlan_bond1 up
Both vlan_dummy1 and vlan_bond1 have the same subclass, which produces an
unnecessary deadlock warning.
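Why two distinct devices can trip the same-class check is easier to see in
a toy model. The sketch below is plain userspace C and is far simpler than
lockdep itself: it assumes a lock's class is just the pair (key address,
subclass) and that taking a lock whose class is already held is reported.
All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Like a lock class key, a toy key is identified only by its address. */
struct toy_key {
	char unused;
};

/* A lock instance belongs to the class (key, subclass). */
struct toy_lock {
	const struct toy_key *key;
	unsigned int subclass;
};

#define TOY_MAX_HELD 16

/* Locks currently held by one task. */
struct toy_task {
	const struct toy_lock *held[TOY_MAX_HELD];
	size_t nheld;
};

/*
 * Record the acquisition and return true, or return false when a lock of
 * the same class is already held (the "possible recursive locking" case).
 */
static bool toy_acquire(struct toy_task *task, const struct toy_lock *lock)
{
	size_t i;

	assert(task->nheld < TOY_MAX_HELD);

	for (i = 0; i < task->nheld; i++) {
		if (task->held[i]->key == lock->key &&
		    task->held[i]->subclass == lock->subclass)
			return false;
	}

	task->held[task->nheld++] = lock;
	return true;
}
```

Two locks that point at one shared key with subclass 1 collide on the second
toy_acquire(), even though they are different lock instances. Giving each
device its own key, as the dynamic-key approach does, makes both
acquisitions succeed in this model.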
Splat looks like:
[ 149.244978] ============================================
[ 149.244978] WARNING: possible recursive locking detected
[ 149.244978] 5.3.0-rc7+ #322 Not tainted
[ 149.244978] --------------------------------------------
[ 149.244978] ip/1340 is trying to acquire lock:
[ 149.244978] 000000001399b1a7 (&vlan_netdev_addr_lock_key/1){+...}, at: dev_uc_sync_multiple+0xfa/0x1a0
[ 149.279600]
[ 149.279600] but task is already holding lock:
[ 149.279600] 00000000b963d9b4 (&vlan_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30
[ 149.279600]
[ 149.279600] other info that might help us debug this:
[ 149.305981] Possible unsafe locking scenario:
[ 149.305981]
[ 149.305981] CPU0
[ 149.305981] ----
[ 149.305981] lock(&vlan_netdev_addr_lock_key/1);
[ 149.305981] lock(&vlan_netdev_addr_lock_key/1);
[ 149.326258]
[ 149.326258] *** DEADLOCK ***
[ 149.326258]
[ 149.326258] May be due to missing lock nesting notation
[ 149.326258]
[ 149.326258] 4 locks held by ip/1340:
[ 149.326258] #0: 00000000927f0698 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[ 149.326258] #1: 00000000b963d9b4 (&vlan_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30
[ 149.326258] #2: 0000000027395445 (&dev_addr_list_lock_key/3){+...}, at: dev_mc_sync+0xfa/0x1a0
[ 149.369961] #3: 00000000ce334932 (rcu_read_lock){....}, at: bond_set_rx_mode+0x5/0x3c0 [bonding]
[ 149.369961]
[ 149.369961] stack backtrace:
[ 149.369961] CPU: 1 PID: 1340 Comm: ip Not tainted 5.3.0-rc7+ #322
[ 149.369961] Call Trace:
[ 149.369961] dump_stack+0x7c/0xbb
[ 149.369961] __lock_acquire+0x26a9/0x3de0
[ 149.369961] ? register_lock_class+0x14d0/0x14d0
[ 149.369961] ? register_lock_class+0x14d0/0x14d0
[ 149.369961] lock_acquire+0x164/0x3b0
[ 149.433970] ? dev_uc_sync_multiple+0xfa/0x1a0
[ 149.433970] _raw_spin_lock_nested+0x2e/0x60
[ 149.433970] ? dev_uc_sync_multiple+0xfa/0x1a0
[ 149.433970] dev_uc_sync_multiple+0xfa/0x1a0
[ 149.433970] bond_set_rx_mode+0x269/0x3c0 [bonding]
[ 149.433970] ? bond_init+0x6f0/0x6f0 [bonding]
[ 149.433970] dev_mc_sync+0x15a/0x1a0
[ 149.433970] vlan_dev_set_rx_mode+0x37/0x80 [8021q]
[ 149.433970] dev_set_rx_mode+0x21/0x30
[ 149.433970] __dev_open+0x202/0x310
[ 149.433970] ? dev_set_rx_mode+0x30/0x30
[ 149.433970] ? mark_held_locks+0xa5/0xe0
[ 149.433970] ? __local_bh_enable_ip+0xe9/0x1b0
[ 149.433970] __dev_change_flags+0x3c3/0x500
[ ... ]
Fixes: 0fe1e567d0b4 ("[VLAN]: nested VLAN: fix lockdep's recursive locking warning")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v1 -> v2 : this patch isn't changed
include/linux/if_vlan.h | 3 +++
net/8021q/vlan_dev.c | 28 +++++++++++++++-------------
2 files changed, 18 insertions(+), 13 deletions(-)
diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 244278d5c222..1aed9f613e90 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -183,6 +183,9 @@ struct vlan_dev_priv {
struct netpoll *netpoll;
#endif
unsigned int nest_level;
+
+ struct lock_class_key xmit_lock_key;
+ struct lock_class_key addr_lock_key;
};
static inline struct vlan_dev_priv *vlan_dev_priv(const struct net_device *dev)
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 93eadf179123..12bc80650087 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -494,24 +494,24 @@ static void vlan_dev_set_rx_mode(struct net_device *vlan_dev)
* "super class" of normal network devices; split their locks off into a
* separate class since they always nest.
*/
-static struct lock_class_key vlan_netdev_xmit_lock_key;
-static struct lock_class_key vlan_netdev_addr_lock_key;
-
static void vlan_dev_set_lockdep_one(struct net_device *dev,
struct netdev_queue *txq,
- void *_subclass)
+ void *_unused)
{
- lockdep_set_class_and_subclass(&txq->_xmit_lock,
- &vlan_netdev_xmit_lock_key,
- *(int *)_subclass);
+ struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
+
+ lockdep_set_class(&txq->_xmit_lock, &vlan->xmit_lock_key);
}
-static void vlan_dev_set_lockdep_class(struct net_device *dev, int subclass)
+static void vlan_dev_set_lockdep_class(struct net_device *dev)
{
- lockdep_set_class_and_subclass(&dev->addr_list_lock,
- &vlan_netdev_addr_lock_key,
- subclass);
- netdev_for_each_tx_queue(dev, vlan_dev_set_lockdep_one, &subclass);
+ struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
+
+ lockdep_register_key(&vlan->addr_lock_key);
+ lockdep_set_class(&dev->addr_list_lock, &vlan->addr_lock_key);
+
+ lockdep_register_key(&vlan->xmit_lock_key);
+ netdev_for_each_tx_queue(dev, vlan_dev_set_lockdep_one, NULL);
}
static int vlan_dev_get_lock_subclass(struct net_device *dev)
@@ -609,7 +609,7 @@ static int vlan_dev_init(struct net_device *dev)
SET_NETDEV_DEVTYPE(dev, &vlan_type);
- vlan_dev_set_lockdep_class(dev, vlan_dev_get_lock_subclass(dev));
+ vlan_dev_set_lockdep_class(dev);
vlan->vlan_pcpu_stats = netdev_alloc_pcpu_stats(struct vlan_pcpu_stats);
if (!vlan->vlan_pcpu_stats)
@@ -630,6 +630,8 @@ static void vlan_dev_uninit(struct net_device *dev)
kfree(pm);
}
}
+ lockdep_unregister_key(&vlan->addr_lock_key);
+ lockdep_unregister_key(&vlan->xmit_lock_key);
}
static netdev_features_t vlan_dev_fix_features(struct net_device *dev,
--
2.17.1
* [PATCH net v2 01/11] net: core: limit nested device depth
From: Taehee Yoo @ 2019-09-07 13:45 UTC (permalink / raw)
To: davem, netdev, j.vosburgh, vfalico, andy, jiri, sd, roopa, saeedm,
manishc, rahulv, kys, haiyangz, sthemmin, sashal, hare, varun,
ubraun, kgraul, jay.vosburgh
Cc: ap420073
Current code doesn't limit the number of nested devices.
Nested devices are handled recursively, which needs a large amount of
stack memory. So, unlimited nesting could overflow the stack.
This patch adds upper_level and lower_level, which are common variables
that represent the maximum upper/lower depth.
When an upper/lower device is attached or detached,
{lower/upper}_level are updated, and if the maximum depth is bigger than 8,
the attach routine fails and returns -EMLINK.
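The depth accounting can be sketched in userspace C. This is a simplified
model, not the kernel implementation: it assumes every device has at most
one upper and one lower neighbour (a plain stack such as nested VLANs),
whereas the patch walks full adjacency lists. struct toy_dev and the toy_*
functions are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_NEST_DEV 8

/* Toy stand-in for a network device with a single upper and lower link. */
struct toy_dev {
	struct toy_dev *upper;
	struct toy_dev *lower;
	unsigned char upper_level; /* devices from here to the top, inclusive */
	unsigned char lower_level; /* devices from here to the bottom, inclusive */
};

static void toy_dev_init(struct toy_dev *dev)
{
	dev->upper = NULL;
	dev->lower = NULL;
	dev->upper_level = 1;
	dev->lower_level = 1;
}

/* Stack upper_dev on top of dev; returns 0 or a negative errno value. */
static int toy_dev_link(struct toy_dev *dev, struct toy_dev *upper_dev)
{
	struct toy_dev *d;

	assert(dev != NULL && upper_dev != NULL);

	if (dev->upper || upper_dev->lower)
		return -EBUSY;

	/* Refuse loops: upper_dev must not already sit at or below dev. */
	for (d = dev; d; d = d->lower)
		if (d == upper_dev)
			return -EBUSY;

	/* The combined stack must not exceed the nesting limit. */
	if (dev->lower_level + upper_dev->upper_level > MAX_NEST_DEV)
		return -EMLINK;

	dev->upper = upper_dev;
	upper_dev->lower = dev;

	/* Everything at or below dev now has a taller stack above it. */
	for (d = dev; d; d = d->lower)
		d->upper_level = d->upper->upper_level + 1;

	/* Everything at or above upper_dev now has a taller stack below it. */
	for (d = upper_dev; d; d = d->upper)
		d->lower_level = d->lower->lower_level + 1;

	return 0;
}
```

Stacking freshly initialized devices one by one succeeds until the chain
holds eight devices. At that point the top device has lower_level 8, so
adding a ninth gives 8 + 1 > MAX_NEST_DEV and toy_dev_link() returns
-EMLINK.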
Test commands:
ip link add dummy0 type dummy
ip link add link dummy0 name vlan1 type vlan id 1
ip link set vlan1 up
for i in {2..100}
do
let A=$i-1
ip link add name vlan$i link vlan$A type vlan id $i
done
Splat looks like:
[ 140.483124] BUG: looking up invalid subclass: 8
[ 140.483505] turning off the locking correctness validator.
[ 140.483505] CPU: 0 PID: 1324 Comm: ip Not tainted 5.3.0-rc7+ #322
[ 140.483505] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
[ 140.483505] Call Trace:
[ 140.483505] dump_stack+0x7c/0xbb
[ 140.483505] register_lock_class+0x64d/0x14d0
[ 140.483505] ? is_dynamic_key+0x230/0x230
[ 140.483505] ? module_assert_mutex_or_preempt+0x41/0x70
[ 140.483505] ? __module_address+0x3f/0x3c0
[ 140.483505] lockdep_init_map+0x24e/0x630
[ 140.483505] vlan_dev_init+0x828/0xce0 [8021q]
[ 140.483505] register_netdevice+0x24f/0xd70
[ 140.483505] ? netdev_change_features+0xa0/0xa0
[ 140.483505] ? dev_get_nest_level+0xe1/0x170
[ 140.483505] register_vlan_dev+0x29b/0x710 [8021q]
[ 140.483505] __rtnl_newlink+0xb75/0x1180
[ ... ]
[ 168.446539] WARNING: can't dereference registers at 00000000bef3d701 for ip apic_timer_interrupt+0xf/0x20
[ 168.466843] ==================================================================
[ 168.469452] BUG: KASAN: slab-out-of-bounds in __unwind_start+0x71/0x850
[ 168.480707] Write of size 88 at addr ffff8880b8856d38 by task ip/1758
[ 168.480707]
[ 168.480707] CPU: 1 PID: 1758 Comm: ip Not tainted 5.3.0-rc7+ #322
[ ... ]
[ 168.794493] Rebooting in 5 seconds..
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v1 -> v2 : this patch isn't changed
include/linux/netdevice.h | 4 ++
net/core/dev.c | 106 ++++++++++++++++++++++++++++++++++++++
2 files changed, 110 insertions(+)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 88292953aa6f..5bb5756129af 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1624,6 +1624,8 @@ enum netdev_priv_flags {
* @type: Interface hardware type
* @hard_header_len: Maximum hardware header length.
* @min_header_len: Minimum hardware header length
+ * @upper_level: Maximum depth level of upper devices.
+ * @lower_level: Maximum depth level of lower devices.
*
* @needed_headroom: Extra headroom the hardware may need, but not in all
* cases can this be guaranteed
@@ -1854,6 +1856,8 @@ struct net_device {
unsigned short type;
unsigned short hard_header_len;
unsigned char min_header_len;
+ unsigned char upper_level;
+ unsigned char lower_level;
unsigned short needed_headroom;
unsigned short needed_tailroom;
diff --git a/net/core/dev.c b/net/core/dev.c
index 0891f499c1bb..6a4b4ce62204 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -146,6 +146,7 @@
#include "net-sysfs.h"
#define MAX_GRO_SKBS 8
+#define MAX_NEST_DEV 8
/* This should be increased if a protocol with a bigger head is added. */
#define GRO_MAX_HEAD (MAX_HEADER + 128)
@@ -6602,6 +6603,21 @@ struct net_device *netdev_upper_get_next_dev_rcu(struct net_device *dev,
}
EXPORT_SYMBOL(netdev_upper_get_next_dev_rcu);
+static struct net_device *netdev_next_upper_dev(struct net_device *dev,
+ struct list_head **iter)
+{
+ struct netdev_adjacent *upper;
+
+ upper = list_entry((*iter)->next, struct netdev_adjacent, list);
+
+ if (&upper->list == &dev->adj_list.upper)
+ return NULL;
+
+ *iter = &upper->list;
+
+ return upper->dev;
+}
+
static struct net_device *netdev_next_upper_dev_rcu(struct net_device *dev,
struct list_head **iter)
{
@@ -6619,6 +6635,33 @@ static struct net_device *netdev_next_upper_dev_rcu(struct net_device *dev,
return upper->dev;
}
+int netdev_walk_all_upper_dev(struct net_device *dev,
+ int (*fn)(struct net_device *dev,
+ void *data),
+ void *data)
+{
+ struct net_device *udev;
+ struct list_head *iter;
+ int ret;
+
+ for (iter = &dev->adj_list.upper,
+ udev = netdev_next_upper_dev(dev, &iter);
+ udev;
+ udev = netdev_next_upper_dev(dev, &iter)) {
+ /* first is the upper device itself */
+ ret = fn(udev, data);
+ if (ret)
+ return ret;
+
+ /* then look at all of its upper devices */
+ ret = netdev_walk_all_upper_dev(udev, fn, data);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
int netdev_walk_all_upper_dev_rcu(struct net_device *dev,
int (*fn)(struct net_device *dev,
void *data),
@@ -6785,6 +6828,52 @@ static struct net_device *netdev_next_lower_dev_rcu(struct net_device *dev,
return lower->dev;
}
+static u8 __netdev_upper_depth(struct net_device *dev)
+{
+ struct net_device *udev;
+ struct list_head *iter;
+ u8 max_depth = 0;
+
+ for (iter = &dev->adj_list.upper,
+ udev = netdev_next_upper_dev(dev, &iter);
+ udev;
+ udev = netdev_next_upper_dev(dev, &iter)) {
+ if (max_depth < udev->upper_level)
+ max_depth = udev->upper_level;
+ }
+
+ return max_depth;
+}
+
+static u8 __netdev_lower_depth(struct net_device *dev)
+{
+ struct net_device *ldev;
+ struct list_head *iter;
+ u8 max_depth = 0;
+
+ for (iter = &dev->adj_list.lower,
+ ldev = netdev_next_lower_dev(dev, &iter);
+ ldev;
+ ldev = netdev_next_lower_dev(dev, &iter)) {
+ if (max_depth < ldev->lower_level)
+ max_depth = ldev->lower_level;
+ }
+
+ return max_depth;
+}
+
+static int __netdev_update_upper_level(struct net_device *dev, void *data)
+{
+ dev->upper_level = __netdev_upper_depth(dev) + 1;
+ return 0;
+}
+
+static int __netdev_update_lower_level(struct net_device *dev, void *data)
+{
+ dev->lower_level = __netdev_lower_depth(dev) + 1;
+ return 0;
+}
+
int netdev_walk_all_lower_dev_rcu(struct net_device *dev,
int (*fn)(struct net_device *dev,
void *data),
@@ -7063,6 +7152,9 @@ static int __netdev_upper_dev_link(struct net_device *dev,
if (netdev_has_upper_dev(upper_dev, dev))
return -EBUSY;
+ if ((dev->lower_level + upper_dev->upper_level) > MAX_NEST_DEV)
+ return -EMLINK;
+
if (!master) {
if (netdev_has_upper_dev(dev, upper_dev))
return -EEXIST;
@@ -7089,6 +7181,12 @@ static int __netdev_upper_dev_link(struct net_device *dev,
if (ret)
goto rollback;
+ __netdev_update_upper_level(dev, NULL);
+ netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL);
+
+ __netdev_update_lower_level(upper_dev, NULL);
+ netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, NULL);
+
return 0;
rollback:
@@ -7171,6 +7269,12 @@ void netdev_upper_dev_unlink(struct net_device *dev,
call_netdevice_notifiers_info(NETDEV_CHANGEUPPER,
&changeupper_info.info);
+
+ __netdev_update_upper_level(dev, NULL);
+ netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL);
+
+ __netdev_update_lower_level(upper_dev, NULL);
+ netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, NULL);
}
EXPORT_SYMBOL(netdev_upper_dev_unlink);
@@ -9157,6 +9261,8 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
dev->gso_max_size = GSO_MAX_SIZE;
dev->gso_max_segs = GSO_MAX_SEGS;
+ dev->upper_level = 1;
+ dev->lower_level = 1;
INIT_LIST_HEAD(&dev->napi_list);
INIT_LIST_HEAD(&dev->unreg_list);
--
2.17.1
^ permalink raw reply related
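The level bookkeeping in the patch above can be modelled in userspace. This is only a minimal sketch, not the kernel code: `struct toy_dev`, `toy_link()`, `toy_chain_links()` and the `MAX_ADJ` fan-out cap are hypothetical names for illustration, and the value 8 for `MAX_NEST_DEV` is an assumption, since its definition is not part of the hunks shown. Fixed arrays stand in for the kernel's adjacency lists, and the recursive helpers stand in for `__netdev_update_upper_level()` followed by `netdev_walk_all_lower_dev()` (and the mirror-image pair for lower levels).

```c
#include <errno.h>

#define MAX_NEST_DEV 8	/* assumed limit; not defined in the hunks shown */
#define MAX_ADJ 4	/* hypothetical fan-out cap for this toy model */

struct toy_dev {
	struct toy_dev *upper[MAX_ADJ];
	struct toy_dev *lower[MAX_ADJ];
	int n_upper, n_lower;
	unsigned char upper_level;	/* 1 + deepest chain above this device */
	unsigned char lower_level;	/* 1 + deepest chain below this device */
};

static void toy_init(struct toy_dev *d)
{
	/* mirrors the alloc_netdev_mqs() hunk: both levels start at 1 */
	*d = (struct toy_dev){ .upper_level = 1, .lower_level = 1 };
}

/* Recompute d->upper_level from its direct uppers, then refresh every
 * device below it (update first, then descend, like the patch does). */
static void toy_update_upper_level(struct toy_dev *d)
{
	unsigned char max = 0;
	int i;

	for (i = 0; i < d->n_upper; i++)
		if (max < d->upper[i]->upper_level)
			max = d->upper[i]->upper_level;
	d->upper_level = max + 1;

	for (i = 0; i < d->n_lower; i++)
		toy_update_upper_level(d->lower[i]);
}

/* Mirror image: recompute d->lower_level, then refresh everything above. */
static void toy_update_lower_level(struct toy_dev *d)
{
	unsigned char max = 0;
	int i;

	for (i = 0; i < d->n_lower; i++)
		if (max < d->lower[i]->lower_level)
			max = d->lower[i]->lower_level;
	d->lower_level = max + 1;

	for (i = 0; i < d->n_upper; i++)
		toy_update_lower_level(d->upper[i]);
}

/* Stack 'upper' on top of 'dev', refusing links that would nest too deep. */
static int toy_link(struct toy_dev *dev, struct toy_dev *upper)
{
	if (dev->lower_level + upper->upper_level > MAX_NEST_DEV)
		return -EMLINK;
	if (dev->n_upper == MAX_ADJ || upper->n_lower == MAX_ADJ)
		return -ENOSPC;	/* toy-model limit only, not in the patch */

	dev->upper[dev->n_upper++] = upper;
	upper->lower[upper->n_lower++] = dev;

	toy_update_upper_level(dev);
	toy_update_lower_level(upper);
	return 0;
}

/* Build the chain devs[0] <- devs[1] <- ... and count the accepted links. */
static int toy_chain_links(struct toy_dev *devs, int n)
{
	int i, links = 0;

	for (i = 0; i < n; i++)
		toy_init(&devs[i]);
	for (i = 0; i + 1 < n; i++) {
		if (toy_link(&devs[i], &devs[i + 1]))
			break;
		links++;
	}
	return links;
}
```

With the assumed limit of 8, stacking ten devices in a single chain accepts seven links (eight devices) and rejects the eighth with -EMLINK, because the check runs before any adjacency state is modified.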