* [PATCH net-next] net: vxlan: when lower dev unregisters remove vxlan dev as well
@ 2014-01-10 13:01 Daniel Borkmann
2014-01-10 18:51 ` Cong Wang
2014-01-10 19:10 ` Stephen Hemminger
0 siblings, 2 replies; 7+ messages in thread
From: Daniel Borkmann @ 2014-01-10 13:01 UTC (permalink / raw)
To: davem; +Cc: netdev
We can create a vxlan device with an explicit underlying carrier.
In that case, when the carrier link is being deleted from the
system (e.g. due to module unload) we should also clean up all
created vxlan devices on top of it since otherwise we're in an
inconsistent state in vxlan device. In that case, the user needs
to remove all such devices, while in case of other virtual devs
that sit on top of physical ones, it is usually the case that
these devices do unregister automatically as well and do not
leave the burden on the user.
This work is not necessary when vxlan device was not created with
a real underlying device, as connections can resume in that case
when driver is plugged again. But at least for the other cases,
we should go ahead and do the cleanup on removal.
`ip -d link show vxlan13` after carrier removal before this patch:
5: vxlan13: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default
link/ether 1e:47:da:6d:4d:99 brd ff:ff:ff:ff:ff:ff promiscuity 0
vxlan id 13 group 239.0.0.10 dev 2 port 32768 61000 ageing 300
^^^^^
While at it, we should also use vxlan_dellink() handler in
vxlan_exit_net() as otherwise we seem to leak memory on module
unload since only half of the cleanup is being done.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
---
drivers/net/vxlan.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 63 insertions(+), 7 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 481f85d..39035ba 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2656,6 +2656,54 @@ static struct rtnl_link_ops vxlan_link_ops __read_mostly = {
.fill_info = vxlan_fill_info,
};
+static void vxlan_handle_lowerdev_unregister(struct vxlan_net *vn,
+ struct net_device *dev)
+{
+ struct vxlan_dev *vxlan, *next;
+ LIST_HEAD(list_kill);
+
+ BUG_ON(!rtnl_is_locked());
+
+ list_for_each_entry_safe(vxlan, next, &vn->vxlan_list, next) {
+ struct vxlan_rdst *dst = &vxlan->default_dst;
+
+ /* In case we created vxlan device with carrier
+ * and we loose the carrier due to module unload
+ * we also need to remove vxlan device. In other
+ * cases, it's not necessary and remote_ifindex
+ * is 0 here, so no matches.
+ */
+ if (dst->remote_ifindex == dev->ifindex)
+ vxlan_dellink(vxlan->dev, &list_kill);
+ }
+
+ unregister_netdevice_many(&list_kill);
+ list_del(&list_kill);
+}
+
+static int vxlan_lowerdev_event(struct notifier_block *unused,
+ unsigned long event, void *ptr)
+{
+ struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+ struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
+
+ switch (event) {
+ case NETDEV_UNREGISTER:
+ /* Twiddle thumbs on netns device moves. */
+ if (dev->reg_state != NETREG_UNREGISTERING)
+ break;
+
+ vxlan_handle_lowerdev_unregister(vn, dev);
+ break;
+ }
+
+ return NOTIFY_DONE;
+}
+
+static struct notifier_block vxlan_notifier_block __read_mostly = {
+ .notifier_call = vxlan_lowerdev_event,
+};
+
static __net_init int vxlan_init_net(struct net *net)
{
struct vxlan_net *vn = net_generic(net, vxlan_net_id);
@@ -2673,14 +2721,16 @@ static __net_init int vxlan_init_net(struct net *net)
static __net_exit void vxlan_exit_net(struct net *net)
{
struct vxlan_net *vn = net_generic(net, vxlan_net_id);
- struct vxlan_dev *vxlan;
- LIST_HEAD(list);
+ struct vxlan_dev *vxlan, *next;
+ LIST_HEAD(list_kill);
rtnl_lock();
- list_for_each_entry(vxlan, &vn->vxlan_list, next)
- unregister_netdevice_queue(vxlan->dev, &list);
- unregister_netdevice_many(&list);
+ list_for_each_entry_safe(vxlan, next, &vn->vxlan_list, next)
+ vxlan_dellink(vxlan->dev, &list_kill);
+ unregister_netdevice_many(&list_kill);
rtnl_unlock();
+
+ list_del(&list_kill);
}
static struct pernet_operations vxlan_net_ops = {
@@ -2704,12 +2754,17 @@ static int __init vxlan_init_module(void)
if (rc)
goto out1;
- rc = rtnl_link_register(&vxlan_link_ops);
+ rc = register_netdevice_notifier(&vxlan_notifier_block);
if (rc)
goto out2;
- return 0;
+ rc = rtnl_link_register(&vxlan_link_ops);
+ if (rc)
+ goto out3;
+ return 0;
+out3:
+ unregister_netdevice_notifier(&vxlan_notifier_block);
out2:
unregister_pernet_device(&vxlan_net_ops);
out1:
@@ -2721,6 +2776,7 @@ late_initcall(vxlan_init_module);
static void __exit vxlan_cleanup_module(void)
{
rtnl_link_unregister(&vxlan_link_ops);
+ unregister_netdevice_notifier(&vxlan_notifier_block);
destroy_workqueue(vxlan_wq);
unregister_pernet_device(&vxlan_net_ops);
rcu_barrier();
--
1.8.3.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH net-next] net: vxlan: when lower dev unregisters remove vxlan dev as well
2014-01-10 13:01 [PATCH net-next] net: vxlan: when lower dev unregisters remove vxlan dev as well Daniel Borkmann
@ 2014-01-10 18:51 ` Cong Wang
2014-01-10 19:07 ` Daniel Borkmann
2014-01-10 19:10 ` Stephen Hemminger
1 sibling, 1 reply; 7+ messages in thread
From: Cong Wang @ 2014-01-10 18:51 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: David Miller, netdev
On Fri, Jan 10, 2014 at 5:01 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
> We can create a vxlan device with an explicit underlying carrier.
> In that case, when the carrier link is being deleted from the
> system (e.g. due to module unload) we should also clean up all
> created vxlan devices on top of it since otherwise we're in an
> inconsistent state in vxlan device. In that case, the user needs
> to remove all such devices, while in case of other virtual devs
> that sit on top of physical ones, it is usually the case that
> these devices do unregister automatically as well and do not
> leave the burden on the user.
>
> This work is not necessary when vxlan device was not created with
> a real underlying device, as connections can resume in that case
> when driver is plugged again. But at least for the other cases,
> we should go ahead and do the cleanup on removal.
>
So it makes sense to move the notifier register to vxlan_newlink()
from vxlan_init_module()?
...
>
> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index 481f85d..39035ba 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -2656,6 +2656,54 @@ static struct rtnl_link_ops vxlan_link_ops __read_mostly = {
> .fill_info = vxlan_fill_info,
> };
>
> +static void vxlan_handle_lowerdev_unregister(struct vxlan_net *vn,
> + struct net_device *dev)
> +{
> + struct vxlan_dev *vxlan, *next;
> + LIST_HEAD(list_kill);
> +
> + BUG_ON(!rtnl_is_locked());
This is not necessary at all, it is known that netdev notication
holds rtnl lock.
> +
> + list_for_each_entry_safe(vxlan, next, &vn->vxlan_list, next) {
> + struct vxlan_rdst *dst = &vxlan->default_dst;
> +
> + /* In case we created vxlan device with carrier
> + * and we loose the carrier due to module unload
> + * we also need to remove vxlan device. In other
> + * cases, it's not necessary and remote_ifindex
> + * is 0 here, so no matches.
> + */
> + if (dst->remote_ifindex == dev->ifindex)
> + vxlan_dellink(vxlan->dev, &list_kill);
> + }
> +
> + unregister_netdevice_many(&list_kill);
> + list_del(&list_kill);
I think you need to flush fdb as well?
> static __net_init int vxlan_init_net(struct net *net)
> {
> struct vxlan_net *vn = net_generic(net, vxlan_net_id);
> @@ -2673,14 +2721,16 @@ static __net_init int vxlan_init_net(struct net *net)
> static __net_exit void vxlan_exit_net(struct net *net)
> {
> struct vxlan_net *vn = net_generic(net, vxlan_net_id);
> - struct vxlan_dev *vxlan;
> - LIST_HEAD(list);
> + struct vxlan_dev *vxlan, *next;
> + LIST_HEAD(list_kill);
>
> rtnl_lock();
> - list_for_each_entry(vxlan, &vn->vxlan_list, next)
> - unregister_netdevice_queue(vxlan->dev, &list);
> - unregister_netdevice_many(&list);
> + list_for_each_entry_safe(vxlan, next, &vn->vxlan_list, next)
> + vxlan_dellink(vxlan->dev, &list_kill);
> + unregister_netdevice_many(&list_kill);
> rtnl_unlock();
> +
> + list_del(&list_kill);
Why these changes in vxlan_exit_net()?
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next] net: vxlan: when lower dev unregisters remove vxlan dev as well
2014-01-10 18:51 ` Cong Wang
@ 2014-01-10 19:07 ` Daniel Borkmann
2014-01-10 19:12 ` Stephen Hemminger
2014-01-10 19:16 ` Daniel Borkmann
0 siblings, 2 replies; 7+ messages in thread
From: Daniel Borkmann @ 2014-01-10 19:07 UTC (permalink / raw)
To: Cong Wang; +Cc: David Miller, netdev
On 01/10/2014 07:51 PM, Cong Wang wrote:
> On Fri, Jan 10, 2014 at 5:01 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
>> We can create a vxlan device with an explicit underlying carrier.
>> In that case, when the carrier link is being deleted from the
>> system (e.g. due to module unload) we should also clean up all
>> created vxlan devices on top of it since otherwise we're in an
>> inconsistent state in vxlan device. In that case, the user needs
>> to remove all such devices, while in case of other virtual devs
>> that sit on top of physical ones, it is usually the case that
>> these devices do unregister automatically as well and do not
>> leave the burden on the user.
>>
>> This work is not necessary when vxlan device was not created with
>> a real underlying device, as connections can resume in that case
>> when driver is plugged again. But at least for the other cases,
>> we should go ahead and do the cleanup on removal.
>
> So it makes sense to move the notifier register to vxlan_newlink()
> from vxlan_init_module()?
That is probably a design descision ... I didn't want to bloat the
struct vxlan_dev, and by this approach, we can simply batch removals
through unregister_netdevice_many(), that's all.
> ...
>>
>> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
>> index 481f85d..39035ba 100644
>> --- a/drivers/net/vxlan.c
>> +++ b/drivers/net/vxlan.c
>> @@ -2656,6 +2656,54 @@ static struct rtnl_link_ops vxlan_link_ops __read_mostly = {
>> .fill_info = vxlan_fill_info,
>> };
>>
>> +static void vxlan_handle_lowerdev_unregister(struct vxlan_net *vn,
>> + struct net_device *dev)
>> +{
>> + struct vxlan_dev *vxlan, *next;
>> + LIST_HEAD(list_kill);
>> +
>> + BUG_ON(!rtnl_is_locked());
>
>
> This is not necessary at all, it is known that netdev notication
> holds rtnl lock.
We're not in fast-path, and if someone would call that function outside
of the notifier chain, it might be good to check if the lock was taken,
but if there's a strong opinion to not have that, I'll just remove it.
>> +
>> + list_for_each_entry_safe(vxlan, next, &vn->vxlan_list, next) {
>> + struct vxlan_rdst *dst = &vxlan->default_dst;
>> +
>> + /* In case we created vxlan device with carrier
>> + * and we loose the carrier due to module unload
>> + * we also need to remove vxlan device. In other
>> + * cases, it's not necessary and remote_ifindex
>> + * is 0 here, so no matches.
>> + */
>> + if (dst->remote_ifindex == dev->ifindex)
>> + vxlan_dellink(vxlan->dev, &list_kill);
>> + }
>> +
>> + unregister_netdevice_many(&list_kill);
>> + list_del(&list_kill);
>
>
> I think you need to flush fdb as well?
Seems this is not done through normal rtnl_ops->dellink(); vxlan_dellink()
is what we call there as well ...
>> static __net_init int vxlan_init_net(struct net *net)
>> {
>> struct vxlan_net *vn = net_generic(net, vxlan_net_id);
>> @@ -2673,14 +2721,16 @@ static __net_init int vxlan_init_net(struct net *net)
>> static __net_exit void vxlan_exit_net(struct net *net)
>> {
>> struct vxlan_net *vn = net_generic(net, vxlan_net_id);
>> - struct vxlan_dev *vxlan;
>> - LIST_HEAD(list);
>> + struct vxlan_dev *vxlan, *next;
>> + LIST_HEAD(list_kill);
>>
>> rtnl_lock();
>> - list_for_each_entry(vxlan, &vn->vxlan_list, next)
>> - unregister_netdevice_queue(vxlan->dev, &list);
>> - unregister_netdevice_many(&list);
>> + list_for_each_entry_safe(vxlan, next, &vn->vxlan_list, next)
>> + vxlan_dellink(vxlan->dev, &list_kill);
>> + unregister_netdevice_many(&list_kill);
>> rtnl_unlock();
>> +
>> + list_del(&list_kill);
>
>
> Why these changes in vxlan_exit_net()?
I think this is way cleaner; instead of only doing half the work, we
should remove it in the same way as we remove devices through normal
rtnl ops. In case we would do allocations in future that are being
freed during vxlan_dellink(), we would simply leak them here otherwise.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next] net: vxlan: when lower dev unregisters remove vxlan dev as well
2014-01-10 13:01 [PATCH net-next] net: vxlan: when lower dev unregisters remove vxlan dev as well Daniel Borkmann
2014-01-10 18:51 ` Cong Wang
@ 2014-01-10 19:10 ` Stephen Hemminger
1 sibling, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2014-01-10 19:10 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: davem, netdev
On Fri, 10 Jan 2014 14:01:47 +0100
Daniel Borkmann <dborkman@redhat.com> wrote:
> We can create a vxlan device with an explicit underlying carrier.
> In that case, when the carrier link is being deleted from the
> system (e.g. due to module unload) we should also clean up all
> created vxlan devices on top of it since otherwise we're in an
> inconsistent state in vxlan device. In that case, the user needs
> to remove all such devices, while in case of other virtual devs
> that sit on top of physical ones, it is usually the case that
> these devices do unregister automatically as well and do not
> leave the burden on the user.
>
> This work is not necessary when vxlan device was not created with
> a real underlying device, as connections can resume in that case
> when driver is plugged again. But at least for the other cases,
> we should go ahead and do the cleanup on removal.
>
> `ip -d link show vxlan13` after carrier removal before this patch:
>
> 5: vxlan13: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default
> link/ether 1e:47:da:6d:4d:99 brd ff:ff:ff:ff:ff:ff promiscuity 0
> vxlan id 13 group 239.0.0.10 dev 2 port 32768 61000 ageing 300
> ^^^^^
> While at it, we should also use vxlan_dellink() handler in
> vxlan_exit_net() as otherwise we seem to leak memory on module
> unload since only half of the cleanup is being done.
>
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
The issue is generic to all tunnels
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next] net: vxlan: when lower dev unregisters remove vxlan dev as well
2014-01-10 19:07 ` Daniel Borkmann
@ 2014-01-10 19:12 ` Stephen Hemminger
2014-01-10 19:16 ` Daniel Borkmann
2014-01-10 19:16 ` Daniel Borkmann
1 sibling, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2014-01-10 19:12 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: Cong Wang, David Miller, netdev
On Fri, 10 Jan 2014 20:07:12 +0100
Daniel Borkmann <dborkman@redhat.com> wrote:
> >> + BUG_ON(!rtnl_is_locked());
> >
> >
> > This is not necessary at all, it is known that netdev notication
> > holds rtnl lock.
>
> We're not in fast-path, and if someone would call that function outside
> of the notifier chain, it might be good to check if the lock was taken,
> but if there's a strong opinion to not have that, I'll just remove it
First, the standard way to do this is ASSERT_RTNL()
Second, it is unnecessary. The function is local, only called through
notifier and notifiers always have RTNL held.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next] net: vxlan: when lower dev unregisters remove vxlan dev as well
2014-01-10 19:07 ` Daniel Borkmann
2014-01-10 19:12 ` Stephen Hemminger
@ 2014-01-10 19:16 ` Daniel Borkmann
1 sibling, 0 replies; 7+ messages in thread
From: Daniel Borkmann @ 2014-01-10 19:16 UTC (permalink / raw)
To: Cong Wang; +Cc: David Miller, netdev
On 01/10/2014 08:07 PM, Daniel Borkmann wrote:
> On 01/10/2014 07:51 PM, Cong Wang wrote:
>> On Fri, Jan 10, 2014 at 5:01 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
>> I think you need to flush fdb as well?
>
> Seems this is not done through normal rtnl_ops->dellink(); vxlan_dellink()
> is what we call there as well ...
Ok, this is being flushed during ndo_stop().
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next] net: vxlan: when lower dev unregisters remove vxlan dev as well
2014-01-10 19:12 ` Stephen Hemminger
@ 2014-01-10 19:16 ` Daniel Borkmann
0 siblings, 0 replies; 7+ messages in thread
From: Daniel Borkmann @ 2014-01-10 19:16 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Cong Wang, David Miller, netdev
On 01/10/2014 08:12 PM, Stephen Hemminger wrote:
> On Fri, 10 Jan 2014 20:07:12 +0100
> Daniel Borkmann <dborkman@redhat.com> wrote:
>
>>>> + BUG_ON(!rtnl_is_locked());
>>>
>>>
>>> This is not necessary at all, it is known that netdev notication
>>> holds rtnl lock.
>>
>> We're not in fast-path, and if someone would call that function outside
>> of the notifier chain, it might be good to check if the lock was taken,
>> but if there's a strong opinion to not have that, I'll just remove it
>
> First, the standard way to do this is ASSERT_RTNL()
>
> Second, it is unnecessary. The function is local, only called through
> notifier and notifiers always have RTNL held.
Ok, I'll just remove it, and resend if you're fine with this.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-01-10 19:16 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-01-10 13:01 [PATCH net-next] net: vxlan: when lower dev unregisters remove vxlan dev as well Daniel Borkmann
2014-01-10 18:51 ` Cong Wang
2014-01-10 19:07 ` Daniel Borkmann
2014-01-10 19:12 ` Stephen Hemminger
2014-01-10 19:16 ` Daniel Borkmann
2014-01-10 19:16 ` Daniel Borkmann
2014-01-10 19:10 ` Stephen Hemminger
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).