* [PATCH net-next] openvswitch: fix vport-netdev unregister
@ 2013-10-09 3:07 Alexei Starovoitov
2013-10-10 1:24 ` Jesse Gross
2013-10-10 3:02 ` Pravin Shelar
0 siblings, 2 replies; 16+ messages in thread
From: Alexei Starovoitov @ 2013-10-09 3:07 UTC (permalink / raw)
To: David S. Miller; +Cc: Jesse Gross, Pravin B Shelar, Jiri Pirko, dev, netdev
The combination of two commits
commit 8e4e1713e4
("openvswitch: Simplify datapath locking.")
and
commit 2537b4dd0a
("openvswitch:: link upper device for port devices")
introduced a bug where upper_dev wasn't unlinked upon
netdev_unregister notification
The following steps:
modprobe openvswitch
ovs-dpctl add-dp test
ip tuntap add dev tap1 mode tap
ovs-dpctl add-if test tap1
ip tuntap del dev tap1 mode tap
are causing multiple warnings:
[ 62.747557] gre: GRE over IPv4 demultiplexor driver
[ 62.749579] openvswitch: Open vSwitch switching datapath
[ 62.755087] device test entered promiscuous mode
[ 62.765911] device tap1 entered promiscuous mode
[ 62.766033] IPv6: ADDRCONF(NETDEV_UP): tap1: link is not ready
[ 62.769017] ------------[ cut here ]------------
[ 62.769022] WARNING: CPU: 1 PID: 3267 at net/core/dev.c:5501 rollback_registered_many+0x20f/0x240()
[ 62.769023] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
[ 62.769051] CPU: 1 PID: 3267 Comm: ip Not tainted 3.12.0-rc3+ #60
[ 62.769052] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[ 62.769053] 0000000000000009 ffff8807f25cbd28 ffffffff8175e575 0000000000000006
[ 62.769055] 0000000000000000 ffff8807f25cbd68 ffffffff8105314c ffff8807f25cbd58
[ 62.769057] ffff8807f2634000 ffff8807f25cbdc8 ffff8807f25cbd88 ffff8807f25cbdc8
[ 62.769059] Call Trace:
[ 62.769062] [<ffffffff8175e575>] dump_stack+0x55/0x76
[ 62.769065] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0
[ 62.769067] [<ffffffff8105319a>] warn_slowpath_null+0x1a/0x20
[ 62.769069] [<ffffffff8162a04f>] rollback_registered_many+0x20f/0x240
[ 62.769071] [<ffffffff8162a101>] rollback_registered+0x31/0x40
[ 62.769073] [<ffffffff8162a488>] unregister_netdevice_queue+0x58/0x90
[ 62.769075] [<ffffffff8154f900>] __tun_detach+0x140/0x340
[ 62.769077] [<ffffffff8154fb36>] tun_chr_close+0x36/0x60
[ 62.769080] [<ffffffff811bddaf>] __fput+0xff/0x260
[ 62.769082] [<ffffffff811bdf5e>] ____fput+0xe/0x10
[ 62.769084] [<ffffffff8107b515>] task_work_run+0xb5/0xe0
[ 62.769087] [<ffffffff810029b9>] do_notify_resume+0x59/0x80
[ 62.769089] [<ffffffff813a41fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 62.769091] [<ffffffff81770f5a>] int_signal+0x12/0x17
[ 62.769093] ---[ end trace 838756c62e156ffb ]---
[ 62.769481] ------------[ cut here ]------------
[ 62.769485] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0()
[ 62.769486] sysfs: can not remove 'master', no directory
[ 62.769486] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
[ 62.769514] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G W 3.12.0-rc3+ #60
[ 62.769515] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[ 62.769518] Workqueue: events ovs_dp_notify_wq [openvswitch]
[ 62.769519] 0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006
[ 62.769521] ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b28
[ 62.769523] 0000000000000000 ffffffff81a87a1f ffff8807f2634000 ffff880037038500
[ 62.769525] Call Trace:
[ 62.769528] [<ffffffff8175e575>] dump_stack+0x55/0x76
[ 62.769529] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0
[ 62.769531] [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50
[ 62.769533] [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0
[ 62.769535] [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30
[ 62.769538] [<ffffffff81631ef7>] __netdev_adjacent_dev_remove+0xf7/0x150
[ 62.769540] [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50
[ 62.769542] [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50
[ 62.769544] [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140
[ 62.769548] [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch]
[ 62.769550] [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch]
[ 62.769552] [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch]
[ 62.769555] [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch]
[ 62.769557] [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0
[ 62.769559] [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0
[ 62.769562] [<ffffffff8107659b>] worker_thread+0x11b/0x370
[ 62.769564] [<ffffffff81076480>] ? rescuer_thread+0x350/0x350
[ 62.769566] [<ffffffff8107f44a>] kthread+0xea/0xf0
[ 62.769568] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[ 62.769570] [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0
[ 62.769572] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[ 62.769573] ---[ end trace 838756c62e156ffc ]---
[ 62.769574] ------------[ cut here ]------------
[ 62.769576] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0()
[ 62.769577] sysfs: can not remove 'upper_test', no directory
[ 62.769577] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
[ 62.769603] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G W 3.12.0-rc3+ #60
[ 62.769604] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[ 62.769606] Workqueue: events ovs_dp_notify_wq [openvswitch]
[ 62.769607] 0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006
[ 62.769609] ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b58
[ 62.769611] 0000000000000000 ffff880807ad3bd9 ffff8807f2634000 ffff880037038500
[ 62.769613] Call Trace:
[ 62.769615] [<ffffffff8175e575>] dump_stack+0x55/0x76
[ 62.769617] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0
[ 62.769619] [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50
[ 62.769621] [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0
[ 62.769622] [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30
[ 62.769624] [<ffffffff81631f22>] __netdev_adjacent_dev_remove+0x122/0x150
[ 62.769627] [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50
[ 62.769629] [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50
[ 62.769631] [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140
[ 62.769633] [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch]
[ 62.769636] [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch]
[ 62.769638] [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch]
[ 62.769640] [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch]
[ 62.769642] [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0
[ 62.769644] [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0
[ 62.769646] [<ffffffff8107659b>] worker_thread+0x11b/0x370
[ 62.769648] [<ffffffff81076480>] ? rescuer_thread+0x350/0x350
[ 62.769650] [<ffffffff8107f44a>] kthread+0xea/0xf0
[ 62.769652] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[ 62.769654] [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0
[ 62.769656] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[ 62.769657] ---[ end trace 838756c62e156ffd ]---
[ 62.769724] device tap1 left promiscuous mode
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
net/openvswitch/dp_notify.c | 5 +++++
net/openvswitch/vport-netdev.c | 16 +++++++++++++---
net/openvswitch/vport-netdev.h | 1 +
3 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c
index c323567..e9380bd 100644
--- a/net/openvswitch/dp_notify.c
+++ b/net/openvswitch/dp_notify.c
@@ -88,6 +88,11 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event,
return NOTIFY_DONE;
if (event == NETDEV_UNREGISTER) {
+ /* rx_handler_unregister and upper_dev_unlink immediately */
+ if (dev->reg_state == NETREG_UNREGISTERING)
+ ovs_netdev_unlink_dev(vport);
+
+ /* schedule vport destroy, dev_put and genl notification */
ovs_net = net_generic(dev_net(dev), ovs_net_id);
queue_work(system_wq, &ovs_net->dp_notify_work);
}
diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 09d93c1..cce933a 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -150,15 +150,25 @@ static void free_port_rcu(struct rcu_head *rcu)
ovs_vport_free(vport_from_priv(netdev_vport));
}
-static void netdev_destroy(struct vport *vport)
+void ovs_netdev_unlink_dev(struct vport *vport)
{
struct netdev_vport *netdev_vport = netdev_vport_priv(vport);
- rtnl_lock();
+ ASSERT_RTNL();
netdev_vport->dev->priv_flags &= ~IFF_OVS_DATAPATH;
netdev_rx_handler_unregister(netdev_vport->dev);
- netdev_upper_dev_unlink(netdev_vport->dev, get_dpdev(vport->dp));
+ netdev_upper_dev_unlink(netdev_vport->dev,
+ netdev_master_upper_dev_get(netdev_vport->dev));
dev_set_promiscuity(netdev_vport->dev, -1);
+}
+
+static void netdev_destroy(struct vport *vport)
+{
+ struct netdev_vport *netdev_vport = netdev_vport_priv(vport);
+
+ rtnl_lock();
+ if (netdev_vport->dev->reg_state != NETREG_UNREGISTERING)
+ ovs_netdev_unlink_dev(vport);
rtnl_unlock();
call_rcu(&netdev_vport->rcu, free_port_rcu);
diff --git a/net/openvswitch/vport-netdev.h b/net/openvswitch/vport-netdev.h
index dd298b5..21e3770 100644
--- a/net/openvswitch/vport-netdev.h
+++ b/net/openvswitch/vport-netdev.h
@@ -39,5 +39,6 @@ netdev_vport_priv(const struct vport *vport)
}
const char *ovs_netdev_get_name(const struct vport *);
+void ovs_netdev_unlink_dev(struct vport *);
#endif /* vport_netdev.h */
--
1.7.9.5
^ permalink raw reply related [flat|nested] 16+ messages in thread* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister 2013-10-09 3:07 [PATCH net-next] openvswitch: fix vport-netdev unregister Alexei Starovoitov @ 2013-10-10 1:24 ` Jesse Gross 2013-10-10 3:02 ` Pravin Shelar 1 sibling, 0 replies; 16+ messages in thread From: Jesse Gross @ 2013-10-10 1:24 UTC (permalink / raw) To: Alexei Starovoitov Cc: David S. Miller, Pravin B Shelar, Jiri Pirko, dev@openvswitch.org, netdev On Tue, Oct 8, 2013 at 8:07 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: > The combination of two commits > > commit 8e4e1713e4 > ("openvswitch: Simplify datapath locking.") > > and > > commit 2537b4dd0a > ("openvswitch:: link upper device for port devices") > > introduced a bug where upper_dev wasn't unlinked upon > netdev_unregister notification > > The following steps: > > modprobe openvswitch > ovs-dpctl add-dp test > ip tuntap add dev tap1 mode tap > ovs-dpctl add-if test tap1 > ip tuntap del dev tap1 mode tap > > are causing multiple warnings: [...] Hmm, you are right. Pravin mentioned some concerns about layering with this patch so I believe he's trying to think of a way to minimize that. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister 2013-10-09 3:07 [PATCH net-next] openvswitch: fix vport-netdev unregister Alexei Starovoitov 2013-10-10 1:24 ` Jesse Gross @ 2013-10-10 3:02 ` Pravin Shelar 2013-10-10 4:11 ` Alexei Starovoitov 1 sibling, 1 reply; 16+ messages in thread From: Pravin Shelar @ 2013-10-10 3:02 UTC (permalink / raw) To: Alexei Starovoitov Cc: David S. Miller, Jesse Gross, Jiri Pirko, dev@openvswitch.org, netdev On Tue, Oct 8, 2013 at 8:07 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: > The combination of two commits > > commit 8e4e1713e4 > ("openvswitch: Simplify datapath locking.") > > and > > commit 2537b4dd0a > ("openvswitch:: link upper device for port devices") > > introduced a bug where upper_dev wasn't unlinked upon > netdev_unregister notification > > The following steps: > > modprobe openvswitch > ovs-dpctl add-dp test > ip tuntap add dev tap1 mode tap > ovs-dpctl add-if test tap1 > ip tuntap del dev tap1 mode tap > > are causing multiple warnings: > > [ 62.747557] gre: GRE over IPv4 demultiplexor driver > [ 62.749579] openvswitch: Open vSwitch switching datapath > [ 62.755087] device test entered promiscuous mode > [ 62.765911] device tap1 entered promiscuous mode > [ 62.766033] IPv6: ADDRCONF(NETDEV_UP): tap1: link is not ready > [ 62.769017] ------------[ cut here ]------------ > [ 62.769022] WARNING: CPU: 1 PID: 3267 at net/core/dev.c:5501 rollback_registered_many+0x20f/0x240() > [ 62.769023] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video > [ 62.769051] CPU: 1 PID: 3267 Comm: ip Not tainted 3.12.0-rc3+ #60 > [ 62.769052] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 > [ 62.769053] 0000000000000009 ffff8807f25cbd28 ffffffff8175e575 0000000000000006 > [ 62.769055] 0000000000000000 ffff8807f25cbd68 ffffffff8105314c ffff8807f25cbd58 > [ 62.769057] ffff8807f2634000 ffff8807f25cbdc8 ffff8807f25cbd88 ffff8807f25cbdc8 > [ 62.769059] Call Trace: > [ 62.769062] [<ffffffff8175e575>] dump_stack+0x55/0x76 > [ 62.769065] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0 > [ 62.769067] [<ffffffff8105319a>] warn_slowpath_null+0x1a/0x20 > [ 62.769069] [<ffffffff8162a04f>] rollback_registered_many+0x20f/0x240 > [ 62.769071] [<ffffffff8162a101>] rollback_registered+0x31/0x40 > [ 62.769073] [<ffffffff8162a488>] unregister_netdevice_queue+0x58/0x90 > [ 62.769075] [<ffffffff8154f900>] __tun_detach+0x140/0x340 > [ 62.769077] [<ffffffff8154fb36>] tun_chr_close+0x36/0x60 > [ 62.769080] [<ffffffff811bddaf>] __fput+0xff/0x260 > [ 62.769082] [<ffffffff811bdf5e>] ____fput+0xe/0x10 > [ 62.769084] [<ffffffff8107b515>] task_work_run+0xb5/0xe0 > [ 62.769087] [<ffffffff810029b9>] do_notify_resume+0x59/0x80 > [ 62.769089] [<ffffffff813a41fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f > [ 62.769091] [<ffffffff81770f5a>] int_signal+0x12/0x17 > [ 62.769093] ---[ end trace 838756c62e156ffb ]--- > [ 62.769481] ------------[ cut here ]------------ > [ 62.769485] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0() > [ 62.769486] sysfs: can not remove 'master', no directory > [ 62.769486] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video > [ 62.769514] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G W 3.12.0-rc3+ #60 > [ 62.769515] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 > [ 62.769518] Workqueue: events ovs_dp_notify_wq [openvswitch] > [ 62.769519] 0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006 > [ 62.769521] ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b28 > [ 62.769523] 0000000000000000 ffffffff81a87a1f ffff8807f2634000 ffff880037038500 > [ 62.769525] Call Trace: > [ 62.769528] [<ffffffff8175e575>] dump_stack+0x55/0x76 > [ 62.769529] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0 > [ 62.769531] [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50 > [ 62.769533] [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0 > [ 62.769535] [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30 > [ 62.769538] [<ffffffff81631ef7>] __netdev_adjacent_dev_remove+0xf7/0x150 > [ 62.769540] [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50 > [ 62.769542] [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50 > [ 62.769544] [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140 > [ 62.769548] [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch] > [ 62.769550] [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch] > [ 62.769552] [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch] > [ 62.769555] [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch] > [ 62.769557] [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0 > [ 62.769559] [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0 > [ 62.769562] [<ffffffff8107659b>] worker_thread+0x11b/0x370 > [ 62.769564] [<ffffffff81076480>] ? rescuer_thread+0x350/0x350 > [ 62.769566] [<ffffffff8107f44a>] kthread+0xea/0xf0 > [ 62.769568] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150 > [ 62.769570] [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0 > [ 62.769572] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150 > [ 62.769573] ---[ end trace 838756c62e156ffc ]--- > [ 62.769574] ------------[ cut here ]------------ > [ 62.769576] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0() > [ 62.769577] sysfs: can not remove 'upper_test', no directory > [ 62.769577] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video > [ 62.769603] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G W 3.12.0-rc3+ #60 > [ 62.769604] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 > [ 62.769606] Workqueue: events ovs_dp_notify_wq [openvswitch] > [ 62.769607] 0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006 > [ 62.769609] ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b58 > [ 62.769611] 0000000000000000 ffff880807ad3bd9 ffff8807f2634000 ffff880037038500 > [ 62.769613] Call Trace: > [ 62.769615] [<ffffffff8175e575>] dump_stack+0x55/0x76 > [ 62.769617] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0 > [ 62.769619] [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50 > [ 62.769621] [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0 > [ 62.769622] [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30 > [ 62.769624] [<ffffffff81631f22>] __netdev_adjacent_dev_remove+0x122/0x150 > [ 62.769627] [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50 > [ 62.769629] [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50 > [ 62.769631] [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140 > [ 62.769633] [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch] > [ 62.769636] [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch] > [ 62.769638] [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch] > [ 62.769640] [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch] > [ 62.769642] [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0 > [ 62.769644] [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0 > [ 62.769646] [<ffffffff8107659b>] worker_thread+0x11b/0x370 > [ 62.769648] [<ffffffff81076480>] ? rescuer_thread+0x350/0x350 > [ 62.769650] [<ffffffff8107f44a>] kthread+0xea/0xf0 > [ 62.769652] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150 > [ 62.769654] [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0 > [ 62.769656] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150 > [ 62.769657] ---[ end trace 838756c62e156ffd ]--- > [ 62.769724] device tap1 left promiscuous mode > > Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> > --- > net/openvswitch/dp_notify.c | 5 +++++ > net/openvswitch/vport-netdev.c | 16 +++++++++++++--- > net/openvswitch/vport-netdev.h | 1 + > 3 files changed, 19 insertions(+), 3 deletions(-) > > diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c > index c323567..e9380bd 100644 > --- a/net/openvswitch/dp_notify.c > +++ b/net/openvswitch/dp_notify.c > @@ -88,6 +88,11 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event, > return NOTIFY_DONE; > > if (event == NETDEV_UNREGISTER) { > + /* rx_handler_unregister and upper_dev_unlink immediately */ > + if (dev->reg_state == NETREG_UNREGISTERING) > + ovs_netdev_unlink_dev(vport); > + Rather than doing vport destroy here, we can just unlink upper device and let workq do rest of work. Thanks. > + /* schedule vport destroy, dev_put and genl notification */ > ovs_net = net_generic(dev_net(dev), ovs_net_id); > queue_work(system_wq, &ovs_net->dp_notify_work); > } > diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c > index 09d93c1..cce933a 100644 > --- a/net/openvswitch/vport-netdev.c > +++ b/net/openvswitch/vport-netdev.c > @@ -150,15 +150,25 @@ static void free_port_rcu(struct rcu_head *rcu) > ovs_vport_free(vport_from_priv(netdev_vport)); > } > > -static void netdev_destroy(struct vport *vport) > +void ovs_netdev_unlink_dev(struct vport *vport) > { > struct netdev_vport *netdev_vport = netdev_vport_priv(vport); > > - rtnl_lock(); > + ASSERT_RTNL(); > netdev_vport->dev->priv_flags &= ~IFF_OVS_DATAPATH; > netdev_rx_handler_unregister(netdev_vport->dev); > - netdev_upper_dev_unlink(netdev_vport->dev, get_dpdev(vport->dp)); > + netdev_upper_dev_unlink(netdev_vport->dev, > + netdev_master_upper_dev_get(netdev_vport->dev)); > dev_set_promiscuity(netdev_vport->dev, -1); > +} > + > +static void netdev_destroy(struct vport *vport) > +{ > + struct netdev_vport *netdev_vport = netdev_vport_priv(vport); > + > + rtnl_lock(); > + if (netdev_vport->dev->reg_state != NETREG_UNREGISTERING) > + ovs_netdev_unlink_dev(vport); > rtnl_unlock(); > > call_rcu(&netdev_vport->rcu, free_port_rcu); > diff --git a/net/openvswitch/vport-netdev.h b/net/openvswitch/vport-netdev.h > index dd298b5..21e3770 100644 > --- a/net/openvswitch/vport-netdev.h > +++ b/net/openvswitch/vport-netdev.h > @@ -39,5 +39,6 @@ netdev_vport_priv(const struct vport *vport) > } > > const char *ovs_netdev_get_name(const struct vport *); > +void ovs_netdev_unlink_dev(struct vport *); > > #endif /* vport_netdev.h */ > -- > 1.7.9.5 > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister 2013-10-10 3:02 ` Pravin Shelar @ 2013-10-10 4:11 ` Alexei Starovoitov 2013-10-10 6:07 ` Pravin Shelar 0 siblings, 1 reply; 16+ messages in thread From: Alexei Starovoitov @ 2013-10-10 4:11 UTC (permalink / raw) To: Pravin Shelar Cc: David S. Miller, Jesse Gross, Jiri Pirko, dev@openvswitch.org, netdev On Wed, Oct 9, 2013 at 8:02 PM, Pravin Shelar <pshelar@nicira.com> wrote: > On Tue, Oct 8, 2013 at 8:07 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >> The combination of two commits >> >> commit 8e4e1713e4 >> ("openvswitch: Simplify datapath locking.") >> >> and >> >> commit 2537b4dd0a >> ("openvswitch:: link upper device for port devices") >> >> introduced a bug where upper_dev wasn't unlinked upon >> netdev_unregister notification >> >> The following steps: >> >> modprobe openvswitch >> ovs-dpctl add-dp test >> ip tuntap add dev tap1 mode tap >> ovs-dpctl add-if test tap1 >> ip tuntap del dev tap1 mode tap >> >> are causing multiple warnings: >> >> [ 62.747557] gre: GRE over IPv4 demultiplexor driver >> [ 62.749579] openvswitch: Open vSwitch switching datapath >> [ 62.755087] device test entered promiscuous mode >> [ 62.765911] device tap1 entered promiscuous mode >> [ 62.766033] IPv6: ADDRCONF(NETDEV_UP): tap1: link is not ready >> [ 62.769017] ------------[ cut here ]------------ >> [ 62.769022] WARNING: CPU: 1 PID: 3267 at net/core/dev.c:5501 rollback_registered_many+0x20f/0x240() >> [ 62.769023] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video >> [ 62.769051] CPU: 1 PID: 3267 Comm: ip Not tainted 3.12.0-rc3+ #60 >> [ 62.769052] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 >> [ 62.769053] 0000000000000009 ffff8807f25cbd28 ffffffff8175e575 0000000000000006 >> [ 62.769055] 0000000000000000 ffff8807f25cbd68 ffffffff8105314c ffff8807f25cbd58 >> [ 62.769057] ffff8807f2634000 ffff8807f25cbdc8 ffff8807f25cbd88 ffff8807f25cbdc8 >> [ 62.769059] Call Trace: >> [ 62.769062] [<ffffffff8175e575>] dump_stack+0x55/0x76 >> [ 62.769065] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0 >> [ 62.769067] [<ffffffff8105319a>] warn_slowpath_null+0x1a/0x20 >> [ 62.769069] [<ffffffff8162a04f>] rollback_registered_many+0x20f/0x240 >> [ 62.769071] [<ffffffff8162a101>] rollback_registered+0x31/0x40 >> [ 62.769073] [<ffffffff8162a488>] unregister_netdevice_queue+0x58/0x90 >> [ 62.769075] [<ffffffff8154f900>] __tun_detach+0x140/0x340 >> [ 62.769077] [<ffffffff8154fb36>] tun_chr_close+0x36/0x60 >> [ 62.769080] [<ffffffff811bddaf>] __fput+0xff/0x260 >> [ 62.769082] [<ffffffff811bdf5e>] ____fput+0xe/0x10 >> [ 62.769084] [<ffffffff8107b515>] task_work_run+0xb5/0xe0 >> [ 62.769087] [<ffffffff810029b9>] do_notify_resume+0x59/0x80 >> [ 62.769089] [<ffffffff813a41fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f >> [ 62.769091] [<ffffffff81770f5a>] int_signal+0x12/0x17 >> [ 62.769093] ---[ end trace 838756c62e156ffb ]--- >> [ 62.769481] ------------[ cut here ]------------ >> [ 62.769485] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0() >> [ 62.769486] sysfs: can not remove 'master', no directory >> [ 62.769486] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video >> [ 62.769514] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G W 3.12.0-rc3+ #60 >> [ 62.769515] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 >> [ 62.769518] Workqueue: events ovs_dp_notify_wq [openvswitch] >> [ 62.769519] 0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006 >> [ 62.769521] ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b28 >> [ 62.769523] 0000000000000000 ffffffff81a87a1f ffff8807f2634000 ffff880037038500 >> [ 62.769525] Call Trace: >> [ 62.769528] [<ffffffff8175e575>] dump_stack+0x55/0x76 >> [ 62.769529] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0 >> [ 62.769531] [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50 >> [ 62.769533] [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0 >> [ 62.769535] [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30 >> [ 62.769538] [<ffffffff81631ef7>] __netdev_adjacent_dev_remove+0xf7/0x150 >> [ 62.769540] [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50 >> [ 62.769542] [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50 >> [ 62.769544] [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140 >> [ 62.769548] [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch] >> [ 62.769550] [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch] >> [ 62.769552] [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch] >> [ 62.769555] [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch] >> [ 62.769557] [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0 >> [ 62.769559] [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0 >> [ 62.769562] [<ffffffff8107659b>] worker_thread+0x11b/0x370 >> [ 62.769564] [<ffffffff81076480>] ? rescuer_thread+0x350/0x350 >> [ 62.769566] [<ffffffff8107f44a>] kthread+0xea/0xf0 >> [ 62.769568] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150 >> [ 62.769570] [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0 >> [ 62.769572] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150 >> [ 62.769573] ---[ end trace 838756c62e156ffc ]--- >> [ 62.769574] ------------[ cut here ]------------ >> [ 62.769576] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0() >> [ 62.769577] sysfs: can not remove 'upper_test', no directory >> [ 62.769577] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video >> [ 62.769603] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G W 3.12.0-rc3+ #60 >> [ 62.769604] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 >> [ 62.769606] Workqueue: events ovs_dp_notify_wq [openvswitch] >> [ 62.769607] 0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006 >> [ 62.769609] ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b58 >> [ 62.769611] 0000000000000000 ffff880807ad3bd9 ffff8807f2634000 ffff880037038500 >> [ 62.769613] Call Trace: >> [ 62.769615] [<ffffffff8175e575>] dump_stack+0x55/0x76 >> [ 62.769617] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0 >> [ 62.769619] [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50 >> [ 62.769621] [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0 >> [ 62.769622] [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30 >> [ 62.769624] [<ffffffff81631f22>] __netdev_adjacent_dev_remove+0x122/0x150 >> [ 62.769627] [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50 >> [ 62.769629] [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50 >> [ 62.769631] [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140 >> [ 62.769633] [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch] >> [ 62.769636] [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch] >> [ 62.769638] [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch] >> [ 62.769640] [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch] >> [ 62.769642] [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0 >> [ 62.769644] [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0 >> [ 62.769646] [<ffffffff8107659b>] worker_thread+0x11b/0x370 >> [ 62.769648] [<ffffffff81076480>] ? rescuer_thread+0x350/0x350 >> [ 62.769650] [<ffffffff8107f44a>] kthread+0xea/0xf0 >> [ 62.769652] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150 >> [ 62.769654] [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0 >> [ 62.769656] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150 >> [ 62.769657] ---[ end trace 838756c62e156ffd ]--- >> [ 62.769724] device tap1 left promiscuous mode >> >> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> >> --- >> net/openvswitch/dp_notify.c | 5 +++++ >> net/openvswitch/vport-netdev.c | 16 +++++++++++++--- >> net/openvswitch/vport-netdev.h | 1 + >> 3 files changed, 19 insertions(+), 3 deletions(-) >> >> diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c >> index c323567..e9380bd 100644 >> --- a/net/openvswitch/dp_notify.c >> +++ b/net/openvswitch/dp_notify.c >> @@ -88,6 +88,11 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event, >> return NOTIFY_DONE; >> >> if (event == NETDEV_UNREGISTER) { >> + /* rx_handler_unregister and upper_dev_unlink immediately */ >> + if (dev->reg_state == NETREG_UNREGISTERING) >> + ovs_netdev_unlink_dev(vport); >> + > > Rather than doing vport destroy here, we can just unlink upper device > and let workq do rest of work. isn't it what it's doing? It does unregister and unlink immediately and schedules workq to do destroy. Obviously destroy cannot happen here, since we cannot grab ovs_lock here, since it would be in the wrong order vs the rest of the code. > > Thanks. > >> + /* schedule vport destroy, dev_put and genl notification */ >> ovs_net = net_generic(dev_net(dev), ovs_net_id); >> queue_work(system_wq, &ovs_net->dp_notify_work); >> } >> diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c >> index 09d93c1..cce933a 100644 >> --- a/net/openvswitch/vport-netdev.c >> +++ b/net/openvswitch/vport-netdev.c >> @@ -150,15 +150,25 @@ static void free_port_rcu(struct rcu_head *rcu) >> ovs_vport_free(vport_from_priv(netdev_vport)); >> } >> >> -static void netdev_destroy(struct vport *vport) >> +void ovs_netdev_unlink_dev(struct vport *vport) >> { >> struct netdev_vport *netdev_vport = netdev_vport_priv(vport); >> >> - rtnl_lock(); >> + ASSERT_RTNL(); >> netdev_vport->dev->priv_flags &= ~IFF_OVS_DATAPATH; >> netdev_rx_handler_unregister(netdev_vport->dev); >> - netdev_upper_dev_unlink(netdev_vport->dev, get_dpdev(vport->dp)); >> + netdev_upper_dev_unlink(netdev_vport->dev, >> + netdev_master_upper_dev_get(netdev_vport->dev)); >> dev_set_promiscuity(netdev_vport->dev, -1); >> +} >> + >> +static void netdev_destroy(struct vport *vport) >> +{ >> + struct netdev_vport *netdev_vport = netdev_vport_priv(vport); >> + >> + rtnl_lock(); >> + if (netdev_vport->dev->reg_state != NETREG_UNREGISTERING) >> + ovs_netdev_unlink_dev(vport); >> rtnl_unlock(); >> >> call_rcu(&netdev_vport->rcu, free_port_rcu); >> diff --git a/net/openvswitch/vport-netdev.h b/net/openvswitch/vport-netdev.h >> index dd298b5..21e3770 100644 >> --- a/net/openvswitch/vport-netdev.h >> +++ b/net/openvswitch/vport-netdev.h >> @@ -39,5 +39,6 @@ netdev_vport_priv(const struct vport *vport) >> } >> >> const char *ovs_netdev_get_name(const struct vport *); >> +void ovs_netdev_unlink_dev(struct vport *); >> >> #endif /* vport_netdev.h */ >> -- >> 1.7.9.5 >> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister 2013-10-10 4:11 ` Alexei Starovoitov @ 2013-10-10 6:07 ` Pravin Shelar 2013-10-10 6:26 ` Alexei Starovoitov 0 siblings, 1 reply; 16+ messages in thread From: Pravin Shelar @ 2013-10-10 6:07 UTC (permalink / raw) To: Alexei Starovoitov Cc: David S. Miller, Jesse Gross, Jiri Pirko, dev@openvswitch.org, netdev On Wed, Oct 9, 2013 at 9:11 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: > On Wed, Oct 9, 2013 at 8:02 PM, Pravin Shelar <pshelar@nicira.com> wrote: >> On Tue, Oct 8, 2013 at 8:07 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>> The combination of two commits >>> >>> commit 8e4e1713e4 >>> ("openvswitch: Simplify datapath locking.") >>> >>> and >>> >>> commit 2537b4dd0a >>> ("openvswitch:: link upper device for port devices") >>> >>> introduced a bug where upper_dev wasn't unlinked upon >>> netdev_unregister notification >>> >>> The following steps: >>> >>> modprobe openvswitch >>> ovs-dpctl add-dp test >>> ip tuntap add dev tap1 mode tap >>> ovs-dpctl add-if test tap1 >>> ip tuntap del dev tap1 mode tap >>> >>> are causing multiple warnings: >>> >>> [ 62.747557] gre: GRE over IPv4 demultiplexor driver >>> [ 62.749579] openvswitch: Open vSwitch switching datapath >>> [ 62.755087] device test entered promiscuous mode >>> [ 62.765911] device tap1 entered promiscuous mode >>> [ 62.766033] IPv6: ADDRCONF(NETDEV_UP): tap1: link is not ready >>> [ 62.769017] ------------[ cut here ]------------ >>> [ 62.769022] WARNING: CPU: 1 PID: 3267 at net/core/dev.c:5501 rollback_registered_many+0x20f/0x240() >>> [ 62.769023] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video >>> [ 62.769051] CPU: 1 PID: 3267 Comm: ip Not tainted 3.12.0-rc3+ #60 >>> [ 62.769052] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 >>> [ 62.769053] 0000000000000009 ffff8807f25cbd28 ffffffff8175e575 0000000000000006 >>> [ 62.769055] 0000000000000000 ffff8807f25cbd68 ffffffff8105314c ffff8807f25cbd58 >>> [ 62.769057] ffff8807f2634000 ffff8807f25cbdc8 ffff8807f25cbd88 ffff8807f25cbdc8 >>> [ 62.769059] Call Trace: >>> [ 62.769062] [<ffffffff8175e575>] dump_stack+0x55/0x76 >>> [ 62.769065] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0 >>> [ 62.769067] [<ffffffff8105319a>] warn_slowpath_null+0x1a/0x20 >>> [ 62.769069] [<ffffffff8162a04f>] rollback_registered_many+0x20f/0x240 >>> [ 62.769071] [<ffffffff8162a101>] rollback_registered+0x31/0x40 >>> [ 62.769073] [<ffffffff8162a488>] unregister_netdevice_queue+0x58/0x90 >>> [ 62.769075] [<ffffffff8154f900>] __tun_detach+0x140/0x340 >>> [ 62.769077] [<ffffffff8154fb36>] tun_chr_close+0x36/0x60 >>> [ 62.769080] [<ffffffff811bddaf>] __fput+0xff/0x260 >>> [ 62.769082] [<ffffffff811bdf5e>] ____fput+0xe/0x10 >>> [ 62.769084] [<ffffffff8107b515>] task_work_run+0xb5/0xe0 >>> [ 62.769087] [<ffffffff810029b9>] do_notify_resume+0x59/0x80 >>> [ 62.769089] [<ffffffff813a41fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f >>> [ 62.769091] [<ffffffff81770f5a>] int_signal+0x12/0x17 >>> [ 62.769093] ---[ end trace 838756c62e156ffb ]--- >>> [ 62.769481] ------------[ cut here ]------------ >>> [ 62.769485] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0() >>> [ 62.769486] sysfs: can not remove 'master', no directory >>> [ 62.769486] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video >>> [ 62.769514] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G W 3.12.0-rc3+ #60 >>> [ 62.769515] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 >>> [ 62.769518] Workqueue: events ovs_dp_notify_wq [openvswitch] >>> [ 62.769519] 0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006 >>> [ 62.769521] ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b28 >>> [ 62.769523] 0000000000000000 ffffffff81a87a1f ffff8807f2634000 ffff880037038500 >>> [ 62.769525] Call Trace: >>> [ 62.769528] [<ffffffff8175e575>] dump_stack+0x55/0x76 >>> [ 62.769529] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0 >>> [ 62.769531] [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50 >>> [ 62.769533] [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0 >>> [ 62.769535] [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30 >>> [ 62.769538] [<ffffffff81631ef7>] __netdev_adjacent_dev_remove+0xf7/0x150 >>> [ 62.769540] [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50 >>> [ 62.769542] [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50 >>> [ 62.769544] [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140 >>> [ 62.769548] [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch] >>> [ 62.769550] [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch] >>> [ 62.769552] [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch] >>> [ 62.769555] [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch] >>> [ 62.769557] [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0 >>> [ 62.769559] [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0 >>> [ 62.769562] [<ffffffff8107659b>] worker_thread+0x11b/0x370 >>> [ 62.769564] [<ffffffff81076480>] ? rescuer_thread+0x350/0x350 >>> [ 62.769566] [<ffffffff8107f44a>] kthread+0xea/0xf0 >>> [ 62.769568] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150 >>> [ 62.769570] [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0 >>> [ 62.769572] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150 >>> [ 62.769573] ---[ end trace 838756c62e156ffc ]--- >>> [ 62.769574] ------------[ cut here ]------------ >>> [ 62.769576] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0() >>> [ 62.769577] sysfs: can not remove 'upper_test', no directory >>> [ 62.769577] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video >>> [ 62.769603] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G W 3.12.0-rc3+ #60 >>> [ 62.769604] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 >>> [ 62.769606] Workqueue: events ovs_dp_notify_wq [openvswitch] >>> [ 62.769607] 0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006 >>> [ 62.769609] ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b58 >>> [ 62.769611] 0000000000000000 ffff880807ad3bd9 ffff8807f2634000 ffff880037038500 >>> [ 62.769613] Call Trace: >>> [ 62.769615] [<ffffffff8175e575>] dump_stack+0x55/0x76 >>> [ 62.769617] [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0 >>> [ 62.769619] [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50 >>> [ 62.769621] [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0 >>> [ 62.769622] [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30 >>> [ 62.769624] [<ffffffff81631f22>] __netdev_adjacent_dev_remove+0x122/0x150 >>> [ 62.769627] [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50 >>> [ 62.769629] [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50 >>> [ 62.769631] [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140 >>> [ 62.769633] [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch] >>> [ 62.769636] [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch] >>> [ 62.769638] [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch] >>> [ 62.769640] [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch] >>> [ 62.769642] [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0 >>> [ 62.769644] [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0 >>> [ 62.769646] [<ffffffff8107659b>] worker_thread+0x11b/0x370 >>> [ 62.769648] [<ffffffff81076480>] ? rescuer_thread+0x350/0x350 >>> [ 62.769650] [<ffffffff8107f44a>] kthread+0xea/0xf0 >>> [ 62.769652] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150 >>> [ 62.769654] [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0 >>> [ 62.769656] [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150 >>> [ 62.769657] ---[ end trace 838756c62e156ffd ]--- >>> [ 62.769724] device tap1 left promiscuous mode >>> >>> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> >>> --- >>> net/openvswitch/dp_notify.c | 5 +++++ >>> net/openvswitch/vport-netdev.c | 16 +++++++++++++--- >>> net/openvswitch/vport-netdev.h | 1 + >>> 3 files changed, 19 insertions(+), 3 deletions(-) >>> >>> diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c >>> index c323567..e9380bd 100644 >>> --- a/net/openvswitch/dp_notify.c >>> +++ b/net/openvswitch/dp_notify.c >>> @@ -88,6 +88,11 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event, >>> return NOTIFY_DONE; >>> >>> if (event == NETDEV_UNREGISTER) { >>> + /* rx_handler_unregister and upper_dev_unlink immediately */ >>> + if (dev->reg_state == NETREG_UNREGISTERING) >>> + ovs_netdev_unlink_dev(vport); >>> + >> >> Rather than doing vport destroy here, we can just unlink upper device >> and let workq do rest of work. > > isn't it what it's doing? I meant just call netdev_upper_dev_unlink() here in event handler and rest of vport destroy can be done in workq. > It does unregister and unlink immediately and schedules workq to do destroy. > Obviously destroy cannot happen here, since we cannot grab ovs_lock here, > since it would be in the wrong order vs the rest of the code. > >> >> Thanks. >> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister 2013-10-10 6:07 ` Pravin Shelar @ 2013-10-10 6:26 ` Alexei Starovoitov 2013-10-10 18:21 ` Pravin Shelar 0 siblings, 1 reply; 16+ messages in thread From: Alexei Starovoitov @ 2013-10-10 6:26 UTC (permalink / raw) To: Pravin Shelar Cc: David S. Miller, Jesse Gross, Jiri Pirko, dev@openvswitch.org, netdev On Wed, Oct 9, 2013 at 11:07 PM, Pravin Shelar <pshelar@nicira.com> wrote: > On Wed, Oct 9, 2013 at 9:11 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >> On Wed, Oct 9, 2013 at 8:02 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>> On Tue, Oct 8, 2013 at 8:07 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>>> The combination of two commits >>>> >>>> commit 8e4e1713e4 >>>> ("openvswitch: Simplify datapath locking.") >>>> >>>> and >>>> >>>> commit 2537b4dd0a >>>> ("openvswitch:: link upper device for port devices") >>>> >>>> introduced a bug where upper_dev wasn't unlinked upon >>>> netdev_unregister notification >>>> >>>> The following steps: >>>> >>>> modprobe openvswitch >>>> ovs-dpctl add-dp test >>>> ip tuntap add dev tap1 mode tap >>>> ovs-dpctl add-if test tap1 >>>> ip tuntap del dev tap1 mode tap >>>> >>>> are causing multiple warnings: >>>> diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c >>>> index c323567..e9380bd 100644 >>>> --- a/net/openvswitch/dp_notify.c >>>> +++ b/net/openvswitch/dp_notify.c >>>> @@ -88,6 +88,11 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event, >>>> return NOTIFY_DONE; >>>> >>>> if (event == NETDEV_UNREGISTER) { >>>> + /* rx_handler_unregister and upper_dev_unlink immediately */ >>>> + if (dev->reg_state == NETREG_UNREGISTERING) >>>> + ovs_netdev_unlink_dev(vport); >>>> + >>> >>> Rather than doing vport destroy here, we can just unlink upper device >>> and let workq do rest of work. >> >> isn't it what it's doing? > > I meant just call netdev_upper_dev_unlink() here in event handler and > rest of vport destroy can be done in workq. netdev_upper_dev_unlink() without netdev_rx_handler_unregister() ?! that's dangerous. If that is acceptable, then there was no reason to link them in the first place. notifier asks to unregister. imo the only acceptable deferred task here is to delay dev_put, since ovs structures are still referring to it. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister 2013-10-10 6:26 ` Alexei Starovoitov @ 2013-10-10 18:21 ` Pravin Shelar 2013-10-10 20:48 ` Alexei Starovoitov 0 siblings, 1 reply; 16+ messages in thread From: Pravin Shelar @ 2013-10-10 18:21 UTC (permalink / raw) To: Alexei Starovoitov Cc: David S. Miller, Jesse Gross, Jiri Pirko, dev@openvswitch.org, netdev On Wed, Oct 9, 2013 at 11:26 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: > On Wed, Oct 9, 2013 at 11:07 PM, Pravin Shelar <pshelar@nicira.com> wrote: >> On Wed, Oct 9, 2013 at 9:11 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>> On Wed, Oct 9, 2013 at 8:02 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>>> On Tue, Oct 8, 2013 at 8:07 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>>>> The combination of two commits >>>>> >>>>> commit 8e4e1713e4 >>>>> ("openvswitch: Simplify datapath locking.") >>>>> >>>>> and >>>>> >>>>> commit 2537b4dd0a >>>>> ("openvswitch:: link upper device for port devices") >>>>> >>>>> introduced a bug where upper_dev wasn't unlinked upon >>>>> netdev_unregister notification >>>>> >>>>> The following steps: >>>>> >>>>> modprobe openvswitch >>>>> ovs-dpctl add-dp test >>>>> ip tuntap add dev tap1 mode tap >>>>> ovs-dpctl add-if test tap1 >>>>> ip tuntap del dev tap1 mode tap >>>>> >>>>> are causing multiple warnings: >>>>> diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c >>>>> index c323567..e9380bd 100644 >>>>> --- a/net/openvswitch/dp_notify.c >>>>> +++ b/net/openvswitch/dp_notify.c >>>>> @@ -88,6 +88,11 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event, >>>>> return NOTIFY_DONE; >>>>> >>>>> if (event == NETDEV_UNREGISTER) { >>>>> + /* rx_handler_unregister and upper_dev_unlink immediately */ >>>>> + if (dev->reg_state == NETREG_UNREGISTERING) >>>>> + ovs_netdev_unlink_dev(vport); >>>>> + >>>> >>>> Rather than doing vport destroy here, we can just unlink upper device >>>> and let workq do rest of work. >>> >>> isn't it what it's doing? >> >> I meant just call netdev_upper_dev_unlink() here in event handler and >> rest of vport destroy can be done in workq. > > netdev_upper_dev_unlink() without netdev_rx_handler_unregister() ?! > that's dangerous. why is it dangerous? ovs already had ref to net-device. > If that is acceptable, then there was no reason to link them in the first place. > I do not see any harm in linking device hierarchy for ovs. > notifier asks to unregister. imo the only acceptable deferred task > here is to delay dev_put, > since ovs structures are still referring to it. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister 2013-10-10 18:21 ` Pravin Shelar @ 2013-10-10 20:48 ` Alexei Starovoitov [not found] ` <CAMEtUuxH24G6pQtSFznrK7w5mdP3za4uQfKKEEaEngW4SJajbA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 16+ messages in thread From: Alexei Starovoitov @ 2013-10-10 20:48 UTC (permalink / raw) To: Pravin Shelar Cc: David S. Miller, Jesse Gross, Jiri Pirko, dev@openvswitch.org, netdev On Thu, Oct 10, 2013 at 11:21 AM, Pravin Shelar <pshelar@nicira.com> wrote: > On Wed, Oct 9, 2013 at 11:26 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >> On Wed, Oct 9, 2013 at 11:07 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>> On Wed, Oct 9, 2013 at 9:11 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>>> On Wed, Oct 9, 2013 at 8:02 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>>>> On Tue, Oct 8, 2013 at 8:07 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>>>>> The combination of two commits >>>>>> >>>>>> commit 8e4e1713e4 >>>>>> ("openvswitch: Simplify datapath locking.") >>>>>> >>>>>> and >>>>>> >>>>>> commit 2537b4dd0a >>>>>> ("openvswitch:: link upper device for port devices") >>>>>> >>>>>> introduced a bug where upper_dev wasn't unlinked upon >>>>>> netdev_unregister notification >>>>>> >>>>>> The following steps: >>>>>> >>>>>> modprobe openvswitch >>>>>> ovs-dpctl add-dp test >>>>>> ip tuntap add dev tap1 mode tap >>>>>> ovs-dpctl add-if test tap1 >>>>>> ip tuntap del dev tap1 mode tap >>>>>> >>>>>> are causing multiple warnings: >>>>>> diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c >>>>>> index c323567..e9380bd 100644 >>>>>> --- a/net/openvswitch/dp_notify.c >>>>>> +++ b/net/openvswitch/dp_notify.c >>>>>> @@ -88,6 +88,11 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event, >>>>>> return NOTIFY_DONE; >>>>>> >>>>>> if (event == NETDEV_UNREGISTER) { >>>>>> + /* rx_handler_unregister and upper_dev_unlink immediately */ >>>>>> + if (dev->reg_state == NETREG_UNREGISTERING) >>>>>> + ovs_netdev_unlink_dev(vport); >>>>>> + >>>>> >>>>> Rather than doing vport destroy here, we can just unlink upper device >>>>> and let workq do rest of work. >>>> >>>> isn't it what it's doing? >>> >>> I meant just call netdev_upper_dev_unlink() here in event handler and >>> rest of vport destroy can be done in workq. >> >> netdev_upper_dev_unlink() without netdev_rx_handler_unregister() ?! >> that's dangerous. > why is it dangerous? ovs already had ref to net-device. comment from dev.c: /* Notify protocols, that we are about to destroy this device. They should clean all the things. */ call_netdevice_notifiers(NETDEV_UNREGISTER, dev); so here you're suggesting to just netdev_upper_dev_unlink() to silence the warning, but then do dev_set_promisc(-1) in workqueue? well, as a minimum audit of promiscuity will be wrong. ndo_change_rx_flags will be called after ndo_uninit, all sorts of other cleanups are done. I cannot track all possible scenarios, but it seems much safer to cleanup everything possible as soon as ovs received NETDEV_UNREGISTER event. May be all these risks are worth taking, then please explain what is the problem with the proposed patch? Thanks Alex ^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <CAMEtUuxH24G6pQtSFznrK7w5mdP3za4uQfKKEEaEngW4SJajbA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister [not found] ` <CAMEtUuxH24G6pQtSFznrK7w5mdP3za4uQfKKEEaEngW4SJajbA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-10-10 22:38 ` Pravin Shelar 2013-10-11 0:47 ` Alexei Starovoitov 0 siblings, 1 reply; 16+ messages in thread From: Pravin Shelar @ 2013-10-10 22:38 UTC (permalink / raw) To: Alexei Starovoitov Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, netdev, Jiri Pirko, David S. Miller On Thu, Oct 10, 2013 at 1:48 PM, Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org> wrote: > On Thu, Oct 10, 2013 at 11:21 AM, Pravin Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote: >> On Wed, Oct 9, 2013 at 11:26 PM, Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org> wrote: >>> On Wed, Oct 9, 2013 at 11:07 PM, Pravin Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote: >>>> On Wed, Oct 9, 2013 at 9:11 PM, Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org> wrote: >>>>> On Wed, Oct 9, 2013 at 8:02 PM, Pravin Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote: >>>>>> On Tue, Oct 8, 2013 at 8:07 PM, Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org> wrote: >>>>>>> The combination of two commits >>>>>>> >>>>>>> commit 8e4e1713e4 >>>>>>> ("openvswitch: Simplify datapath locking.") >>>>>>> >>>>>>> and >>>>>>> >>>>>>> commit 2537b4dd0a >>>>>>> ("openvswitch:: link upper device for port devices") >>>>>>> >>>>>>> introduced a bug where upper_dev wasn't unlinked upon >>>>>>> netdev_unregister notification >>>>>>> >>>>>>> The following steps: >>>>>>> >>>>>>> modprobe openvswitch >>>>>>> ovs-dpctl add-dp test >>>>>>> ip tuntap add dev tap1 mode tap >>>>>>> ovs-dpctl add-if test tap1 >>>>>>> ip tuntap del dev tap1 mode tap >>>>>>> >>>>>>> are causing multiple warnings: >>>>>>> diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c >>>>>>> index c323567..e9380bd 100644 >>>>>>> --- a/net/openvswitch/dp_notify.c >>>>>>> +++ b/net/openvswitch/dp_notify.c >>>>>>> @@ -88,6 +88,11 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event, >>>>>>> return NOTIFY_DONE; >>>>>>> >>>>>>> if (event == NETDEV_UNREGISTER) { >>>>>>> + /* rx_handler_unregister and upper_dev_unlink immediately */ >>>>>>> + if (dev->reg_state == NETREG_UNREGISTERING) >>>>>>> + ovs_netdev_unlink_dev(vport); >>>>>>> + >>>>>> >>>>>> Rather than doing vport destroy here, we can just unlink upper device >>>>>> and let workq do rest of work. >>>>> >>>>> isn't it what it's doing? >>>> >>>> I meant just call netdev_upper_dev_unlink() here in event handler and >>>> rest of vport destroy can be done in workq. >>> >>> netdev_upper_dev_unlink() without netdev_rx_handler_unregister() ?! >>> that's dangerous. >> why is it dangerous? ovs already had ref to net-device. > > comment from dev.c: > /* Notify protocols, that we are about to destroy > this device. They should clean all the things. > */ > call_netdevice_notifiers(NETDEV_UNREGISTER, dev); > > so here you're suggesting to just netdev_upper_dev_unlink() to silence > the warning, > but then do dev_set_promisc(-1) in workqueue? > promiscuity setting is different issue. If you want to have discussion then you can post separate patch for same. Lets fix the warning here. > well, as a minimum audit of promiscuity will be wrong. > ndo_change_rx_flags will be called after ndo_uninit, > all sorts of other cleanups are done. change_rx_flags() checks for UP flag for given device. > I cannot track all possible scenarios, but it seems much safer to > cleanup everything possible > as soon as ovs received NETDEV_UNREGISTER event. > > May be all these risks are worth taking, then please explain what is > the problem with the proposed patch? > Problem is that this is causing layering issues in OVS. dp_notify is suppose to work at dp layer. your patch directly calls vport-netdev implementation function from dp_notify. I could not think of a simple approach that will do this in completely clean manner. Therefore I am trying to minimize layering issues. So just calling netdev_upper_dev_unlink() looks better than doing anything extra. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister 2013-10-10 22:38 ` Pravin Shelar @ 2013-10-11 0:47 ` Alexei Starovoitov 2013-10-11 3:56 ` Jesse Gross 0 siblings, 1 reply; 16+ messages in thread From: Alexei Starovoitov @ 2013-10-11 0:47 UTC (permalink / raw) To: Pravin Shelar Cc: David S. Miller, Jesse Gross, Jiri Pirko, dev@openvswitch.org, netdev On Thu, Oct 10, 2013 at 3:38 PM, Pravin Shelar <pshelar@nicira.com> wrote: > On Thu, Oct 10, 2013 at 1:48 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >> On Thu, Oct 10, 2013 at 11:21 AM, Pravin Shelar <pshelar@nicira.com> wrote: >>> On Wed, Oct 9, 2013 at 11:26 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>>> On Wed, Oct 9, 2013 at 11:07 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>>>> On Wed, Oct 9, 2013 at 9:11 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>>>>> On Wed, Oct 9, 2013 at 8:02 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>>>>>> On Tue, Oct 8, 2013 at 8:07 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>>>>>>> The combination of two commits >>>>>>>> >>>>>>>> commit 8e4e1713e4 >>>>>>>> ("openvswitch: Simplify datapath locking.") >>>>>>>> >>>>>>>> and >>>>>>>> >>>>>>>> commit 2537b4dd0a >>>>>>>> ("openvswitch:: link upper device for port devices") >>>>>>>> >>>>>>>> introduced a bug where upper_dev wasn't unlinked upon >>>>>>>> netdev_unregister notification >>>>>>>> >>>>>>>> The following steps: >>>>>>>> >>>>>>>> modprobe openvswitch >>>>>>>> ovs-dpctl add-dp test >>>>>>>> ip tuntap add dev tap1 mode tap >>>>>>>> ovs-dpctl add-if test tap1 >>>>>>>> ip tuntap del dev tap1 mode tap >>>>>>>> >>>>>>>> are causing multiple warnings: >>>>>>>> diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c >>>>>>>> index c323567..e9380bd 100644 >>>>>>>> --- a/net/openvswitch/dp_notify.c >>>>>>>> +++ b/net/openvswitch/dp_notify.c >>>>>>>> @@ -88,6 +88,11 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event, >>>>>>>> return NOTIFY_DONE; >>>>>>>> >>>>>>>> if (event == NETDEV_UNREGISTER) { >>>>>>>> + /* rx_handler_unregister and upper_dev_unlink immediately */ >>>>>>>> + if (dev->reg_state == NETREG_UNREGISTERING) >>>>>>>> + ovs_netdev_unlink_dev(vport); >>>>>>>> + >>>>>>> >>>>>>> Rather than doing vport destroy here, we can just unlink upper device >>>>>>> and let workq do rest of work. >>>>>> >>>>>> isn't it what it's doing? >>>>> >>>>> I meant just call netdev_upper_dev_unlink() here in event handler and >>>>> rest of vport destroy can be done in workq. >>>> >>>> netdev_upper_dev_unlink() without netdev_rx_handler_unregister() ?! >>>> that's dangerous. >>> why is it dangerous? ovs already had ref to net-device. >> >> comment from dev.c: >> /* Notify protocols, that we are about to destroy >> this device. They should clean all the things. >> */ >> call_netdevice_notifiers(NETDEV_UNREGISTER, dev); >> >> so here you're suggesting to just netdev_upper_dev_unlink() to silence >> the warning, >> but then do dev_set_promisc(-1) in workqueue? >> > promiscuity setting is different issue. If you want to have discussion > then you can post separate patch for same. Lets fix the warning here. > >> well, as a minimum audit of promiscuity will be wrong. >> ndo_change_rx_flags will be called after ndo_uninit, >> all sorts of other cleanups are done. > > change_rx_flags() checks for UP flag for given device. > >> I cannot track all possible scenarios, but it seems much safer to >> cleanup everything possible >> as soon as ovs received NETDEV_UNREGISTER event. >> >> May be all these risks are worth taking, then please explain what is >> the problem with the proposed patch? >> > Problem is that this is causing layering issues in OVS. dp_notify is > suppose to work at dp layer. your patch directly calls vport-netdev > implementation function from dp_notify. > I could not think of a simple approach that will do this in completely > clean manner. Therefore I am trying to minimize layering issues. So > just calling netdev_upper_dev_unlink() looks better than doing > anything extra. dp_notify is per net, not per dp. notifier can only be called for net_device and the first thing it does: if (!ovs_is_internal_dev(dev)) vport = ovs_netdev_get_vport(dev); where ovs_netdev_get_vport() is defined in vport-netdev.c once it gets into workq, it checks for: if (vport->ops->type != OVS_VPORT_TYPE_NETDEV) continue; and netdev_vport = netdev_vport_priv(vport); where netdev_vport_priv() is defined in netdev-vport.h only then it proceeds with generic ovs_dp_detach_port(). Is that the layering you talking about? So to minimize layering issues you want to call 'upper_dev_link' from netdev_create() in vport-netdev.c and 'upper_dev_unlink' directly from dp_device_event() in dp_notify.c? That will look better than calling ovs_netdev_unlink_dev() ? Having both register+link and unregister+unlink in the same vport-netdev.c, is not an advantage? I'm still missing something here. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister 2013-10-11 0:47 ` Alexei Starovoitov @ 2013-10-11 3:56 ` Jesse Gross [not found] ` <CAEP_g=_meOUPrYc7S8rM4NwT_7qG1JZrHJkOg6CmbjDc-rFjAQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 16+ messages in thread From: Jesse Gross @ 2013-10-11 3:56 UTC (permalink / raw) To: Alexei Starovoitov Cc: Pravin Shelar, David S. Miller, Jiri Pirko, dev@openvswitch.org, netdev On Thu, Oct 10, 2013 at 5:47 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: > On Thu, Oct 10, 2013 at 3:38 PM, Pravin Shelar <pshelar@nicira.com> wrote: >> On Thu, Oct 10, 2013 at 1:48 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>> On Thu, Oct 10, 2013 at 11:21 AM, Pravin Shelar <pshelar@nicira.com> wrote: >>>> On Wed, Oct 9, 2013 at 11:26 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>>>> On Wed, Oct 9, 2013 at 11:07 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>>>>> On Wed, Oct 9, 2013 at 9:11 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>>>>>> On Wed, Oct 9, 2013 at 8:02 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>>>>>>> On Tue, Oct 8, 2013 at 8:07 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>>>>>>>> The combination of two commits >>>>>>>>> >>>>>>>>> commit 8e4e1713e4 >>>>>>>>> ("openvswitch: Simplify datapath locking.") >>>>>>>>> >>>>>>>>> and >>>>>>>>> >>>>>>>>> commit 2537b4dd0a >>>>>>>>> ("openvswitch:: link upper device for port devices") >>>>>>>>> >>>>>>>>> introduced a bug where upper_dev wasn't unlinked upon >>>>>>>>> netdev_unregister notification >>>>>>>>> >>>>>>>>> The following steps: >>>>>>>>> >>>>>>>>> modprobe openvswitch >>>>>>>>> ovs-dpctl add-dp test >>>>>>>>> ip tuntap add dev tap1 mode tap >>>>>>>>> ovs-dpctl add-if test tap1 >>>>>>>>> ip tuntap del dev tap1 mode tap >>>>>>>>> >>>>>>>>> are causing multiple warnings: >>>>>>>>> diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c >>>>>>>>> index c323567..e9380bd 100644 >>>>>>>>> --- a/net/openvswitch/dp_notify.c >>>>>>>>> +++ b/net/openvswitch/dp_notify.c >>>>>>>>> @@ -88,6 +88,11 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event, >>>>>>>>> return NOTIFY_DONE; >>>>>>>>> >>>>>>>>> if (event == NETDEV_UNREGISTER) { >>>>>>>>> + /* rx_handler_unregister and upper_dev_unlink immediately */ >>>>>>>>> + if (dev->reg_state == NETREG_UNREGISTERING) >>>>>>>>> + ovs_netdev_unlink_dev(vport); >>>>>>>>> + >>>>>>>> >>>>>>>> Rather than doing vport destroy here, we can just unlink upper device >>>>>>>> and let workq do rest of work. >>>>>>> >>>>>>> isn't it what it's doing? >>>>>> >>>>>> I meant just call netdev_upper_dev_unlink() here in event handler and >>>>>> rest of vport destroy can be done in workq. >>>>> >>>>> netdev_upper_dev_unlink() without netdev_rx_handler_unregister() ?! >>>>> that's dangerous. >>>> why is it dangerous? ovs already had ref to net-device. >>> >>> comment from dev.c: >>> /* Notify protocols, that we are about to destroy >>> this device. They should clean all the things. >>> */ >>> call_netdevice_notifiers(NETDEV_UNREGISTER, dev); >>> >>> so here you're suggesting to just netdev_upper_dev_unlink() to silence >>> the warning, >>> but then do dev_set_promisc(-1) in workqueue? >>> >> promiscuity setting is different issue. If you want to have discussion >> then you can post separate patch for same. Lets fix the warning here. >> >>> well, as a minimum audit of promiscuity will be wrong. >>> ndo_change_rx_flags will be called after ndo_uninit, >>> all sorts of other cleanups are done. >> >> change_rx_flags() checks for UP flag for given device. >> >>> I cannot track all possible scenarios, but it seems much safer to >>> cleanup everything possible >>> as soon as ovs received NETDEV_UNREGISTER event. >>> >>> May be all these risks are worth taking, then please explain what is >>> the problem with the proposed patch? >>> >> Problem is that this is causing layering issues in OVS. dp_notify is >> suppose to work at dp layer. your patch directly calls vport-netdev >> implementation function from dp_notify. >> I could not think of a simple approach that will do this in completely >> clean manner. Therefore I am trying to minimize layering issues. So >> just calling netdev_upper_dev_unlink() looks better than doing >> anything extra. > > dp_notify is per net, not per dp. > notifier can only be called for net_device and the first thing it does: > if (!ovs_is_internal_dev(dev)) > vport = ovs_netdev_get_vport(dev); > where ovs_netdev_get_vport() is defined in vport-netdev.c > > once it gets into workq, it checks for: > if (vport->ops->type != OVS_VPORT_TYPE_NETDEV) > continue; > and > netdev_vport = netdev_vport_priv(vport); > where netdev_vport_priv() is defined in netdev-vport.h > > only then it proceeds with generic ovs_dp_detach_port(). > > Is that the layering you talking about? > > So to minimize layering issues you want to call 'upper_dev_link' from > netdev_create() in vport-netdev.c > and 'upper_dev_unlink' directly from dp_device_event() in dp_notify.c? > > That will look better than calling ovs_netdev_unlink_dev() ? At a minimum, this function name is misleading because it does a bunch of other things beyond unlink the device. At this point, it basically seems like six of one, half dozen of the other. I don't think that either solution is ideal but it doesn't really matter all that much. However, the check dev->reg_state in netdev_destroy() looks racy to me, as it could already be in NETREG_UNREGISTERED even if we already processed this device. ^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <CAEP_g=_meOUPrYc7S8rM4NwT_7qG1JZrHJkOg6CmbjDc-rFjAQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister [not found] ` <CAEP_g=_meOUPrYc7S8rM4NwT_7qG1JZrHJkOg6CmbjDc-rFjAQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-10-11 4:48 ` Alexei Starovoitov 2013-10-11 18:11 ` Jesse Gross 0 siblings, 1 reply; 16+ messages in thread From: Alexei Starovoitov @ 2013-10-11 4:48 UTC (permalink / raw) To: Jesse Gross Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, netdev, Jiri Pirko, David S. Miller On Thu, Oct 10, 2013 at 8:56 PM, Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote: > On Thu, Oct 10, 2013 at 5:47 PM, Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org> wrote: >> On Thu, Oct 10, 2013 at 3:38 PM, Pravin Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote: >>> On Thu, Oct 10, 2013 at 1:48 PM, Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org> wrote: >>>> On Thu, Oct 10, 2013 at 11:21 AM, Pravin Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote: >>>>> On Wed, Oct 9, 2013 at 11:26 PM, Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org> wrote: >>>>>> On Wed, Oct 9, 2013 at 11:07 PM, Pravin Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote: >>>>>>> On Wed, Oct 9, 2013 at 9:11 PM, Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org> wrote: >>>>>>>> On Wed, Oct 9, 2013 at 8:02 PM, Pravin Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote: >>>>>>>>> On Tue, Oct 8, 2013 at 8:07 PM, Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org> wrote: >>>>>>>>>> The combination of two commits >>>>>>>>>> >>>>>>>>>> commit 8e4e1713e4 >>>>>>>>>> ("openvswitch: Simplify datapath locking.") >>>>>>>>>> >>>>>>>>>> and >>>>>>>>>> >>>>>>>>>> commit 2537b4dd0a >>>>>>>>>> ("openvswitch:: link upper device for port devices") >>>>>>>>>> >>>>>>>>>> introduced a bug where upper_dev wasn't unlinked upon >>>>>>>>>> netdev_unregister notification >>>>>>>>>> >>>>>>>>>> The following steps: >>>>>>>>>> >>>>>>>>>> modprobe openvswitch >>>>>>>>>> ovs-dpctl add-dp test >>>>>>>>>> ip tuntap add dev tap1 mode tap >>>>>>>>>> ovs-dpctl add-if test tap1 >>>>>>>>>> ip tuntap del dev tap1 mode tap >>>>>>>>>> >>>>>>>>>> are causing multiple warnings: >>>>>>>>>> diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c >>>>>>>>>> index c323567..e9380bd 100644 >>>>>>>>>> --- a/net/openvswitch/dp_notify.c >>>>>>>>>> +++ b/net/openvswitch/dp_notify.c >>>>>>>>>> @@ -88,6 +88,11 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event, >>>>>>>>>> return NOTIFY_DONE; >>>>>>>>>> >>>>>>>>>> if (event == NETDEV_UNREGISTER) { >>>>>>>>>> + /* rx_handler_unregister and upper_dev_unlink immediately */ >>>>>>>>>> + if (dev->reg_state == NETREG_UNREGISTERING) >>>>>>>>>> + ovs_netdev_unlink_dev(vport); >>>>>>>>>> + >>>>>>>>> >>>>>>>>> Rather than doing vport destroy here, we can just unlink upper device >>>>>>>>> and let workq do rest of work. >>>>>>>> >>>>>>>> isn't it what it's doing? >>>>>>> >>>>>>> I meant just call netdev_upper_dev_unlink() here in event handler and >>>>>>> rest of vport destroy can be done in workq. >>>>>> >>>>>> netdev_upper_dev_unlink() without netdev_rx_handler_unregister() ?! >>>>>> that's dangerous. >>>>> why is it dangerous? ovs already had ref to net-device. >>>> >>>> comment from dev.c: >>>> /* Notify protocols, that we are about to destroy >>>> this device. They should clean all the things. >>>> */ >>>> call_netdevice_notifiers(NETDEV_UNREGISTER, dev); >>>> >>>> so here you're suggesting to just netdev_upper_dev_unlink() to silence >>>> the warning, >>>> but then do dev_set_promisc(-1) in workqueue? >>>> >>> promiscuity setting is different issue. If you want to have discussion >>> then you can post separate patch for same. Lets fix the warning here. >>> >>>> well, as a minimum audit of promiscuity will be wrong. >>>> ndo_change_rx_flags will be called after ndo_uninit, >>>> all sorts of other cleanups are done. >>> >>> change_rx_flags() checks for UP flag for given device. >>> >>>> I cannot track all possible scenarios, but it seems much safer to >>>> cleanup everything possible >>>> as soon as ovs received NETDEV_UNREGISTER event. >>>> >>>> May be all these risks are worth taking, then please explain what is >>>> the problem with the proposed patch? >>>> >>> Problem is that this is causing layering issues in OVS. dp_notify is >>> suppose to work at dp layer. your patch directly calls vport-netdev >>> implementation function from dp_notify. >>> I could not think of a simple approach that will do this in completely >>> clean manner. Therefore I am trying to minimize layering issues. So >>> just calling netdev_upper_dev_unlink() looks better than doing >>> anything extra. >> >> dp_notify is per net, not per dp. >> notifier can only be called for net_device and the first thing it does: >> if (!ovs_is_internal_dev(dev)) >> vport = ovs_netdev_get_vport(dev); >> where ovs_netdev_get_vport() is defined in vport-netdev.c >> >> once it gets into workq, it checks for: >> if (vport->ops->type != OVS_VPORT_TYPE_NETDEV) >> continue; >> and >> netdev_vport = netdev_vport_priv(vport); >> where netdev_vport_priv() is defined in netdev-vport.h >> >> only then it proceeds with generic ovs_dp_detach_port(). >> >> Is that the layering you talking about? >> >> So to minimize layering issues you want to call 'upper_dev_link' from >> netdev_create() in vport-netdev.c >> and 'upper_dev_unlink' directly from dp_device_event() in dp_notify.c? >> >> That will look better than calling ovs_netdev_unlink_dev() ? > > At a minimum, this function name is misleading because it does a bunch > of other things beyond unlink the device. sure. Please propose a different name. > At this point, it basically seems like six of one, half dozen of the > other. I don't think that either solution is ideal but it doesn't > really matter all that much. > > However, the check dev->reg_state in netdev_destroy() looks racy to > me, as it could already be in NETREG_UNREGISTERED even if we already > processed this device. you mean that netdev_destroy() will see reg_state == netreg_unregistered, while dp_device_event() didn't see reg_state == netreg_unregistering yet? or dp_device_event() saw it, proceeded to do unlink and netdev_destroy() ran in parallel? well, that's why reg_state == netreg_unregistering check in netdev_destroy() is done with rtnl_lock() held. reg_state cannot go into netreg_unregistered state skipping netreg_unregistering and notifier. therefore I don't think it's racy. In ovs_dp_notify_wq() you're checking for both unregistering and unregistered and that makes sense, since workq can run after unregistering notifier called and netdev_run_todo() already changed the state to unregistered. But here it's not the case. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister 2013-10-11 4:48 ` Alexei Starovoitov @ 2013-10-11 18:11 ` Jesse Gross 2013-10-11 20:03 ` Alexei Starovoitov 0 siblings, 1 reply; 16+ messages in thread From: Jesse Gross @ 2013-10-11 18:11 UTC (permalink / raw) To: Alexei Starovoitov Cc: Pravin Shelar, David S. Miller, Jiri Pirko, dev@openvswitch.org, netdev On Thu, Oct 10, 2013 at 9:48 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: > On Thu, Oct 10, 2013 at 8:56 PM, Jesse Gross <jesse@nicira.com> wrote: >> However, the check dev->reg_state in netdev_destroy() looks racy to >> me, as it could already be in NETREG_UNREGISTERED even if we already >> processed this device. > > you mean that netdev_destroy() will see reg_state == netreg_unregistered, > while dp_device_event() didn't see reg_state == netreg_unregistering yet? > or dp_device_event() saw it, proceeded to do unlink and > netdev_destroy() ran in parallel? > well, that's why reg_state == netreg_unregistering check in netdev_destroy() > is done with rtnl_lock() held. > reg_state cannot go into netreg_unregistered state skipping > netreg_unregistering and notifier. > therefore I don't think it's racy. > > In ovs_dp_notify_wq() you're checking for both unregistering and > unregistered and that makes > sense, since workq can run after unregistering notifier called and > netdev_run_todo() > already changed the state to unregistered. > But here it's not the case. ovs_dp_notify_wq() calls ovs_dp_detach_port(), which indirectly calls netdev_destroy() so it seems like it actually is the same case to me. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister 2013-10-11 18:11 ` Jesse Gross @ 2013-10-11 20:03 ` Alexei Starovoitov [not found] ` <CAMEtUuwAJr0uMv4b_SfYG378p06os3bvT3kyvVGpKguAZbNcSw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 16+ messages in thread From: Alexei Starovoitov @ 2013-10-11 20:03 UTC (permalink / raw) To: Jesse Gross Cc: Pravin Shelar, David S. Miller, Jiri Pirko, dev@openvswitch.org, netdev On Fri, Oct 11, 2013 at 11:11 AM, Jesse Gross <jesse@nicira.com> wrote: > On Thu, Oct 10, 2013 at 9:48 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >> On Thu, Oct 10, 2013 at 8:56 PM, Jesse Gross <jesse@nicira.com> wrote: >>> However, the check dev->reg_state in netdev_destroy() looks racy to >>> me, as it could already be in NETREG_UNREGISTERED even if we already >>> processed this device. >> >> you mean that netdev_destroy() will see reg_state == netreg_unregistered, >> while dp_device_event() didn't see reg_state == netreg_unregistering yet? >> or dp_device_event() saw it, proceeded to do unlink and >> netdev_destroy() ran in parallel? >> well, that's why reg_state == netreg_unregistering check in netdev_destroy() >> is done with rtnl_lock() held. >> reg_state cannot go into netreg_unregistered state skipping >> netreg_unregistering and notifier. >> therefore I don't think it's racy. >> >> In ovs_dp_notify_wq() you're checking for both unregistering and >> unregistered and that makes >> sense, since workq can run after unregistering notifier called and >> netdev_run_todo() >> already changed the state to unregistered. >> But here it's not the case. > > ovs_dp_notify_wq() calls ovs_dp_detach_port(), which indirectly calls > netdev_destroy() so it seems like it actually is the same case to me. yes. makes sense. how about: - if (netdev_vport->dev->reg_state != NETREG_UNREGISTERING) + if (netdev_vport->dev->priv_flags & IFF_OVS_DATAPATH) or you would prefer (state != unregistering && state != unregistered) check instead? ovs_netdev_destroy_dev() name instead ovs_netdev_unlink_dev() name? Thanks Alexei ^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <CAMEtUuwAJr0uMv4b_SfYG378p06os3bvT3kyvVGpKguAZbNcSw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister [not found] ` <CAMEtUuwAJr0uMv4b_SfYG378p06os3bvT3kyvVGpKguAZbNcSw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-10-11 22:02 ` Jesse Gross 2013-10-11 22:48 ` Alexei Starovoitov 0 siblings, 1 reply; 16+ messages in thread From: Jesse Gross @ 2013-10-11 22:02 UTC (permalink / raw) To: Alexei Starovoitov Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, netdev, Jiri Pirko, David S. Miller On Fri, Oct 11, 2013 at 1:03 PM, Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org> wrote: > On Fri, Oct 11, 2013 at 11:11 AM, Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote: >> On Thu, Oct 10, 2013 at 9:48 PM, Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org> wrote: >>> On Thu, Oct 10, 2013 at 8:56 PM, Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org> wrote: >>>> However, the check dev->reg_state in netdev_destroy() looks racy to >>>> me, as it could already be in NETREG_UNREGISTERED even if we already >>>> processed this device. >>> >>> you mean that netdev_destroy() will see reg_state == netreg_unregistered, >>> while dp_device_event() didn't see reg_state == netreg_unregistering yet? >>> or dp_device_event() saw it, proceeded to do unlink and >>> netdev_destroy() ran in parallel? >>> well, that's why reg_state == netreg_unregistering check in netdev_destroy() >>> is done with rtnl_lock() held. >>> reg_state cannot go into netreg_unregistered state skipping >>> netreg_unregistering and notifier. >>> therefore I don't think it's racy. >>> >>> In ovs_dp_notify_wq() you're checking for both unregistering and >>> unregistered and that makes >>> sense, since workq can run after unregistering notifier called and >>> netdev_run_todo() >>> already changed the state to unregistered. >>> But here it's not the case. >> >> ovs_dp_notify_wq() calls ovs_dp_detach_port(), which indirectly calls >> netdev_destroy() so it seems like it actually is the same case to me. > > yes. makes sense. > how about: > - if (netdev_vport->dev->reg_state != NETREG_UNREGISTERING) > + if (netdev_vport->dev->priv_flags & IFF_OVS_DATAPATH) Yes, this seems safer. Is the check for NETREG_UNREGISTERING in dp_device_event() still needed given that we are checking the event? > ovs_netdev_destroy_dev() name instead ovs_netdev_unlink_dev() name? How about detach_dev? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next] openvswitch: fix vport-netdev unregister 2013-10-11 22:02 ` Jesse Gross @ 2013-10-11 22:48 ` Alexei Starovoitov 0 siblings, 0 replies; 16+ messages in thread From: Alexei Starovoitov @ 2013-10-11 22:48 UTC (permalink / raw) To: Jesse Gross Cc: Pravin Shelar, David S. Miller, Jiri Pirko, dev@openvswitch.org, netdev On Fri, Oct 11, 2013 at 3:02 PM, Jesse Gross <jesse@nicira.com> wrote: > On Fri, Oct 11, 2013 at 1:03 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >> On Fri, Oct 11, 2013 at 11:11 AM, Jesse Gross <jesse@nicira.com> wrote: >>> On Thu, Oct 10, 2013 at 9:48 PM, Alexei Starovoitov <ast@plumgrid.com> wrote: >>>> On Thu, Oct 10, 2013 at 8:56 PM, Jesse Gross <jesse@nicira.com> wrote: >>>>> However, the check dev->reg_state in netdev_destroy() looks racy to >>>>> me, as it could already be in NETREG_UNREGISTERED even if we already >>>>> processed this device. >>>> >>>> you mean that netdev_destroy() will see reg_state == netreg_unregistered, >>>> while dp_device_event() didn't see reg_state == netreg_unregistering yet? >>>> or dp_device_event() saw it, proceeded to do unlink and >>>> netdev_destroy() ran in parallel? >>>> well, that's why reg_state == netreg_unregistering check in netdev_destroy() >>>> is done with rtnl_lock() held. >>>> reg_state cannot go into netreg_unregistered state skipping >>>> netreg_unregistering and notifier. >>>> therefore I don't think it's racy. >>>> >>>> In ovs_dp_notify_wq() you're checking for both unregistering and >>>> unregistered and that makes >>>> sense, since workq can run after unregistering notifier called and >>>> netdev_run_todo() >>>> already changed the state to unregistered. >>>> But here it's not the case. >>> >>> ovs_dp_notify_wq() calls ovs_dp_detach_port(), which indirectly calls >>> netdev_destroy() so it seems like it actually is the same case to me. >> >> yes. makes sense. >> how about: >> - if (netdev_vport->dev->reg_state != NETREG_UNREGISTERING) >> + if (netdev_vport->dev->priv_flags & IFF_OVS_DATAPATH) > > Yes, this seems safer. Is the check for NETREG_UNREGISTERING in > dp_device_event() still needed given that we are checking the event? at least some check is needed, since NETDEV_UNREGISTER event can be received again as rebroadcast with reg_state=netreg_unregistered if wq got delayed. Probably better to combine checks event == unreg and state == unregistering under one 'if' to avoid unnecessary workq wakeup. Or may be better to do it as if (event == NETDEV_UNREGISTER && dev->priv_flags & IFF_OVS_DATAPATH) { ovs_netdev_detach_dev(); queue_work(); } since we're at it... what should be the behavior for namespace moves? If dev attached to ovs and being moved into a different net namespace, I think ovs should detach and forget the dev... Today ovs ignores this notification and we may have ovs-dp in one net and attached dev in a different net. So if you do: ovs-dpctl add-if test tap1 ip link set tap1 netns 3512 and then try to remove tap1 inside the namespace: ip tuntap del dev tap1 mode tap it will just hang: [ 852.572476] unregister_netdevice: waiting for tap1 to become free. Usage count = 3 [ 862.578769] unregister_netdevice: waiting for tap1 to become free. Usage count = 3 >> ovs_netdev_destroy_dev() name instead ovs_netdev_unlink_dev() name? > > How about detach_dev? that's better name indeed. Will respin V2. Thanks Alexei ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2013-10-11 22:48 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-09 3:07 [PATCH net-next] openvswitch: fix vport-netdev unregister Alexei Starovoitov
2013-10-10 1:24 ` Jesse Gross
2013-10-10 3:02 ` Pravin Shelar
2013-10-10 4:11 ` Alexei Starovoitov
2013-10-10 6:07 ` Pravin Shelar
2013-10-10 6:26 ` Alexei Starovoitov
2013-10-10 18:21 ` Pravin Shelar
2013-10-10 20:48 ` Alexei Starovoitov
[not found] ` <CAMEtUuxH24G6pQtSFznrK7w5mdP3za4uQfKKEEaEngW4SJajbA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-10-10 22:38 ` Pravin Shelar
2013-10-11 0:47 ` Alexei Starovoitov
2013-10-11 3:56 ` Jesse Gross
[not found] ` <CAEP_g=_meOUPrYc7S8rM4NwT_7qG1JZrHJkOg6CmbjDc-rFjAQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-10-11 4:48 ` Alexei Starovoitov
2013-10-11 18:11 ` Jesse Gross
2013-10-11 20:03 ` Alexei Starovoitov
[not found] ` <CAMEtUuwAJr0uMv4b_SfYG378p06os3bvT3kyvVGpKguAZbNcSw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-10-11 22:02 ` Jesse Gross
2013-10-11 22:48 ` Alexei Starovoitov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox