* unregister_netdevice: waiting for lo to become free. Usage count = 8 @ 2011-04-15 7:01 Hans Schillstrom 2011-04-15 7:27 ` Eric W. Biederman 2011-04-15 20:11 ` Julian Anastasov 0 siblings, 2 replies; 8+ messages in thread From: Hans Schillstrom @ 2011-04-15 7:01 UTC (permalink / raw) To: Julian Anastasov; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman Hello Julian I'm trying to fix the cleanup process when a namespace get "killed", which is a new feature for ipvs. However an old problem appears again When there has been traffic trough ipvs where the destination is unreachable the usage count on loopback dev increases one for every packet.... I guess thats because of this rule : # ip route list table all ... unreachable default dev lo table 0 proto kernel metric 4294967295 error -101 hoplimit 25 ... I made a test just forwarding packets through the same container (ipvs loaded) to an unreachable destination and that test had a balanced count i.e. it was possible to reboot the container. Do you have an idea why this happens in the ipvs case ? Regards Hans Schillstrom <hans@schillstrom.com> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8 2011-04-15 7:01 unregister_netdevice: waiting for lo to become free. Usage count = 8 Hans Schillstrom @ 2011-04-15 7:27 ` Eric W. Biederman 2011-04-15 20:11 ` Julian Anastasov 1 sibling, 0 replies; 8+ messages in thread From: Eric W. Biederman @ 2011-04-15 7:27 UTC (permalink / raw) To: Hans Schillstrom; +Cc: Julian Anastasov, Simon Horman, netdev, lvs-devel Hans Schillstrom <hans@schillstrom.com> writes: > Hello Julian > > I'm trying to fix the cleanup process when a namespace get "killed", > which is a new feature for ipvs. However an old problem appears again > > When there has been traffic trough ipvs where the destination is unreachable > the usage count on loopback dev increases one for every packet.... > I guess thats because of this rule : > > # ip route list table all > ... > unreachable default dev lo table 0 proto kernel metric 4294967295 error -101 hoplimit 25 > ... > > I made a test just forwarding packets through the same container (ipvs loaded) > to an unreachable destination and that test had a balanced count i.e. it was possible to reboot the container. > > Do you have an idea why this happens in the ipvs case ? Hans. I do know that most outstanding references when you clean up a container get moved to the loopback device. So it may not originally be the loopback device itself where the reference counting is wrong. Eric ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8 2011-04-15 7:01 unregister_netdevice: waiting for lo to become free. Usage count = 8 Hans Schillstrom 2011-04-15 7:27 ` Eric W. Biederman @ 2011-04-15 20:11 ` Julian Anastasov 2011-04-18 6:10 ` Hans Schillstrom 1 sibling, 1 reply; 8+ messages in thread From: Julian Anastasov @ 2011-04-15 20:11 UTC (permalink / raw) To: Hans Schillstrom; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman Hello, On Fri, 15 Apr 2011, Hans Schillstrom wrote: > Hello Julian > > I'm trying to fix the cleanup process when a namespace get "killed", > which is a new feature for ipvs. However an old problem appears again > > When there has been traffic trough ipvs where the destination is unreachable > the usage count on loopback dev increases one for every packet.... What is the kernel version? > I guess thats because of this rule : > > # ip route list table all > ... > unreachable default dev lo table 0 proto kernel metric 4294967295 error -101 hoplimit 25 > ... > > I made a test just forwarding packets through the same container (ipvs loaded) > to an unreachable destination and that test had a balanced count i.e. it was possible to reboot the container. Can you explain, what do you mean with unreachable destination? Are you adding some rejecting route? > Do you have an idea why this happens in the ipvs case ? Do you see with debug level 3 the "Removing destination" messages. Only real servers can hold dest->dst_cache reference for dev which can be a problem because the real servers are not deleted immediately - on traffic they are moved to trash list. But ip_vs_trash_cleanup() should remove any left structures. You should check in debug that all servers are deleted. If all real server structures are freed but problem remains we should look more deeply in the dest->dst_cache usage. DR or NAT is used? I assume cleanup really happens in this order: ip_vs_cleanup(): nf_unregister_hooks() ... ip_vs_conn_cleanup() ... ip_vs_control_cleanup() Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8 2011-04-15 20:11 ` Julian Anastasov @ 2011-04-18 6:10 ` Hans Schillstrom 2011-04-18 10:43 ` Hans Schillstrom 0 siblings, 1 reply; 8+ messages in thread From: Hans Schillstrom @ 2011-04-18 6:10 UTC (permalink / raw) To: Julian Anastasov; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman On Friday, April 15, 2011 22:11:32 Julian Anastasov wrote: > > Hello, > > On Fri, 15 Apr 2011, Hans Schillstrom wrote: > > > Hello Julian > > > > I'm trying to fix the cleanup process when a namespace get "killed", > > which is a new feature for ipvs. However an old problem appears again > > > > When there has been traffic trough ipvs where the destination is unreachable > > the usage count on loopback dev increases one for every packet.... > > What is the kernel version? net-next-2.6 i.e. 2.6.39-rc2 > > > I guess thats because of this rule : > > > > # ip route list table all > > ... > > unreachable default dev lo table 0 proto kernel metric 4294967295 error -101 hoplimit 25 > > ... > > > > I made a test just forwarding packets through the same container (ipvs loaded) > > to an unreachable destination and that test had a balanced count i.e. it was possible to reboot the container. > > Can you explain, what do you mean with unreachable > destination? Are you adding some rejecting route? This comment from Eric, do explain what happens: "Hans. I do know that most outstanding references when you clean up a container get moved to the loopback device. So it may not originally be the loopback device itself where the reference counting is wrong. Eric" > > > Do you have an idea why this happens in the ipvs case ? > > Do you see with debug level 3 the "Removing destination" > messages. Only real servers can hold dest->dst_cache reference > for dev which can be a problem because the real servers are not > deleted immediately - on traffic they are moved to trash > list. But ip_vs_trash_cleanup() should remove any left > structures. You should check in debug that all servers are > deleted. If all real server structures are freed but > problem remains we should look more deeply in the > dest->dst_cache usage. DR or NAT is used? I have got some wise words from Eric, i.e. moved all ipvs register/unregister from subsys to device that solved plenty of my issues (Thanks Eric) I'll will post a Patch later on regarding this. > > I assume cleanup really happens in this order: > > ip_vs_cleanup(): > nf_unregister_hooks() This will not happens in a namespace since nf_unregister_hooks() is not per netns. We might need a flag but I don't think so, further test will show.... > ... > ip_vs_conn_cleanup() > ... > ip_vs_control_cleanup() > Regards Hans ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8 2011-04-18 6:10 ` Hans Schillstrom @ 2011-04-18 10:43 ` Hans Schillstrom 2011-04-18 21:12 ` Julian Anastasov 0 siblings, 1 reply; 8+ messages in thread From: Hans Schillstrom @ 2011-04-18 10:43 UTC (permalink / raw) To: Julian Anastasov; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman Hello On Monday, April 18, 2011 08:10:26 Hans Schillstrom wrote: > On Friday, April 15, 2011 22:11:32 Julian Anastasov wrote: > > > > Hello, > > > > On Fri, 15 Apr 2011, Hans Schillstrom wrote: > > > > > Hello Julian > > > > > > I'm trying to fix the cleanup process when a namespace get "killed", > > > which is a new feature for ipvs. However an old problem appears again > > > > > > When there has been traffic trough ipvs where the destination is unreachable > > > the usage count on loopback dev increases one for every packet.... [snip] > > > > > Do you have an idea why this happens in the ipvs case ? > > > > Do you see with debug level 3 the "Removing destination" > > messages. Only real servers can hold dest->dst_cache reference > > for dev which can be a problem because the real servers are not > > deleted immediately - on traffic they are moved to trash > > list. Actually I forgot to tell there is a need for a ip_vs_service_cleanup() due to above. Do you see any drawbacks with it ? /* * Delete service by {netns} in the service table. */ static void ip_vs_service_cleanup(struct net *net) { unsigned hash; struct ip_vs_service *svc, *tmp; EnterFunction(2); /* Check for "full" addressed entries */ for (hash = 0; hash<IP_VS_SVC_TAB_SIZE; hash++) { write_lock_bh(&__ip_vs_svc_lock); list_for_each_entry_safe(svc, tmp, &ip_vs_svc_table[hash], s_list) { if (net_eq(svc->net, net)) { ip_vs_svc_unhash(svc); __ip_vs_del_service(svc); } } list_for_each_entry_safe(svc, tmp, &ip_vs_svc_fwm_table[hash], f_list) { if (net_eq(svc->net, net)) { ip_vs_svc_unhash(svc); __ip_vs_del_service(svc); } } write_unlock_bh(&__ip_vs_svc_lock); } LeaveFunction(2); } Called just after the __ip_vs_control_cleanup_sysctl() static void __net_exit __ip_vs_control_cleanup(struct net *net) { struct netns_ipvs *ipvs = net_ipvs(net); ip_vs_trash_cleanup(net); ip_vs_stop_estimator(net, &ipvs->tot_stats); __ip_vs_control_cleanup_sysctl(net); ip_vs_service_cleanup(net); proc_net_remove(net, "ip_vs_stats_percpu"); proc_net_remove(net, "ip_vs_stats"); proc_net_remove(net, "ip_vs"); free_percpu(ipvs->tot_stats.cpustats); } > > But ip_vs_trash_cleanup() should remove any left > > structures. You should check in debug that all servers are > > deleted. If all real server structures are freed but > > problem remains we should look more deeply in the > > dest->dst_cache usage. DR or NAT is used? > > I have got some wise words from Eric, > i.e. moved all ipvs register/unregister from subsys to device > that solved plenty of my issues > (Thanks Eric) > > I'll will post a Patch later on regarding this. > > > > > I assume cleanup really happens in this order: > > > > ip_vs_cleanup(): > > nf_unregister_hooks() > > This will not happens in a namespace since nf_unregister_hooks() is not per netns. > We might need a flag but I don't think so, further test will show.... > > > ... > > ip_vs_conn_cleanup() > > ... > > ip_vs_control_cleanup() > > > Regards Hans ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8 2011-04-18 10:43 ` Hans Schillstrom @ 2011-04-18 21:12 ` Julian Anastasov 2011-04-18 21:48 ` Hans Schillstrom 0 siblings, 1 reply; 8+ messages in thread From: Julian Anastasov @ 2011-04-18 21:12 UTC (permalink / raw) To: Hans Schillstrom; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman Hello, On Mon, 18 Apr 2011, Hans Schillstrom wrote: > Actually I forgot to tell there is a need for a > ip_vs_service_cleanup() due to above. > Do you see any drawbacks with it ? May be ip_vs_service_cleanup() should call only ip_vs_flush(), under __ip_vs_mutex. > /* > * Delete service by {netns} in the service table. > */ > static void ip_vs_service_cleanup(struct net *net) > { > unsigned hash; > struct ip_vs_service *svc, *tmp; > > EnterFunction(2); > /* Check for "full" addressed entries */ > for (hash = 0; hash<IP_VS_SVC_TAB_SIZE; hash++) { > write_lock_bh(&__ip_vs_svc_lock); > list_for_each_entry_safe(svc, tmp, &ip_vs_svc_table[hash], > s_list) { > if (net_eq(svc->net, net)) { > ip_vs_svc_unhash(svc); > __ip_vs_del_service(svc); > } > } > list_for_each_entry_safe(svc, tmp, &ip_vs_svc_fwm_table[hash], > f_list) { > if (net_eq(svc->net, net)) { > ip_vs_svc_unhash(svc); > __ip_vs_del_service(svc); > } > } > write_unlock_bh(&__ip_vs_svc_lock); > } > LeaveFunction(2); > } > > Called just after the __ip_vs_control_cleanup_sysctl() Hm, no. ip_vs_service_cleanup() should be called by ip_vs_cleanup() before or after nf_unregister_hooks(). The rule is that ip_vs_flush() should be called before ip_vs_conn_flush() because after ip_vs_flush() no more connections can be created and even if hooks are still registered the packets can not create conns in the netns. Then ip_vs_conn_flush() will remove all existing connections and ip_vs_control_cleanup() can remove all real servers with ip_vs_trash_cleanup(). I mean, per-netns calls. Also, may be all code that was called in old kernels by ip_vs_cleanup() should be now called by __ip_vs_cleanup(net), i.e. we can preserve the needed order of all functions but now also per-netns. For example, for ip_vs_ctl.c ip_vs_control_init() can remain as global but it should not register ipvs_control_ops. Then we can rename __ip_vs_control_init to ip_vs_control_init_net() and to call it from __ip_vs_init(). I.e. all such files will have global function and also _init_net and _cleanup_net. Now there are many register_pernet_subsys() calls and I'm not sure we preserve the needed order for cleanup. Are the ->exit methods called in reverse order? I don't see it in ops_exit_list() and we can not rely on such registration order. I think, ip_vs_init() should call global functions as now but __ip_vs_init() and __ip_vs_cleanup() should call the _net methods in right order. Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8 2011-04-18 21:12 ` Julian Anastasov @ 2011-04-18 21:48 ` Hans Schillstrom 2011-04-18 22:23 ` Julian Anastasov 0 siblings, 1 reply; 8+ messages in thread From: Hans Schillstrom @ 2011-04-18 21:48 UTC (permalink / raw) To: Julian Anastasov; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman On Monday, April 18, 2011 23:12:27 Julian Anastasov wrote: > > Hello, > > On Mon, 18 Apr 2011, Hans Schillstrom wrote: > > > Actually I forgot to tell there is a need for a > > ip_vs_service_cleanup() due to above. > > Do you see any drawbacks with it ? > > May be ip_vs_service_cleanup() should call only > ip_vs_flush(), under __ip_vs_mutex. Hmm, I'm not sure if the IP_VS_WAIT_WHILE() in ip_vs_flush is a good idea in this case... That was why I wrote ip_vs_service_cleanup() > > > /* > > * Delete service by {netns} in the service table. > > */ > > static void ip_vs_service_cleanup(struct net *net) > > { > > unsigned hash; > > struct ip_vs_service *svc, *tmp; > > > > EnterFunction(2); > > /* Check for "full" addressed entries */ > > for (hash = 0; hash<IP_VS_SVC_TAB_SIZE; hash++) { > > write_lock_bh(&__ip_vs_svc_lock); > > list_for_each_entry_safe(svc, tmp, &ip_vs_svc_table[hash], > > s_list) { > > if (net_eq(svc->net, net)) { > > ip_vs_svc_unhash(svc); > > __ip_vs_del_service(svc); > > } > > } > > list_for_each_entry_safe(svc, tmp, &ip_vs_svc_fwm_table[hash], > > f_list) { > > if (net_eq(svc->net, net)) { > > ip_vs_svc_unhash(svc); > > __ip_vs_del_service(svc); > > } > > } > > write_unlock_bh(&__ip_vs_svc_lock); > > } > > LeaveFunction(2); > > } > > > > Called just after the __ip_vs_control_cleanup_sysctl() > > Hm, no. ip_vs_service_cleanup() should be called > by ip_vs_cleanup() before or after nf_unregister_hooks(). > The rule is that ip_vs_flush() should be called before > ip_vs_conn_flush() because after ip_vs_flush() no more > connections can be created and even if hooks are still > registered the packets can not create conns in the netns. Then > ip_vs_conn_flush() will remove all existing connections and > ip_vs_control_cleanup() can remove all real servers with > ip_vs_trash_cleanup(). I mean, per-netns calls. > OK I will try to do that, both with and without throttle in ip_vs_in() > Also, may be all code that was called in old > kernels by ip_vs_cleanup() should be now called by > __ip_vs_cleanup(net), i.e. we can preserve the needed order > of all functions but now also per-netns. For example, for > ip_vs_ctl.c ip_vs_control_init() can remain as global but it > should not register ipvs_control_ops. Then we > can rename __ip_vs_control_init to ip_vs_control_init_net() > and to call it from __ip_vs_init(). I.e. all such files > will have global function and also _init_net and > _cleanup_net. Now there are many register_pernet_subsys() > calls and I'm not sure we preserve the needed order for > cleanup. Are the ->exit methods called in reverse order? Yes > I don't see it in ops_exit_list() and we can not rely > on such registration order. I think, ip_vs_init() should > call global functions as now but __ip_vs_init() and > __ip_vs_cleanup() should call the _net methods in right > order. Exactly, I have already done that in my next patch, and some other small changes :-) For the ip_vs.ko there is only one register/unregister now, the schedulers still have their own. Hopefully the patch is ready to morrow Regards Hans ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8 2011-04-18 21:48 ` Hans Schillstrom @ 2011-04-18 22:23 ` Julian Anastasov 0 siblings, 0 replies; 8+ messages in thread From: Julian Anastasov @ 2011-04-18 22:23 UTC (permalink / raw) To: Hans Schillstrom; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman Hello, On Mon, 18 Apr 2011, Hans Schillstrom wrote: > On Monday, April 18, 2011 23:12:27 Julian Anastasov wrote: > > > > Hello, > > > > On Mon, 18 Apr 2011, Hans Schillstrom wrote: > > > > > Actually I forgot to tell there is a need for a > > > ip_vs_service_cleanup() due to above. > > > Do you see any drawbacks with it ? > > > > May be ip_vs_service_cleanup() should call only > > ip_vs_flush(), under __ip_vs_mutex. > > Hmm, > I'm not sure if the IP_VS_WAIT_WHILE() in ip_vs_flush is a good idea in this case... > That was why I wrote ip_vs_service_cleanup() IP_VS_WAIT_WHILE should be called because some schedulers do not use locks for ->schedule, eg. WLC. They rely on svc->usecnt reference to hold the virtual service. Nothing changes now with netns. It is currently the only way to modify or delete virtual service or scheduler. Of course, __ip_vs_svc_lock is now global but we do not have a choice. > > _cleanup_net. Now there are many register_pernet_subsys() > > calls and I'm not sure we preserve the needed order for > > cleanup. Are the ->exit methods called in reverse order? > > Yes You mean, only in the planned patch, yes? > > I don't see it in ops_exit_list() and we can not rely > > on such registration order. I think, ip_vs_init() should > > call global functions as now but __ip_vs_init() and > > __ip_vs_cleanup() should call the _net methods in right > > order. > > Exactly, > I have already done that in my next patch, and some other small changes :-) > For the ip_vs.ko there is only one register/unregister now, the schedulers still have their own. > Hopefully the patch is ready to morrow OK, very good. Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-04-18 22:23 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-04-15 7:01 unregister_netdevice: waiting for lo to become free. Usage count = 8 Hans Schillstrom 2011-04-15 7:27 ` Eric W. Biederman 2011-04-15 20:11 ` Julian Anastasov 2011-04-18 6:10 ` Hans Schillstrom 2011-04-18 10:43 ` Hans Schillstrom 2011-04-18 21:12 ` Julian Anastasov 2011-04-18 21:48 ` Hans Schillstrom 2011-04-18 22:23 ` Julian Anastasov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox