* unregister_netdevice: waiting for lo to become free. Usage count = 8
@ 2011-04-15 7:01 Hans Schillstrom
2011-04-15 7:27 ` Eric W. Biederman
2011-04-15 20:11 ` Julian Anastasov
0 siblings, 2 replies; 8+ messages in thread
From: Hans Schillstrom @ 2011-04-15 7:01 UTC (permalink / raw)
To: Julian Anastasov; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman
Hello Julian
I'm trying to fix the cleanup process when a namespace get "killed",
which is a new feature for ipvs. However an old problem appears again
When there has been traffic trough ipvs where the destination is unreachable
the usage count on loopback dev increases one for every packet....
I guess thats because of this rule :
# ip route list table all
...
unreachable default dev lo table 0 proto kernel metric 4294967295 error -101 hoplimit 25
...
I made a test just forwarding packets through the same container (ipvs loaded)
to an unreachable destination and that test had a balanced count i.e. it was possible to reboot the container.
Do you have an idea why this happens in the ipvs case ?
Regards
Hans Schillstrom <hans@schillstrom.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8
2011-04-15 7:01 unregister_netdevice: waiting for lo to become free. Usage count = 8 Hans Schillstrom
@ 2011-04-15 7:27 ` Eric W. Biederman
2011-04-15 20:11 ` Julian Anastasov
1 sibling, 0 replies; 8+ messages in thread
From: Eric W. Biederman @ 2011-04-15 7:27 UTC (permalink / raw)
To: Hans Schillstrom; +Cc: Julian Anastasov, Simon Horman, netdev, lvs-devel
Hans Schillstrom <hans@schillstrom.com> writes:
> Hello Julian
>
> I'm trying to fix the cleanup process when a namespace get "killed",
> which is a new feature for ipvs. However an old problem appears again
>
> When there has been traffic trough ipvs where the destination is unreachable
> the usage count on loopback dev increases one for every packet....
> I guess thats because of this rule :
>
> # ip route list table all
> ...
> unreachable default dev lo table 0 proto kernel metric 4294967295 error -101 hoplimit 25
> ...
>
> I made a test just forwarding packets through the same container (ipvs loaded)
> to an unreachable destination and that test had a balanced count i.e. it was possible to reboot the container.
>
> Do you have an idea why this happens in the ipvs case ?
Hans. I do know that most outstanding references when you clean up a
container get moved to the loopback device. So it may not originally
be the loopback device itself where the reference counting is wrong.
Eric
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8
2011-04-15 7:01 unregister_netdevice: waiting for lo to become free. Usage count = 8 Hans Schillstrom
2011-04-15 7:27 ` Eric W. Biederman
@ 2011-04-15 20:11 ` Julian Anastasov
2011-04-18 6:10 ` Hans Schillstrom
1 sibling, 1 reply; 8+ messages in thread
From: Julian Anastasov @ 2011-04-15 20:11 UTC (permalink / raw)
To: Hans Schillstrom; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman
Hello,
On Fri, 15 Apr 2011, Hans Schillstrom wrote:
> Hello Julian
>
> I'm trying to fix the cleanup process when a namespace get "killed",
> which is a new feature for ipvs. However an old problem appears again
>
> When there has been traffic trough ipvs where the destination is unreachable
> the usage count on loopback dev increases one for every packet....
What is the kernel version?
> I guess thats because of this rule :
>
> # ip route list table all
> ...
> unreachable default dev lo table 0 proto kernel metric 4294967295 error -101 hoplimit 25
> ...
>
> I made a test just forwarding packets through the same container (ipvs loaded)
> to an unreachable destination and that test had a balanced count i.e. it was possible to reboot the container.
Can you explain, what do you mean with unreachable
destination? Are you adding some rejecting route?
> Do you have an idea why this happens in the ipvs case ?
Do you see with debug level 3 the "Removing destination"
messages. Only real servers can hold dest->dst_cache reference
for dev which can be a problem because the real servers are not
deleted immediately - on traffic they are moved to trash
list. But ip_vs_trash_cleanup() should remove any left
structures. You should check in debug that all servers are
deleted. If all real server structures are freed but
problem remains we should look more deeply in the
dest->dst_cache usage. DR or NAT is used?
I assume cleanup really happens in this order:
ip_vs_cleanup():
nf_unregister_hooks()
...
ip_vs_conn_cleanup()
...
ip_vs_control_cleanup()
Regards
--
Julian Anastasov <ja@ssi.bg>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8
2011-04-15 20:11 ` Julian Anastasov
@ 2011-04-18 6:10 ` Hans Schillstrom
2011-04-18 10:43 ` Hans Schillstrom
0 siblings, 1 reply; 8+ messages in thread
From: Hans Schillstrom @ 2011-04-18 6:10 UTC (permalink / raw)
To: Julian Anastasov; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman
On Friday, April 15, 2011 22:11:32 Julian Anastasov wrote:
>
> Hello,
>
> On Fri, 15 Apr 2011, Hans Schillstrom wrote:
>
> > Hello Julian
> >
> > I'm trying to fix the cleanup process when a namespace get "killed",
> > which is a new feature for ipvs. However an old problem appears again
> >
> > When there has been traffic trough ipvs where the destination is unreachable
> > the usage count on loopback dev increases one for every packet....
>
> What is the kernel version?
net-next-2.6 i.e. 2.6.39-rc2
>
> > I guess thats because of this rule :
> >
> > # ip route list table all
> > ...
> > unreachable default dev lo table 0 proto kernel metric 4294967295 error -101 hoplimit 25
> > ...
> >
> > I made a test just forwarding packets through the same container (ipvs loaded)
> > to an unreachable destination and that test had a balanced count i.e. it was possible to reboot the container.
>
> Can you explain, what do you mean with unreachable
> destination? Are you adding some rejecting route?
This comment from Eric, do explain what happens:
"Hans.
I do know that most outstanding references when you clean up a
container get moved to the loopback device. So it may not originally
be the loopback device itself where the reference counting is wrong.
Eric"
>
> > Do you have an idea why this happens in the ipvs case ?
>
> Do you see with debug level 3 the "Removing destination"
> messages. Only real servers can hold dest->dst_cache reference
> for dev which can be a problem because the real servers are not
> deleted immediately - on traffic they are moved to trash
> list. But ip_vs_trash_cleanup() should remove any left
> structures. You should check in debug that all servers are
> deleted. If all real server structures are freed but
> problem remains we should look more deeply in the
> dest->dst_cache usage. DR or NAT is used?
I have got some wise words from Eric,
i.e. moved all ipvs register/unregister from subsys to device
that solved plenty of my issues
(Thanks Eric)
I'll will post a Patch later on regarding this.
>
> I assume cleanup really happens in this order:
>
> ip_vs_cleanup():
> nf_unregister_hooks()
This will not happens in a namespace since nf_unregister_hooks() is not per netns.
We might need a flag but I don't think so, further test will show....
> ...
> ip_vs_conn_cleanup()
> ...
> ip_vs_control_cleanup()
>
Regards
Hans
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8
2011-04-18 6:10 ` Hans Schillstrom
@ 2011-04-18 10:43 ` Hans Schillstrom
2011-04-18 21:12 ` Julian Anastasov
0 siblings, 1 reply; 8+ messages in thread
From: Hans Schillstrom @ 2011-04-18 10:43 UTC (permalink / raw)
To: Julian Anastasov; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman
Hello
On Monday, April 18, 2011 08:10:26 Hans Schillstrom wrote:
> On Friday, April 15, 2011 22:11:32 Julian Anastasov wrote:
> >
> > Hello,
> >
> > On Fri, 15 Apr 2011, Hans Schillstrom wrote:
> >
> > > Hello Julian
> > >
> > > I'm trying to fix the cleanup process when a namespace get "killed",
> > > which is a new feature for ipvs. However an old problem appears again
> > >
> > > When there has been traffic trough ipvs where the destination is unreachable
> > > the usage count on loopback dev increases one for every packet....
[snip]
> >
> > > Do you have an idea why this happens in the ipvs case ?
> >
> > Do you see with debug level 3 the "Removing destination"
> > messages. Only real servers can hold dest->dst_cache reference
> > for dev which can be a problem because the real servers are not
> > deleted immediately - on traffic they are moved to trash
> > list.
Actually I forgot to tell there is a need for a
ip_vs_service_cleanup() due to above.
Do you see any drawbacks with it ?
/*
* Delete service by {netns} in the service table.
*/
static void ip_vs_service_cleanup(struct net *net)
{
unsigned hash;
struct ip_vs_service *svc, *tmp;
EnterFunction(2);
/* Check for "full" addressed entries */
for (hash = 0; hash<IP_VS_SVC_TAB_SIZE; hash++) {
write_lock_bh(&__ip_vs_svc_lock);
list_for_each_entry_safe(svc, tmp, &ip_vs_svc_table[hash],
s_list) {
if (net_eq(svc->net, net)) {
ip_vs_svc_unhash(svc);
__ip_vs_del_service(svc);
}
}
list_for_each_entry_safe(svc, tmp, &ip_vs_svc_fwm_table[hash],
f_list) {
if (net_eq(svc->net, net)) {
ip_vs_svc_unhash(svc);
__ip_vs_del_service(svc);
}
}
write_unlock_bh(&__ip_vs_svc_lock);
}
LeaveFunction(2);
}
Called just after the __ip_vs_control_cleanup_sysctl()
static void __net_exit __ip_vs_control_cleanup(struct net *net)
{
struct netns_ipvs *ipvs = net_ipvs(net);
ip_vs_trash_cleanup(net);
ip_vs_stop_estimator(net, &ipvs->tot_stats);
__ip_vs_control_cleanup_sysctl(net);
ip_vs_service_cleanup(net);
proc_net_remove(net, "ip_vs_stats_percpu");
proc_net_remove(net, "ip_vs_stats");
proc_net_remove(net, "ip_vs");
free_percpu(ipvs->tot_stats.cpustats);
}
> > But ip_vs_trash_cleanup() should remove any left
> > structures. You should check in debug that all servers are
> > deleted. If all real server structures are freed but
> > problem remains we should look more deeply in the
> > dest->dst_cache usage. DR or NAT is used?
>
> I have got some wise words from Eric,
> i.e. moved all ipvs register/unregister from subsys to device
> that solved plenty of my issues
> (Thanks Eric)
>
> I'll will post a Patch later on regarding this.
>
> >
> > I assume cleanup really happens in this order:
> >
> > ip_vs_cleanup():
> > nf_unregister_hooks()
>
> This will not happens in a namespace since nf_unregister_hooks() is not per netns.
> We might need a flag but I don't think so, further test will show....
>
> > ...
> > ip_vs_conn_cleanup()
> > ...
> > ip_vs_control_cleanup()
> >
>
Regards
Hans
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8
2011-04-18 10:43 ` Hans Schillstrom
@ 2011-04-18 21:12 ` Julian Anastasov
2011-04-18 21:48 ` Hans Schillstrom
0 siblings, 1 reply; 8+ messages in thread
From: Julian Anastasov @ 2011-04-18 21:12 UTC (permalink / raw)
To: Hans Schillstrom; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman
Hello,
On Mon, 18 Apr 2011, Hans Schillstrom wrote:
> Actually I forgot to tell there is a need for a
> ip_vs_service_cleanup() due to above.
> Do you see any drawbacks with it ?
May be ip_vs_service_cleanup() should call only
ip_vs_flush(), under __ip_vs_mutex.
> /*
> * Delete service by {netns} in the service table.
> */
> static void ip_vs_service_cleanup(struct net *net)
> {
> unsigned hash;
> struct ip_vs_service *svc, *tmp;
>
> EnterFunction(2);
> /* Check for "full" addressed entries */
> for (hash = 0; hash<IP_VS_SVC_TAB_SIZE; hash++) {
> write_lock_bh(&__ip_vs_svc_lock);
> list_for_each_entry_safe(svc, tmp, &ip_vs_svc_table[hash],
> s_list) {
> if (net_eq(svc->net, net)) {
> ip_vs_svc_unhash(svc);
> __ip_vs_del_service(svc);
> }
> }
> list_for_each_entry_safe(svc, tmp, &ip_vs_svc_fwm_table[hash],
> f_list) {
> if (net_eq(svc->net, net)) {
> ip_vs_svc_unhash(svc);
> __ip_vs_del_service(svc);
> }
> }
> write_unlock_bh(&__ip_vs_svc_lock);
> }
> LeaveFunction(2);
> }
>
> Called just after the __ip_vs_control_cleanup_sysctl()
Hm, no. ip_vs_service_cleanup() should be called
by ip_vs_cleanup() before or after nf_unregister_hooks().
The rule is that ip_vs_flush() should be called before
ip_vs_conn_flush() because after ip_vs_flush() no more
connections can be created and even if hooks are still
registered the packets can not create conns in the netns. Then
ip_vs_conn_flush() will remove all existing connections and
ip_vs_control_cleanup() can remove all real servers with
ip_vs_trash_cleanup(). I mean, per-netns calls.
Also, may be all code that was called in old
kernels by ip_vs_cleanup() should be now called by
__ip_vs_cleanup(net), i.e. we can preserve the needed order
of all functions but now also per-netns. For example, for
ip_vs_ctl.c ip_vs_control_init() can remain as global but it
should not register ipvs_control_ops. Then we
can rename __ip_vs_control_init to ip_vs_control_init_net()
and to call it from __ip_vs_init(). I.e. all such files
will have global function and also _init_net and
_cleanup_net. Now there are many register_pernet_subsys()
calls and I'm not sure we preserve the needed order for
cleanup. Are the ->exit methods called in reverse order?
I don't see it in ops_exit_list() and we can not rely
on such registration order. I think, ip_vs_init() should
call global functions as now but __ip_vs_init() and
__ip_vs_cleanup() should call the _net methods in right
order.
Regards
--
Julian Anastasov <ja@ssi.bg>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8
2011-04-18 21:12 ` Julian Anastasov
@ 2011-04-18 21:48 ` Hans Schillstrom
2011-04-18 22:23 ` Julian Anastasov
0 siblings, 1 reply; 8+ messages in thread
From: Hans Schillstrom @ 2011-04-18 21:48 UTC (permalink / raw)
To: Julian Anastasov; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman
On Monday, April 18, 2011 23:12:27 Julian Anastasov wrote:
>
> Hello,
>
> On Mon, 18 Apr 2011, Hans Schillstrom wrote:
>
> > Actually I forgot to tell there is a need for a
> > ip_vs_service_cleanup() due to above.
> > Do you see any drawbacks with it ?
>
> May be ip_vs_service_cleanup() should call only
> ip_vs_flush(), under __ip_vs_mutex.
Hmm,
I'm not sure if the IP_VS_WAIT_WHILE() in ip_vs_flush is a good idea in this case...
That was why I wrote ip_vs_service_cleanup()
>
> > /*
> > * Delete service by {netns} in the service table.
> > */
> > static void ip_vs_service_cleanup(struct net *net)
> > {
> > unsigned hash;
> > struct ip_vs_service *svc, *tmp;
> >
> > EnterFunction(2);
> > /* Check for "full" addressed entries */
> > for (hash = 0; hash<IP_VS_SVC_TAB_SIZE; hash++) {
> > write_lock_bh(&__ip_vs_svc_lock);
> > list_for_each_entry_safe(svc, tmp, &ip_vs_svc_table[hash],
> > s_list) {
> > if (net_eq(svc->net, net)) {
> > ip_vs_svc_unhash(svc);
> > __ip_vs_del_service(svc);
> > }
> > }
> > list_for_each_entry_safe(svc, tmp, &ip_vs_svc_fwm_table[hash],
> > f_list) {
> > if (net_eq(svc->net, net)) {
> > ip_vs_svc_unhash(svc);
> > __ip_vs_del_service(svc);
> > }
> > }
> > write_unlock_bh(&__ip_vs_svc_lock);
> > }
> > LeaveFunction(2);
> > }
> >
> > Called just after the __ip_vs_control_cleanup_sysctl()
>
> Hm, no. ip_vs_service_cleanup() should be called
> by ip_vs_cleanup() before or after nf_unregister_hooks().
> The rule is that ip_vs_flush() should be called before
> ip_vs_conn_flush() because after ip_vs_flush() no more
> connections can be created and even if hooks are still
> registered the packets can not create conns in the netns. Then
> ip_vs_conn_flush() will remove all existing connections and
> ip_vs_control_cleanup() can remove all real servers with
> ip_vs_trash_cleanup(). I mean, per-netns calls.
>
OK I will try to do that, both with and without throttle in ip_vs_in()
> Also, may be all code that was called in old
> kernels by ip_vs_cleanup() should be now called by
> __ip_vs_cleanup(net), i.e. we can preserve the needed order
> of all functions but now also per-netns. For example, for
> ip_vs_ctl.c ip_vs_control_init() can remain as global but it
> should not register ipvs_control_ops. Then we
> can rename __ip_vs_control_init to ip_vs_control_init_net()
> and to call it from __ip_vs_init(). I.e. all such files
> will have global function and also _init_net and
> _cleanup_net. Now there are many register_pernet_subsys()
> calls and I'm not sure we preserve the needed order for
> cleanup. Are the ->exit methods called in reverse order?
Yes
> I don't see it in ops_exit_list() and we can not rely
> on such registration order. I think, ip_vs_init() should
> call global functions as now but __ip_vs_init() and
> __ip_vs_cleanup() should call the _net methods in right
> order.
Exactly,
I have already done that in my next patch, and some other small changes :-)
For the ip_vs.ko there is only one register/unregister now, the schedulers still have their own.
Hopefully the patch is ready to morrow
Regards
Hans
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8
2011-04-18 21:48 ` Hans Schillstrom
@ 2011-04-18 22:23 ` Julian Anastasov
0 siblings, 0 replies; 8+ messages in thread
From: Julian Anastasov @ 2011-04-18 22:23 UTC (permalink / raw)
To: Hans Schillstrom; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman
Hello,
On Mon, 18 Apr 2011, Hans Schillstrom wrote:
> On Monday, April 18, 2011 23:12:27 Julian Anastasov wrote:
> >
> > Hello,
> >
> > On Mon, 18 Apr 2011, Hans Schillstrom wrote:
> >
> > > Actually I forgot to tell there is a need for a
> > > ip_vs_service_cleanup() due to above.
> > > Do you see any drawbacks with it ?
> >
> > May be ip_vs_service_cleanup() should call only
> > ip_vs_flush(), under __ip_vs_mutex.
>
> Hmm,
> I'm not sure if the IP_VS_WAIT_WHILE() in ip_vs_flush is a good idea in this case...
> That was why I wrote ip_vs_service_cleanup()
IP_VS_WAIT_WHILE should be called because some
schedulers do not use locks for ->schedule, eg. WLC.
They rely on svc->usecnt reference to hold the virtual
service. Nothing changes now with netns. It is currently the
only way to modify or delete virtual service or scheduler.
Of course, __ip_vs_svc_lock is now global but we do not
have a choice.
> > _cleanup_net. Now there are many register_pernet_subsys()
> > calls and I'm not sure we preserve the needed order for
> > cleanup. Are the ->exit methods called in reverse order?
>
> Yes
You mean, only in the planned patch, yes?
> > I don't see it in ops_exit_list() and we can not rely
> > on such registration order. I think, ip_vs_init() should
> > call global functions as now but __ip_vs_init() and
> > __ip_vs_cleanup() should call the _net methods in right
> > order.
>
> Exactly,
> I have already done that in my next patch, and some other small changes :-)
> For the ip_vs.ko there is only one register/unregister now, the schedulers still have their own.
> Hopefully the patch is ready to morrow
OK, very good.
Regards
--
Julian Anastasov <ja@ssi.bg>
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-04-18 22:23 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-04-15 7:01 unregister_netdevice: waiting for lo to become free. Usage count = 8 Hans Schillstrom
2011-04-15 7:27 ` Eric W. Biederman
2011-04-15 20:11 ` Julian Anastasov
2011-04-18 6:10 ` Hans Schillstrom
2011-04-18 10:43 ` Hans Schillstrom
2011-04-18 21:12 ` Julian Anastasov
2011-04-18 21:48 ` Hans Schillstrom
2011-04-18 22:23 ` Julian Anastasov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox