netdev.vger.kernel.org archive mirror
From: Benjamin Thery <benjamin.thery@bull.net>
To: David Miller <davem@davemloft.net>
Cc: dada1@cosmosbay.com, netdev@vger.kernel.org
Subject: Re: [PATCH 1/1] net: fix scheduling of dst_gc_task by __dst_free
Date: Mon, 15 Sep 2008 15:26:01 +0200
Message-ID: <48CE6269.4050006@bull.net>
In-Reply-To: <20080912.161452.04603170.davem@davemloft.net>

David Miller wrote:
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Fri, 12 Sep 2008 16:46:52 +0200
> 
>> Benjamin Thery wrote:
>>> The dst garbage collector dst_gc_task() may not be scheduled as we
>>> expect it to be in __dst_free().
>>> Indeed, when the dst_gc_timer was replaced by the delayed_work
>>> dst_gc_work, the mod_timer() call used to schedule the garbage
>>> collector at an earlier date was replaced by a schedule_delayed_work()
>>> (see commit 86bba269d08f0c545ae76c90b56727f65d62d57f).
>>> But the behaviour of mod_timer() and schedule_delayed_work() differs
>>> in the way they handle the delay. mod_timer() stops the timer and
>>> re-arms it with the new delay, whereas schedule_delayed_work() only
>>> checks whether the work is already queued in the workqueue (and
>>> queues it, with the delay, if it is not). It does NOT take the new
>>> delay into account (even if the new delay is earlier in time).
>>> schedule_delayed_work() returns 0 if it didn't queue the work,
>>> but we don't check the return code in __dst_free().
>>> If I understand the code in __dst_free() correctly, we want dst_gc_task
>>> to be queued after DST_GC_INC jiffies if we pass the test (and not in
>>> some undetermined time in the future), so I think we should add a call
>>> to cancel_delayed_work() before schedule_delayed_work(). Patch below.
>>>
>> Well, you are right that time is undetermined (but < ~120 seconds), so your patch
>> makes sense.
>>
>> Acked-by: Eric Dumazet <dada1@cosmosbay.com>
> 
> I'll add this to net-next-2.6 for now.  Benjamin, do you know of any
> real cases where users are being tripped up by our not using the
> shorter scheduling of the workqueue?

I found this while tracking down a problem that sometimes occurs at
network namespace exit.

When a network namespace exits, its routes need to be freed as fast as
possible so that the unregistration of the net devices present in the
namespace (i.e. the loopback) can complete.

Sometimes the route garbage collection gets delayed (because of the
issue described here), so the refcount on the device has not been
decremented as expected when we reach netdev_wait_allrefs(), and we get
the infamous "unregister_netdevice: waiting for lo to become free."

This fix in __dst_free() fixes part of the problem.
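
To make the difference in semantics concrete, here is a minimal
user-space sketch. It is a toy model, not kernel code: the model_*
names and struct model_work are made up for illustration, and the
model only captures the "already queued, so the new delay is ignored"
behaviour described in the patch, plus the cancel-then-schedule
pattern the patch proposes.

```c
/* User-space toy model, NOT kernel code. All model_* names are
 * hypothetical; they only mimic the semantics discussed in this thread. */
#include <assert.h>
#include <stdbool.h>

struct model_work {
	bool pending;          /* is the work currently queued? */
	unsigned long expires; /* absolute time at which it would run */
};

/* Modelled on mod_timer(): always re-arm with the new expiry. */
void model_mod_timer(struct model_work *w, unsigned long expires)
{
	w->pending = true;
	w->expires = expires;
}

/* Modelled on schedule_delayed_work(): queue only if not already
 * pending. If the work is already queued, the new delay is ignored
 * and false is returned. */
bool model_schedule_delayed_work(struct model_work *w,
				 unsigned long now, unsigned long delay)
{
	if (w->pending)
		return false;
	w->pending = true;
	w->expires = now + delay;
	return true;
}

/* Modelled on cancel_delayed_work(): dequeue the work if it is pending. */
bool model_cancel_delayed_work(struct model_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* The pattern proposed in the patch: cancel first, then schedule,
 * so that the new (shorter) delay actually takes effect. */
void model_reschedule(struct model_work *w,
		      unsigned long now, unsigned long delay)
{
	model_cancel_delayed_work(w);
	model_schedule_delayed_work(w, now, delay);
}

/* Walk through the scenario from the patch description. */
void model_self_check(void)
{
	struct model_work w = { false, 0 };

	/* First call queues the work far in the future. */
	assert(model_schedule_delayed_work(&w, 0, 1000));
	/* A later call with a shorter delay is silently ignored. */
	assert(!model_schedule_delayed_work(&w, 10, 5));
	assert(w.expires == 1000);
	/* Cancel + schedule brings the expiry forward as intended. */
	model_reschedule(&w, 10, 5);
	assert(w.expires == 15);
}
```

In this model, the second model_schedule_delayed_work() call is the
situation __dst_free() runs into: the return value says nothing was
queued, and the earlier deadline is lost unless the work is cancelled
first.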

Benjamin

> 
>> Then we should ask why we reset the timer back to its minimum value
>> every time we call __dst_free(). On machines with many dormant TCP
>> sessions, dst_garbage.list can contain a huge number of non-freeable
>> entries :(
>>
>> Maybe we should count the entries and change the timer only if really needed.
> 
> Yet another area of black magic in our routing cache :)
> 
>>> (Sorry, I think I've been a bit verbose to expose this simple issue :)
> 
> No, do not apologize, I wish every commit message were this verbose.
> 
> 


-- 
B e n j a m i n   T h e r y  - BULL/DT/Open Software R&D

    http://www.bull.com


Thread overview: 4+ messages
     [not found] <20080912123113.770453085@theryb.frec.bull.fr>
2008-09-12 12:31 ` [PATCH 1/1] net: fix scheduling of dst_gc_task by __dst_free Benjamin Thery
2008-09-12 14:46   ` Eric Dumazet
2008-09-12 23:14     ` David Miller
2008-09-15 13:26       ` Benjamin Thery [this message]
