From: Chris Mason <clm@fb.com>
To: <netdev@vger.kernel.org>
Subject: Re: [PATCH RFC] ipv6_fib limit spinlock hold times for /proc/net/ipv6_route
Date: Thu, 24 Apr 2014 10:41:11 -0400
Message-ID: <53592287.2050902@fb.com>
In-Reply-To: <20140424142030.GD1960@order.stressinduktion.org>

On 04/24/2014 10:20 AM, Hannes Frederic Sowa wrote:
> Hi!
>
> On Thu, Apr 24, 2014 at 09:59:24AM -0400, Chris Mason wrote:
>> The ipv6 code to dump routes in /proc/net/ipv6_route can hold
>> a read lock on the table for a very long time.  This ends up blocking
>> writers and triggering softlockups.
>>
>> This patch is a simple workaround to limit the number of entries
>> we'll walk while processing /proc/net/ipv6_route.  It intentionally
>> slows down proc file reading to make sure we don't lock out the
>> real ipv6 traffic.
>
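For readers without the patch in front of them, the batching idea described
above can be sketched in userspace roughly as follows. This is only an
illustration and not the RFC patch itself: struct walker, walk_batch(),
walk_all() and the two lock stand-ins are hypothetical names, the "table" is a
plain array, and the sketch ignores the table changing between batches, which
a real walker has to handle.

```c
#include <stddef.h>

/* Hypothetical cursor: remembers where the previous batch stopped. */
struct walker {
    size_t next;       /* index of the next entry to visit */
    size_t lock_holds; /* how many times the read lock was taken */
};

/* Stand-ins for taking and dropping the table read lock (assumption:
 * a real implementation would use the actual table lock here). */
static void table_read_lock(struct walker *w)   { w->lock_holds++; }
static void table_read_unlock(struct walker *w) { (void)w; }

/*
 * Visit at most `batch` entries under a single lock hold.
 * Returns the number of entries visited; 0 means the walk is finished.
 */
static size_t walk_batch(struct walker *w, const int *table, size_t n,
                         size_t batch, long *sum)
{
    size_t visited = 0;

    table_read_lock(w);
    while (w->next < n && visited < batch) {
        *sum += table[w->next++]; /* placeholder for formatting one route */
        visited++;
    }
    table_read_unlock(w);

    return visited;
}

/* Walk the whole table in bounded batches. */
static long walk_all(struct walker *w, const int *table, size_t n,
                     size_t batch)
{
    long sum = 0;

    while (walk_batch(w, table, n, batch, &sum) > 0)
        ; /* the lock is released between batches, so writers can get in */

    return sum;
}
```

The trade-off is the one named in the description: the reader takes the lock
more often and so runs slower, but no single hold is longer than `batch`
entries.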
> I guess most time is spent in formatting and printing the rt6_info details
> to the procfs file. Have you tried excluding !(rt6_info->rt6i_flags &
> RTF_CACHE) routes?

We do have a separate patch from Paul Saab that excludes the cached 
routes and it has a big impact (~10x fewer entries).  But the 
softlockups still flow.

I was going to discuss the cache exclusion on a separate thread, but the
short version is that I have no idea how many people we'd upset by
unconditionally leaving out the cached entries.

>
> Maybe this is a viable alternative. A patch could also check for
> RTF_DYNAMIC and RTF_MODIFIED so we would still show redirected and
> mtu-caching nodes.
>
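To make the suggested check concrete, here is a minimal userspace sketch of
such a filter. It is not taken from any posted patch:
route_visible_in_proc() is a hypothetical helper name, and the flag values are
copied locally from the uapi headers (linux/route.h and linux/ipv6_route.h) as
I remember them, so please check them against your tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: values mirror the uapi headers; defined locally so the
 * sketch builds outside the kernel tree. */
#define RTF_DYNAMIC  0x0010u      /* created dynamically (by redirect) */
#define RTF_MODIFIED 0x0020u      /* modified dynamically */
#define RTF_CACHE    0x01000000u  /* cache entry */

/*
 * Hypothetical helper: decide whether a route should be printed.
 * Non-cached routes are always shown. Cached routes are shown only
 * if they carry RTF_DYNAMIC or RTF_MODIFIED, so redirected and
 * mtu-caching entries stay visible.
 */
static bool route_visible_in_proc(uint32_t rt6i_flags)
{
    if (!(rt6i_flags & RTF_CACHE))
        return true;

    return (rt6i_flags & (RTF_DYNAMIC | RTF_MODIFIED)) != 0;
}
```

A dump loop would call this on each entry's rt6i_flags and skip the entry
when it returns false.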
>> This patch is also horrible, and doesn't actually fix the entire
>> problem.  We still have rcu_read_lock held the whole time we cat
>> /proc/net/ipv6_route.  On an unpatched machine, I've clocked the
>> time required to cat /proc/net/ipv6_route at 14 minutes.
>>
>> java cats this proc file on startup to search for local routes, and the
>> resulting contention on the table lock makes our boxes fall over.
>
> Urks, does plain openjdk do that or is this something in your application?
>

Seems to be built into the jdk, and not our app.

>>
>> So, I'm sending the partial fix to get discussion started.
>
> I am planning to submit patches which reduce the caching of DST_HOST
> entries in the ipv6 fib next month which will result in a much smaller
> fib to walk by then.

Great.

-chris
