netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: David Miller <davem@davemloft.net>
Cc: tgraf@suug.ch, josh@joshtriplett.org,
	alexei.starovoitov@gmail.com, herbert@gondor.apana.org.au,
	kaber@trash.net, ying.xue@windriver.com, netdev@vger.kernel.org,
	netfilter-devel@vger.kernel.org
Subject: Re: Ottawa and slow hash-table resize
Date: Mon, 23 Feb 2015 15:06:19 -0800	[thread overview]
Message-ID: <20150223230619.GD15405@linux.vnet.ibm.com> (raw)
In-Reply-To: <20150223.173252.397503088215638994.davem@davemloft.net>

On Mon, Feb 23, 2015 at 05:32:52PM -0500, David Miller wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Date: Mon, 23 Feb 2015 13:52:49 -0800
> 
> > On Mon, Feb 23, 2015 at 09:03:58PM +0000, Thomas Graf wrote:
> >> On 02/23/15 at 11:12am, josh@joshtriplett.org wrote:
> >> > In theory, resizes should only take the locks for the buckets they're
> >> > currently unzipping, and adds should take those same locks.  Neither one
> >> > should take a whole-table lock, other than resize excluding concurrent
> >> > resizes.  Is that still insufficient?
> >> 
> >> Correct, this is what happens. The problem is basically that
> >> if we insert from atomic context we cannot slow down inserts
> >> and the table may not grow quickly enough.
> >> 
> >> > Yeah, the add/remove statistics used for tracking would need some
> >> > special handling to avoid being a table-wide bottleneck.
> >> 
> >> Daniel is working on a patch to do per-cpu element counting
> >> with a batched update cycle.
> > 
> > One approach is simply to count only when a resize operation is in
> > flight.  Another is to keep a per-bucket count, which can be summed
> > at the beginning of the next resize operation.
> 
> I think we should think exactly about what we should do when someone
> loops non-stop adding 1 million entries to the hash table and the
> initial table size is very small.
> 
> This is a common use case for at least one of the current rhashtable
> users (nft_hash).  When you load an nftables rule with a large set
> of IP addresses attached, this is what happens.
> 
> Yes I understand that nftables could give a hint and start with a
> larger hash size from the start when it knows this is going to happen,
> but I still believe that we should behave reasonably when starting
> from a small table.
> 
> I'd say that with the way things work right now, in this situation it
> actually hurts to allow asynchronous inserts during a resize.  Because
> we end up with extremely long hash table chains, and thus make the
> resize work and the lookups both take an excruciatingly long amount of
> time to complete.
> 
> I just did a quick scan of all code paths that do inserts into an
> rhashtable, and it seems like all of them can easily block.  So why
> don't we do that?  Make inserts sleep on an rhashtable expansion
> waitq.
> 
> There could even be a counter of pending inserts, so the expander can
> decide to expand further before waking the inserting threads up.

Should be reasonably simple, and certainly seems worth a try!

							Thanx, Paul

  reply	other threads:[~2015-02-23 23:06 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-23 18:49 Ottawa and slow hash-table resize Paul E. McKenney
2015-02-23 19:12 ` josh
2015-02-23 21:03   ` Thomas Graf
2015-02-23 21:52     ` Paul E. McKenney
2015-02-23 22:32       ` David Miller
2015-02-23 23:06         ` Paul E. McKenney [this message]
2015-02-24  8:37           ` Thomas Graf
2015-02-24 10:39             ` Patrick McHardy
2015-02-24 10:46               ` David Laight
2015-02-24 10:48                 ` Patrick McHardy
2015-02-24 17:09               ` David Miller
2015-02-24 17:50                 ` Thomas Graf
2015-02-24 18:26                   ` David Miller
2015-02-24 18:45                     ` josh
2015-02-24 22:34                       ` Thomas Graf
2015-02-25  8:56                         ` Herbert Xu
2015-02-25 17:38                           ` Thomas Graf
2015-02-24 18:33                   ` josh
2015-02-25  8:55                 ` Herbert Xu
2015-02-25 17:38                   ` Thomas Graf
2015-02-23 21:00 ` Thomas Graf
2015-02-23 22:35   ` Paul E. McKenney
2015-02-24  8:59 ` Thomas Graf
2015-02-24  9:38   ` Daniel Borkmann
2015-02-24 10:42     ` Patrick McHardy
2015-02-24 16:14       ` Josh Hunt
2015-02-24 16:25         ` Patrick McHardy
2015-02-24 16:57           ` David Miller
  -- strict thread matches above, loose matches on Subject: below --
2015-02-23 22:17 Alexei Starovoitov
2015-02-23 22:34 ` David Miller
2015-02-23 22:37 ` Paul E. McKenney
2015-02-23 23:07 Alexei Starovoitov
2015-02-23 23:15 ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150223230619.GD15405@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=davem@davemloft.net \
    --cc=herbert@gondor.apana.org.au \
    --cc=josh@joshtriplett.org \
    --cc=kaber@trash.net \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=tgraf@suug.ch \
    --cc=ying.xue@windriver.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).