Re: Ottawa and slow hash-table resize

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: David Miller <davem@davemloft.net>
Cc: tgraf@suug.ch, josh@joshtriplett.org,
	alexei.starovoitov@gmail.com, herbert@gondor.apana.org.au,
	kaber@trash.net, ying.xue@windriver.com, netdev@vger.kernel.org,
	netfilter-devel@vger.kernel.org
Subject: Re: Ottawa and slow hash-table resize
Date: Mon, 23 Feb 2015 15:06:19 -0800	[thread overview]
Message-ID: <20150223230619.GD15405@linux.vnet.ibm.com> (raw)
In-Reply-To: <20150223.173252.397503088215638994.davem@davemloft.net>

On Mon, Feb 23, 2015 at 05:32:52PM -0500, David Miller wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Date: Mon, 23 Feb 2015 13:52:49 -0800
> 
> > On Mon, Feb 23, 2015 at 09:03:58PM +0000, Thomas Graf wrote:
> >> On 02/23/15 at 11:12am, josh@joshtriplett.org wrote:
> >> > In theory, resizes should only take the locks for the buckets they're
> >> > currently unzipping, and adds should take those same locks.  Neither one
> >> > should take a whole-table lock, other than resize excluding concurrent
> >> > resizes.  Is that still insufficient?
> >> 
> >> Correct, this is what happens. The problem is basically that
> >> if we insert from atomic context we cannot slow down inserts
> >> and the table may not grow quickly enough.
> >> 
> >> > Yeah, the add/remove statistics used for tracking would need some
> >> > special handling to avoid being a table-wide bottleneck.
> >> 
> >> Daniel is working on a patch to do per-cpu element counting
> >> with a batched update cycle.
> > 
> > One approach is simply to count only when a resize operation is in
> > flight.  Another is to keep a per-bucket count, which can be summed
> > at the beginning of the next resize operation.
> 
> I think we should think exactly about what we should do when someone
> loops non-stop adding 1 million entries to the hash table and the
> initial table size is very small.
> 
> This is a common use case for at least one of the current rhashtable
> users (nft_hash).  When you load an nftables rule with a large set
> of IP addresses attached, this is what happens.
> 
> Yes I understand that nftables could give a hint and start with a
> larger hash size from the start when it knows this is going to happen,
> but I still believe that we should behave reasonably when starting
> from a small table.
> 
> I'd say that with the way things work right now, in this situation it
> actually hurts to allow asynchronous inserts during a resize.  Because
> we end up with extremely long hash table chains, and thus make the
> resize work and the lookups both take an excruciatingly long amount of
> time to complete.
> 
> I just did a quick scan of all code paths that do inserts into an
> rhashtable, and it seems like all of them can easily block.  So why
> don't we do that?  Make inserts sleep on an rhashtable expansion
> waitq.
> 
> There could even be a counter of pending inserts, so the expander can
> decide to expand further before waking the inserting threads up.

Should be reasonably simple, and certainly seems worth a try!

							Thanx, Paul

next prev parent reply	other threads:[~2015-02-23 23:06 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-23 18:49 Ottawa and slow hash-table resize Paul E. McKenney
2015-02-23 19:12 ` josh
2015-02-23 21:03   ` Thomas Graf
2015-02-23 21:52     ` Paul E. McKenney
2015-02-23 22:32       ` David Miller
2015-02-23 23:06         ` Paul E. McKenney [this message]
2015-02-24  8:37           ` Thomas Graf
2015-02-24 10:39             ` Patrick McHardy
2015-02-24 10:46               ` David Laight
2015-02-24 10:48                 ` Patrick McHardy
2015-02-24 17:09               ` David Miller
2015-02-24 17:50                 ` Thomas Graf
2015-02-24 18:26                   ` David Miller
2015-02-24 18:45                     ` josh
2015-02-24 22:34                       ` Thomas Graf
2015-02-25  8:56                         ` Herbert Xu
2015-02-25 17:38                           ` Thomas Graf
2015-02-24 18:33                   ` josh
2015-02-25  8:55                 ` Herbert Xu
2015-02-25 17:38                   ` Thomas Graf
2015-02-23 21:00 ` Thomas Graf
2015-02-23 22:35   ` Paul E. McKenney
2015-02-24  8:59 ` Thomas Graf
2015-02-24  9:38   ` Daniel Borkmann
2015-02-24 10:42     ` Patrick McHardy
2015-02-24 16:14       ` Josh Hunt
2015-02-24 16:25         ` Patrick McHardy
2015-02-24 16:57           ` David Miller
  -- strict thread matches above, loose matches on Subject: below --
2015-02-23 22:17 Alexei Starovoitov
2015-02-23 22:34 ` David Miller
2015-02-23 22:37 ` Paul E. McKenney
2015-02-23 23:07 Alexei Starovoitov
2015-02-23 23:15 ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150223230619.GD15405@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=davem@davemloft.net \
    --cc=herbert@gondor.apana.org.au \
    --cc=josh@joshtriplett.org \
    --cc=kaber@trash.net \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=tgraf@suug.ch \
    --cc=ying.xue@windriver.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.