netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Ottawa and slow hash-table resize
@ 2015-02-23 18:49 Paul E. McKenney
  2015-02-23 19:12 ` josh
                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Paul E. McKenney @ 2015-02-23 18:49 UTC (permalink / raw)
  To: alexei.starovoitov, herbert, tgraf, kaber, davem, ying.xue
  Cc: netdev, netfilter-devel, josh

Hello!

Alexei mentioned that there was some excitement a couple of weeks ago in
Ottawa, something about the resizing taking forever when there were large
numbers of concurrent additions.  One approach comes to mind:

o	Currently, the hash table does not allow additions concurrently
	with resize operations.  One way to allow this would be to
	have the addition operations add to the new hash table at the
	head of the lists.  This would clearly require also updating the
	pointers used to control the unzip operation.

o	Count the number of entries added during the resize operation.
	Then, at the end of the resize operation, if enough entries have
	been added, do a resize, but by multiple factors of two if
	need be.

This should allow the table to take arbitrarily large numbers of updates
during a resize operation.  There are some other possibilities if this
approach does not work out.

Thoughts?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 33+ messages in thread
* Re: Ottawa and slow hash-table resize
@ 2015-02-23 22:17 Alexei Starovoitov
  2015-02-23 22:34 ` David Miller
  2015-02-23 22:37 ` Paul E. McKenney
  0 siblings, 2 replies; 33+ messages in thread
From: Alexei Starovoitov @ 2015-02-23 22:17 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Thomas Graf, Josh Triplett, Herbert Xu, Patrick McHardy,
	David S. Miller, ying.xue, netdev@vger.kernel.org,
	netfilter-devel

On Mon, Feb 23, 2015 at 1:52 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Mon, Feb 23, 2015 at 09:03:58PM +0000, Thomas Graf wrote:
>> On 02/23/15 at 11:12am, josh@joshtriplett.org wrote:
>> > In theory, resizes should only take the locks for the buckets they're
>> > currently unzipping, and adds should take those same locks.  Neither one
>> > should take a whole-table lock, other than resize excluding concurrent
>> > resizes.  Is that still insufficient?
>>
>> Correct, this is what happens. The problem is basically that
>> if we insert from atomic context we cannot slow down inserts
>> and the table may not grow quickly enough.
>>
>> > Yeah, the add/remove statistics used for tracking would need some
>> > special handling to avoid being a table-wide bottleneck.
>>
>> Daniel is working on a patch to do per-cpu element counting
>> with a batched update cycle.
>
> One approach is simply to count only when a resize operation is in
> flight.  Another is to keep a per-bucket count, which can be summed
> at the beginning of the next resize operation.

I'm not sure all of these counting optimizations will help at the end.
Say we have a small table. A lot of inserts are coming in at the same
time. rhashtable_expand kicks in and all new inserts are going
into future table, while expansion is happening.
Since expand will kick in quickly the old table will not have long
chains per bucket, so few unzips and corresponding
synchronize_rcu and we're done with expand.
Now future table becomes the only table, but it still has a lot
of entries, since insertions were happening and this table has
long per bucket chains, so next expand will have a lot of
synchronize_rcu and will take very long time.
So whether we count while inserting or not and whether
we grow by 2x or grow by 8x we still have an underlying
problem of very large number of synchronize_rcu.
Malicious user that knows this can stall the whole system.
Please tell me I'm missing something.

^ permalink raw reply	[flat|nested] 33+ messages in thread
* Re: Ottawa and slow hash-table resize
@ 2015-02-23 23:07 Alexei Starovoitov
  2015-02-23 23:15 ` David Miller
  0 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2015-02-23 23:07 UTC (permalink / raw)
  To: David Miller
  Cc: Paul McKenney, Thomas Graf, Josh Triplett, Herbert Xu,
	Patrick McHardy, ying.xue, netdev@vger.kernel.org,
	netfilter-devel

On Mon, Feb 23, 2015 at 2:34 PM, David Miller <davem@davemloft.net> wrote:
> From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Date: Mon, 23 Feb 2015 14:17:06 -0800
>
>> I'm not sure all of these counting optimizations will help at the end.
>> Say we have a small table. A lot of inserts are coming in at the same
>> time. rhashtable_expand kicks in and all new inserts are going
>> into future table, while expansion is happening.
>> Since expand will kick in quickly the old table will not have long
>> chains per bucket, so few unzips and corresponding
>> synchronize_rcu and we're done with expand.
>> Now future table becomes the only table, but it still has a lot
>> of entries, since insertions were happening and this table has
>> long per bucket chains, so next expand will have a lot of
>> synchronize_rcu and will take very long time.
>> So whether we count while inserting or not and whether
>> we grow by 2x or grow by 8x we still have an underlying
>> problem of very large number of synchronize_rcu.
>> Malicious user that knows this can stall the whole system.
>> Please tell me I'm missing something.
>
> This is why I have just suggested that we make inserts block,
> and the expander looks at the count of pending inserts in an
> effort to keep expanding the table further if necessary before
> releasing the blocked insertion threads.

yes. blocking inserts would solve the problem, but it will make
rhashtable applicability to be limited.
Also 'count of pending inserts' is going to be very small and
meaningless, since it will count the number of threads that
are sleeping on insert and not very useful to predict future
expansions.
As an alternative we can store two chains of pointers
per element (one chain for old table and another chain
for future table). That will avoid all unzipping logic at the cost
of extra memory. May be there are other solutions.
I think as a minimum something like 'blocking inserts' are
needed for stable. imo the current situation with nft_hash
is not just annoying, it's a bug.

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2015-02-25 17:38 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-02-23 18:49 Ottawa and slow hash-table resize Paul E. McKenney
2015-02-23 19:12 ` josh
2015-02-23 21:03   ` Thomas Graf
2015-02-23 21:52     ` Paul E. McKenney
2015-02-23 22:32       ` David Miller
2015-02-23 23:06         ` Paul E. McKenney
2015-02-24  8:37           ` Thomas Graf
2015-02-24 10:39             ` Patrick McHardy
2015-02-24 10:46               ` David Laight
2015-02-24 10:48                 ` Patrick McHardy
2015-02-24 17:09               ` David Miller
2015-02-24 17:50                 ` Thomas Graf
2015-02-24 18:26                   ` David Miller
2015-02-24 18:45                     ` josh
2015-02-24 22:34                       ` Thomas Graf
2015-02-25  8:56                         ` Herbert Xu
2015-02-25 17:38                           ` Thomas Graf
2015-02-24 18:33                   ` josh
2015-02-25  8:55                 ` Herbert Xu
2015-02-25 17:38                   ` Thomas Graf
2015-02-23 21:00 ` Thomas Graf
2015-02-23 22:35   ` Paul E. McKenney
2015-02-24  8:59 ` Thomas Graf
2015-02-24  9:38   ` Daniel Borkmann
2015-02-24 10:42     ` Patrick McHardy
2015-02-24 16:14       ` Josh Hunt
2015-02-24 16:25         ` Patrick McHardy
2015-02-24 16:57           ` David Miller
  -- strict thread matches above, loose matches on Subject: below --
2015-02-23 22:17 Alexei Starovoitov
2015-02-23 22:34 ` David Miller
2015-02-23 22:37 ` Paul E. McKenney
2015-02-23 23:07 Alexei Starovoitov
2015-02-23 23:15 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).