Re: [PATCH 5/6] rhashtable: support guaranteed successful insertion.

From: NeilBrown <neilb@suse.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Thomas Graf <tgraf@suug.ch>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 5/6] rhashtable: support guaranteed successful insertion.
Date: Thu, 29 Mar 2018 08:26:21 +1100
Message-ID: <87370j51tu.fsf@notabene.neil.brown.name>
In-Reply-To: <20180328072727.GA17306@gondor.apana.org.au>

On Wed, Mar 28 2018, Herbert Xu wrote:

> On Wed, Mar 28, 2018 at 06:04:40PM +1100, NeilBrown wrote:
>>
>> I disagree.  My patch 6 only makes it common instead of exceedingly
>> rare.  If any table in the list other than the first has a chain with 16
>> elements, then trying to insert an element with a hash which matches
>> that chain will fail with -EBUSY.  This is theoretically possible
>> already, though astronomically unlikely.  So that case will never be
>> tested for.
>
> No that's not true.  If the table is correctly sized then the
> probability of having a chain with 16 elements is extremely low.

I say "astronomically unlikely", you say "probability .. is extremely
low".  I think we are in agreement here.

The point remains that if an error *can* be returned then I have to
write code to handle it and test that code.  I'd rather not.

>
> Even if it does happen we won't fail because we will perform
> an immediate rehash.  We only fail if it happens right away
> after the rehash (that is, at least another 16 elements have
> been inserted and you're trying to insert a 17th element, all
> while the new hash table has not been completely populated),
> which means that somebody has figured out our hash secret and
> failing in that case makes sense.
>
>> It is hard to know if it is necessary.  And making the new table larger
>> will make the error less likely, but still won't make it impossible.  So
>> callers will have to handle it - just like they currently have to handle
>> -ENOMEM even though it is highly unlikely (and not strictly necessary).
>
> Callers should not handle an ENOMEM error by retrying.  Nor should
> they retry an EBUSY return value.

I never suggested retrying, but I would have to handle it somehow.  I'd
rather not.

>
>> Are these errors ever actually useful?  I thought I had convinced myself
>> before that they were (to throttle attacks on the hash function), but
>> they happen even less often than I thought.
>
> The EBUSY error indicates that the hash table has essentially
> degenerated into a linked list because somebody has worked out
> our hash secret.

While I have no doubt that there are hashtables where someone could try
to attack the hash, I am quite sure there are others where such an
attack is meaningless - any code which could generate the required range
of keys could do far worse things more easily.

>
>> Maybe. Reading a percpu counter isn't cheap.  Reading it whenever a hash
>> chain reaches 16 is reasonable, but I think we would want to read it a
>> lot more often than that.  So probably store the last-sampled time (with
>> no locking) and only sample the counter if last-sampled is older than
>>  jiffies - 10*HZ (???)
>
> We could also take the spinlock table approach and have a counter
> per bucket spinlock.  This should be sufficient as you'll contend
> on the bucket spinlock table anyway.

Yes, storing a sharded count in the spinlock table does seem like an
appropriate granularity.  However that leads me to ask: why do we have
the spinlock table?  Why not bit spinlocks in the hash chain head, like
include/linux/list_bl.h uses?
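
To make the bit-spinlock idea concrete, here is a minimal userspace
sketch using C11 atomics.  It is only an analogue of the technique, not
kernel code and not a proposed patch: the names (struct bucket,
bucket_lock, bucket_push, ...) are hypothetical, and it assumes nodes
are at least 2-byte aligned so that bit 0 of the head pointer is free to
serve as the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chain element; pointer alignment keeps bit 0 clear. */
struct node {
	struct node *next;
	int key;
};

/* Bucket head: pointer to the first node, with bit 0 used as the lock. */
struct bucket {
	_Atomic uintptr_t head;
};

#define BUCKET_LOCK_BIT ((uintptr_t)1)

/* Spin until this caller is the one that set the lock bit. */
static void bucket_lock(struct bucket *b)
{
	while (atomic_fetch_or_explicit(&b->head, BUCKET_LOCK_BIT,
					memory_order_acquire) & BUCKET_LOCK_BIT)
		;
}

/* Clear the lock bit, leaving the pointer bits untouched. */
static void bucket_unlock(struct bucket *b)
{
	atomic_fetch_and_explicit(&b->head, ~BUCKET_LOCK_BIT,
				  memory_order_release);
}

/* Read the first node, masking off the lock bit. */
static struct node *bucket_first(struct bucket *b)
{
	uintptr_t v = atomic_load_explicit(&b->head, memory_order_acquire);

	return (struct node *)(v & ~BUCKET_LOCK_BIT);
}

/* Insert at the head of the chain.  Caller must hold the bucket lock. */
static void bucket_push(struct bucket *b, struct node *n)
{
	assert(((uintptr_t)n & BUCKET_LOCK_BIT) == 0);
	n->next = bucket_first(b);
	/* Keep the lock bit set: we still own the lock. */
	atomic_store_explicit(&b->head, (uintptr_t)n | BUCKET_LOCK_BIT,
			      memory_order_release);
}
```

The attraction is that the lock costs no extra memory and lives in the
same cache line as the chain head; the cost is that every reader has to
mask the low bit before following the pointer.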

>
> This also allows us to estimate the total table size and not have
> to always do a last-ditch growth when it's too late.

I don't understand how it can ever be "too late", though I appreciate
that in some cases "sooner" is better than "later".
If we give up on the single atomic_t counter, then we must accept that
the number of elements could exceed any given value.  The only promise
we can provide is that it won't exceed N% of the table size for more than
T seconds.
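
A rough single-threaded sketch of what I mean, in plain C.  All names
here (struct sharded_count, sc_add, sc_estimate, NR_SHARDS) are made up
for illustration; in a real implementation each shard would presumably
sit next to a bucket lock and be updated under it, and "now" and
"interval" would be jiffies and some multiple of HZ.

```c
#include <stddef.h>

#define NR_SHARDS 8	/* hypothetical: one counter per lock in the lock table */

struct sharded_count {
	long shard[NR_SHARDS];		/* per-lock element counts */
	long cached_total;		/* result of the last full sum */
	unsigned long last_sampled;	/* tick at which cached_total was taken */
};

/* Adjust the shard that covers this hash (done under that lock in real code). */
static void sc_add(struct sharded_count *sc, unsigned int hash, long delta)
{
	sc->shard[hash % NR_SHARDS] += delta;
}

/* Exact total: walks every shard, so it is the expensive operation. */
static long sc_sum(const struct sharded_count *sc)
{
	long total = 0;
	size_t i;

	for (i = 0; i < NR_SHARDS; i++)
		total += sc->shard[i];
	return total;
}

/*
 * Cheap estimate: only re-sum when the cached value is at least
 * 'interval' ticks old.  The value returned can therefore be stale by
 * up to 'interval' ticks, which is where the "for more than T seconds"
 * in the promise comes from.
 */
static long sc_estimate(struct sharded_count *sc, unsigned long now,
			unsigned long interval)
{
	if (now - sc->last_sampled >= interval) {
		cached_refresh:
		sc->cached_total = sc_sum(sc);
		sc->last_sampled = now;
	}
	return sc->cached_total;
}
```

With an interval of 10 ticks, three insertions made just after a sample
are invisible to sc_estimate() until 10 ticks have passed, so a grow
decision based on it can lag by that much but no more.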

Thanks,
NeilBrown


>
> Cheers,
> -- 
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

