netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Thomas Graf <tgraf@suug.ch>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 6/8] rhashtable: further improve stability of rhashtable_walk
Date: Tue, 08 May 2018 10:54:21 +1000	[thread overview]
Message-ID: <87wowfarwi.fsf@notabene.neil.brown.name> (raw)
In-Reply-To: <20180507093019.qkz75qm7rocnbfnl@gondor.apana.org.au>

[-- Attachment #1: Type: text/plain, Size: 3398 bytes --]

On Mon, May 07 2018, Herbert Xu wrote:

> On Sun, May 06, 2018 at 07:50:54AM +1000, NeilBrown wrote:
>>
>> Do we?  How could we fix it for both rhashtable and rhltable?
>
> Well I suggested earlier to insert the walker object into the
> hash table, which would be applicable regardless of whether it
> is a normal rhashtable of a rhltable.

Oh yes, that idea.  I had considered it and discarded it.
Moving a walker object around in an rcu-list is far from straight
forward.  A lockless iteration of the list would need to be very careful
not to be tripped up by the walker-object.  I don't think we want to
make the search code more complex.

I explored the idea a bit more and I think that any solution that we
came up with along those lines would look quite different for regular
rhash_tables than for rhl_tables.  So it is probably best to keep them
quite separate.
i.e. use the scheme I proposed for rhashtables and use something else
entirely for rhltables.

My current thinking for a stable walk for rhl tables goes like this:

- we embed
     struct rhl_cursor {
         struct rhl_cursor *next;
         int unseen;
     }
  in struct rhashtable_iter.

- on rhashtable_stop(), we:
  + lock the current hash chain
  + find the next object that is still in the table (after ->p).
  + count the number of objects from there to the end to the
     end of the rhlist_head.next list.
  + store that count in rhl_cursor.unseen
  + change the ->next pointer at the end of the list to point to the
    rhl_cursor, but with the lsb set.  If there was already an
    rhl_cursor there, they are chained together on ->next.

- on rhashtable_start(), we lock the hash chain and search the hash
  chain and sub-chains for  ->p or the rhl_cursor.
  If ->p is still there, that can be used as a reliable anchor.
  If ->p isn't found but the rhl_cursor is, then the 'unseen' counter
  tells us where to start in that rhl_list.
  If neither are found, then we start at the beginning of the hash chain.
  If the rhl_cursor is found, it is removed.

- lookup and insert can almost completely ignore the cursor.
  rhl_for_each_rcu() needs to detect the end-of-list by looking for
  lsb set, but that is the only change.
  As _insert() adds new objects to the head of the rhlist, that
  won't disturb the accuracy of the 'unseen' count.  The newly inserted
  object won't be seen by the walk, but that is expected.

- delete needs some extra handling.
  + When an object is deleted from an rhlist and the list does not become
    empty, then it counts the number of objects to the end of the list,
    and possibly decrements the 'unseen' counter on any rhl_cursors that
    are at the end of the list.
  + when an object is deleted resulting in an empty rhlist, any
    cursors at the end of the list needs to be moved to the next
    list in the hash chain, and their 'unseen' count needs to be set to
    the length of the list.
    If there is no next object in the hash chain, then the iter.slot is
    incremented and the rhlist_curs is left unconnected.

- rehash needs to remove any cursors from the end of an rhlist before
  moving them to the new table.  The cursors are left disconnected.

I'm happy to implement this if you think the approach is workable and
the functionality is necessary.  However there are no current users of
rhltable_walk_inter that would benefit from this.


Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

  reply	other threads:[~2018-05-08  0:54 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-04  3:54 [PATCH 0/8] Assorted rhashtable fixes and cleanups NeilBrown
2018-05-04  3:54 ` [PATCH 1/8] rhashtable: silence RCU warning in rhashtable_test NeilBrown
2018-05-05  9:10   ` Herbert Xu
2018-05-05 21:49     ` NeilBrown
2018-05-04  3:54 ` [PATCH 4/8] rhashtable: fix race in nested_table_alloc() NeilBrown
2018-05-05  9:29   ` Herbert Xu
2018-05-05 21:48     ` NeilBrown
2018-05-06  5:18       ` Herbert Xu
2018-05-06 22:02         ` NeilBrown
2018-05-04  3:54 ` [PATCH 5/8] rhashtable: remove rhashtable_walk_peek() NeilBrown
2018-05-05  9:30   ` Herbert Xu
2018-05-04  3:54 ` [PATCH 7/8] rhashtable: add rhashtable_walk_prev() NeilBrown
2018-05-05  9:43   ` Herbert Xu
2018-05-05 15:40     ` Tom Herbert
2018-05-06 22:16       ` NeilBrown
2018-05-04  3:54 ` [PATCH 8/8] rhashtable: don't hold lock on first table throughout insertion NeilBrown
2018-05-05  9:41   ` Herbert Xu
2018-05-05 22:00     ` NeilBrown
2018-05-06  5:20       ` Herbert Xu
2018-05-06 22:24         ` NeilBrown
2018-05-07  9:29           ` Herbert Xu
2018-05-08  0:23             ` NeilBrown
2018-05-04  3:54 ` [PATCH 2/8] rhashtable: remove nulls_base and related code NeilBrown
2018-05-05  9:12   ` Herbert Xu
2018-05-05 21:37     ` NeilBrown
2018-05-07  9:27       ` Herbert Xu
2018-05-08  1:14         ` NeilBrown
2018-05-04  3:54 ` [PATCH 3/8] rhashtable: use cmpxchg() to protect ->future_tbl NeilBrown
2018-05-05  9:27   ` Herbert Xu
2018-05-05 21:45     ` NeilBrown
2018-05-04  3:54 ` [PATCH 6/8] rhashtable: further improve stability of rhashtable_walk NeilBrown
2018-05-05  9:42   ` Herbert Xu
2018-05-05 21:50     ` NeilBrown
2018-05-07  9:30       ` Herbert Xu
2018-05-08  0:54         ` NeilBrown [this message]
2018-05-04 17:07 ` [PATCH 0/8] Assorted rhashtable fixes and cleanups David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wowfarwi.fsf@notabene.neil.brown.name \
    --to=neilb@suse.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=tgraf@suug.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).