From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: Akira Yokosawa <akiyks@gmail.com>
Cc: Junchang Wang <junchangwang@gmail.com>, perfbook@vger.kernel.org
Subject: Re: Question regarding hash_resize
Date: Tue, 8 Jan 2019 16:19:59 -0800 [thread overview]
Message-ID: <20190109001959.GS1215@linux.ibm.com> (raw)
In-Reply-To: <288a7aca-75ba-d9ba-297b-5bbd98054d5f@gmail.com>
On Wed, Jan 09, 2019 at 07:16:05AM +0900, Akira Yokosawa wrote:
> On 2019/01/08 10:39:31 -0800, Paul E. McKenney wrote:
> > On Wed, Jan 09, 2019 at 12:35:37AM +0900, Akira Yokosawa wrote:
> >> On 2019/01/09 0:28, Paul E. McKenney wrote:
> >>> On Tue, Jan 08, 2019 at 09:56:57AM +0800, Junchang Wang wrote:
> >>>> On Tue, Jan 8, 2019 at 7:06 AM Akira Yokosawa <akiyks@gmail.com> wrote:
> >>>>> On 2019/01/08 07:54:16 +0900, Akira Yokosawa wrote:
[ . . . ]
> >>>> Hi Paul and Akira,
> >>>>
> >>>> Thanks a lot for the comments, which I need some more time to look
> >>>> into. For Paul's patch, I have a few concerns. Please take a look.
> >>>>
> >>>> My understanding is that with this path, during the time period when
> >>>> the resizing thread is running, an updater may insert/delete an item
> >>>> into/from the new hash table, while readers are still looking up data
> >>>> in the old one, resulting the readers are unaware of
> >>>> insertions/deletions happening simultaneously. For example, it seems
> >>>> the following sequence could happen.
> >>>>
> >>>> 1. The resizing thread starts.
> >>>> 2. The resizing thread successfully passes bucket *B* of the old hash table.
> >>>> 3. An updater wants to insert a new item *I* which should be inserted
> >>>> into bucket *B*.
> >>>> 4. The updater will select the new hash table and insert the item *I*
> >>>> into the new hash table.
> >>>> 5. A read request comes in and wants to lookup item *I*. The lookup
> >>>> request will check the old hash table and fail. Doesn't it?
> >>>> 6. The resizing thread exits.
> >>>> 7. Now subsequent read requests can successfully find item *I*.
> >>>
> >>> Yes, this can happen.
> >>>
> >>>> Is my understanding correct? Please let me know if I misunderstood
> >>>> anything. Give the truth that this patch can accelerate the fast path,
> >>>> I think it should be OK because resizing is typically happen rarely.
> >>>> Just want to make sure I fully understand the algorithm.
> >>>
> >>> It is a design choice, and some users would prefer not to fail to see
> >>> new items during a resize. One approach would be to revert back to
> >>> the old-style checking, and another would be to provide a separate
> >>> lookup interface that synchronizes with adds and deletes.
> >>>
> >>> So, I could add a quick quiz with this information, I could revert the
> >>> change, or I could add another lookup function that provided more timely
> >>> information. Left to myself, I would provide a quick quiz, but what
> >>> do you guys think?
> >>
> >> Hi, I was composing a message, but now I'm replying to this one.
> >> I think adding a quick quiz would be a good idea.
> >
> > But in the meantime, it occurred to me that I was looking at the
> > problem in the wrong way. I believe that the following patch makes
> > hashtab_lookup() find elements recently added by hashtab_add(), even
> > during a resize, and without the need for memory barriers.
> >
> > The scenario that convinced me to take this approach is when a thread
> > does hashtab_add(), then immediately searches for the newly added element.
> > Failing to find it would be quite a surprise to most people.
>
> When a thread does hashtab_del() and immediately checks the deletion,
> it still finds the deleted element while resizing is in progress.
> This would also be a surprise. Current version looks less consistent
> than the simpler one did.
I bet I can fix that... Famous last words! ;-)
But please see below and tell me what you think.
Thanx, Paul
------------------------------------------------------------------------
diff --git a/CodeSamples/datastruct/hash/hash_resize.c b/CodeSamples/datastruct/hash/hash_resize.c
index 6dbfe020d78d..632d9e27675b 100644
--- a/CodeSamples/datastruct/hash/hash_resize.c
+++ b/CodeSamples/datastruct/hash/hash_resize.c
@@ -257,9 +257,12 @@ void hashtab_add(struct ht_elem *htep, //\lnlbl{add:b}
void hashtab_del(struct ht_elem *htep, //\lnlbl{del:b}
struct ht_lock_state *lsp)
{
- int i = lsp->hls_idx[!!lsp->hbp[1]]; //\lnlbl{del:i}
+ int new = !!lsp->hbp[1]; //\lnlbl{del:new}
+ int i = lsp->hls_idx[new]; //\lnlbl{del:i}
cds_list_del_rcu(&htep->hte_next[i]); //\lnlbl{del:del}
+ if (new)
+ cds_list_del_rcu(&htep->hte_next[!i]); //\lnlbl{del:del}
} //\lnlbl{del:e}
//\end{snippet}
next prev parent reply other threads:[~2019-01-09 0:20 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-07 13:49 Question regarding hash_resize Junchang Wang
2019-01-07 18:33 ` Paul E. McKenney
2019-01-07 22:54 ` Akira Yokosawa
2019-01-07 23:06 ` Akira Yokosawa
2019-01-07 23:48 ` Paul E. McKenney
2019-01-08 15:18 ` Akira Yokosawa
2019-01-08 15:32 ` Paul E. McKenney
2019-01-08 1:56 ` Junchang Wang
2019-01-08 15:28 ` Paul E. McKenney
2019-01-08 15:35 ` Akira Yokosawa
2019-01-08 18:39 ` Paul E. McKenney
2019-01-08 22:16 ` Akira Yokosawa
2019-01-09 0:19 ` Paul E. McKenney [this message]
2019-01-09 2:59 ` Paul E. McKenney
2019-01-11 4:08 ` Paul E. McKenney
2019-01-11 14:25 ` Akira Yokosawa
2019-01-11 15:43 ` Paul E. McKenney
2019-01-11 22:56 ` Akira Yokosawa
2019-01-11 23:28 ` Paul E. McKenney
2019-01-07 23:13 ` Paul E. McKenney
2019-01-07 23:33 ` Paul E. McKenney
2019-01-18 14:32 ` Junchang Wang
2019-01-18 18:34 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190109001959.GS1215@linux.ibm.com \
--to=paulmck@linux.ibm.com \
--cc=akiyks@gmail.com \
--cc=junchangwang@gmail.com \
--cc=perfbook@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.