Discussions of the Parallel Programming book
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: Akira Yokosawa <akiyks@gmail.com>
Cc: Junchang Wang <junchangwang@gmail.com>, perfbook@vger.kernel.org
Subject: Re: Question regarding hash_resize
Date: Tue, 8 Jan 2019 18:59:13 -0800	[thread overview]
Message-ID: <20190109025913.GA7904@linux.ibm.com> (raw)
In-Reply-To: <20190109001959.GS1215@linux.ibm.com>

On Tue, Jan 08, 2019 at 04:19:59PM -0800, Paul E. McKenney wrote:
> On Wed, Jan 09, 2019 at 07:16:05AM +0900, Akira Yokosawa wrote:
> > On 2019/01/08 10:39:31 -0800, Paul E. McKenney wrote:
> > > On Wed, Jan 09, 2019 at 12:35:37AM +0900, Akira Yokosawa wrote:
> > >> On 2019/01/09 0:28, Paul E. McKenney wrote:
> > >>> On Tue, Jan 08, 2019 at 09:56:57AM +0800, Junchang Wang wrote:
> > >>>> On Tue, Jan 8, 2019 at 7:06 AM Akira Yokosawa <akiyks@gmail.com> wrote:
> > >>>>> On 2019/01/08 07:54:16 +0900, Akira Yokosawa wrote:
> 
> [ . . . ]
> 
> > >>>> Hi Paul and Akira,
> > >>>>
> > >>>> Thanks a lot for the comments, which I need some more time to look
> > >>>> into. For Paul's patch, I have a few concerns. Please take a look.
> > >>>>
> > >>>> My understanding is that with this path, during the time period when
> > >>>> the resizing thread is running, an updater may insert/delete an item
> > >>>> into/from the new hash table, while readers are still looking up data
> > >>>> in the old one, resulting the readers are unaware of
> > >>>> insertions/deletions happening simultaneously. For example, it seems
> > >>>> the following sequence could happen.
> > >>>>
> > >>>> 1. The resizing thread starts.
> > >>>> 2. The resizing thread successfully passes bucket *B* of the old hash table.
> > >>>> 3. An updater wants to insert a new item *I* which should be inserted
> > >>>> into bucket *B*.
> > >>>> 4. The updater will select the new hash table and insert the item *I*
> > >>>> into the new hash table.
> > >>>> 5. A read request comes in and wants to lookup item *I*. The lookup
> > >>>> request will check the old hash table and fail. Doesn't it?
> > >>>> 6. The resizing thread exits.
> > >>>> 7. Now subsequent read requests can successfully find item *I*.
> > >>>
> > >>> Yes, this can happen.
> > >>>
> > >>>> Is my understanding correct? Please let me know if I misunderstood
> > >>>> anything. Give the truth that this patch can accelerate the fast path,
> > >>>> I think it should be OK because resizing is typically happen rarely.
> > >>>> Just want to make sure I fully understand the algorithm.
> > >>>
> > >>> It is a design choice, and some users would prefer not to fail to see
> > >>> new items during a resize.  One approach would be to revert back to
> > >>> the old-style checking, and another would be to provide a separate
> > >>> lookup interface that synchronizes with adds and deletes.
> > >>>
> > >>> So, I could add a quick quiz with this information, I could revert the
> > >>> change, or I could add another lookup function that provided more timely
> > >>> information.  Left to myself, I would provide a quick quiz, but what
> > >>> do you guys think?
> > >>
> > >> Hi, I was composing a message, but now I'm replying to this one.
> > >> I think adding a quick quiz would be a good idea.
> > > 
> > > But in the meantime, it occurred to me that I was looking at the
> > > problem in the wrong way.  I believe that the following patch makes
> > > hashtab_lookup() find elements recently added by hashtab_add(), even
> > > during a resize, and without the need for memory barriers.
> > > 
> > > The scenario that convinced me to take this approach is when a thread
> > > does hashtab_add(), then immediately searches for the newly added element.
> > > Failing to find it would be quite a surprise to most people.
> > 
> > When a thread does hashtab_del() and immediately checks the deletion,
> > it still finds the deleted element while resizing is in progress.
> > This would also be a surprise. Current version looks less consistent
> > than the simpler one did.
> 
> I bet I can fix that...  Famous last words!  ;-)
> 
> But please see below and tell me what you think.

Well, that is not quite right, but close.  Working on it...

							Thanx, Paul

> ------------------------------------------------------------------------
> 
> diff --git a/CodeSamples/datastruct/hash/hash_resize.c b/CodeSamples/datastruct/hash/hash_resize.c
> index 6dbfe020d78d..632d9e27675b 100644
> --- a/CodeSamples/datastruct/hash/hash_resize.c
> +++ b/CodeSamples/datastruct/hash/hash_resize.c
> @@ -257,9 +257,12 @@ void hashtab_add(struct ht_elem *htep,			//\lnlbl{add:b}
>  void hashtab_del(struct ht_elem *htep,			//\lnlbl{del:b}
>                   struct ht_lock_state *lsp)
>  {
> -	int i = lsp->hls_idx[!!lsp->hbp[1]];		//\lnlbl{del:i}
> +	int new = !!lsp->hbp[1];			//\lnlbl{del:new}
> +	int i = lsp->hls_idx[new];			//\lnlbl{del:i}
>  
>  	cds_list_del_rcu(&htep->hte_next[i]);		//\lnlbl{del:del}
> +	if (new)
> +		cds_list_del_rcu(&htep->hte_next[!i]);	//\lnlbl{del:del}
>  }							//\lnlbl{del:e}
>  //\end{snippet}
>  


  reply	other threads:[~2019-01-09  2:59 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-07 13:49 Question regarding hash_resize Junchang Wang
2019-01-07 18:33 ` Paul E. McKenney
2019-01-07 22:54   ` Akira Yokosawa
2019-01-07 23:06     ` Akira Yokosawa
2019-01-07 23:48       ` Paul E. McKenney
2019-01-08 15:18         ` Akira Yokosawa
2019-01-08 15:32           ` Paul E. McKenney
2019-01-08  1:56       ` Junchang Wang
2019-01-08 15:28         ` Paul E. McKenney
2019-01-08 15:35           ` Akira Yokosawa
2019-01-08 18:39             ` Paul E. McKenney
2019-01-08 22:16               ` Akira Yokosawa
2019-01-09  0:19                 ` Paul E. McKenney
2019-01-09  2:59                   ` Paul E. McKenney [this message]
2019-01-11  4:08                     ` Paul E. McKenney
2019-01-11 14:25                       ` Akira Yokosawa
2019-01-11 15:43                         ` Paul E. McKenney
2019-01-11 22:56                           ` Akira Yokosawa
2019-01-11 23:28                             ` Paul E. McKenney
2019-01-07 23:13     ` Paul E. McKenney
2019-01-07 23:33       ` Paul E. McKenney
2019-01-18 14:32   ` Junchang Wang
2019-01-18 18:34     ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190109025913.GA7904@linux.ibm.com \
    --to=paulmck@linux.ibm.com \
    --cc=akiyks@gmail.com \
    --cc=junchangwang@gmail.com \
    --cc=perfbook@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox