Discussions of the Parallel Programming book
From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: Akira Yokosawa <akiyks@gmail.com>
Cc: perfbook@vger.kernel.org
Subject: Re: [PATCH] EXP hashtorture.h: Avoid sporadic SIGSEGV in hash_bkt_rcu
Date: Wed, 2 Jan 2019 09:18:48 -0800
Message-ID: <20190102171848.GA4170@linux.ibm.com>
In-Reply-To: <3fa9dcca-14d0-876d-fdcb-5db7eff3a97b@gmail.com>

On Thu, Jan 03, 2019 at 12:02:56AM +0900, Akira Yokosawa wrote:
> On 2019/01/01 10:00:25 -0800, Paul E. McKenney wrote:
> > On Tue, Jan 01, 2019 at 09:27:41AM +0900, Akira Yokosawa wrote:
> >> On 2018/12/31 13:03:07 -0800, Paul E. McKenney wrote:
> >>> On Tue, Jan 01, 2019 at 12:15:23AM +0900, Akira Yokosawa wrote:
> >>>> From 52f5d218442eb64f2798335d56a1838f90d96d5f Mon Sep 17 00:00:00 2001
> >>>> From: Akira Yokosawa <akiyks@gmail.com>
> >>>> Date: Mon, 30 Dec 2018 22:54:43 +0900
> >>>> Subject: [PATCH] EXP hashtorture.h: Avoid sporadic SIGSEGV in hash_bkt_rcu
> >>>>
> >>>> Commit 4e22bdc905ff ("Wait at end of test for call_rcu() to finish")
> >>>> added a couple of synchronize_rcu()s in perftest_updater()
> >>>> and zoo_reader().
> >>>>
> >>>> However, there still remains sporadic SIGSEGV in
> >>>>
> >>>>     $ ./hash_bkt_rcu --perftest --nupdaters 3
> >>>>
> >>>> On the other hand,
> >>>>
> >>>>     $ ./hash_bkt_rcu --schroedinger --nupdaters 3
> >>>>
> >>>> does not show such an issue. Just moving the synchronize_rcu()s in
> >>>> zoo_reader() to zoo_updater() does not resolve the
> >>>> SIGSEGV.
> >>>>
> >>>>
> >>>> This commit defines rcu_barrier() if not available,
> >>>> and calls it both before and after the final loop
> >>>> of perftest_updater() and zoo_updater().
> >>>>
> >>>> It looks like this change can fix the above mentioned
> >>>> SIGSEGV in "--perftest".
> >>>>
> >>>> [Tested on Ubuntu Xenial with liburcu-dev/xenial,now 0.9.1-3 and
> >>>> liburcu4/xenial,now 0.9.1-3 installed.]
> >>>>
> >>>> NOTE:
> >>>>
> >>>>     $ ./hash_resize --schroedinger --resizemult 2 --duration 20
> >>>
> >>> I get SIGSEGV and hangs from time to time, so I am looking into this.
> >>> Thank you for calling it to my attention!
> >>
> >> I've found some suspicious code in hash_resize.c
> >>
> >> hashtab_lock_mod() takes care of ongoing resizing and spin_lock()s
> >> the new bucket if necessary. This is good for add, but for delete
> >> we may still need to lock the old bucket.
> >>
> >> And hashtab_unlock_mod() doesn't care about ongoing resizing, so
> >> there can be a mismatch between spin_lock() and spin_unlock().
> >>
> >> Also, htp_master->ht_cur can change during the
> >> hashtab_lock_mod() -- hashtab_unlock_mod() critical section
> >> because the update of the pointer by rcu_assign_pointer()
> >> is ahead of synchronize_rcu().
> >>
> >> Given the resizing is infrequent, the simplest way might be to
> >> block hashtab_lock_mod while resizing is going on.
> > 
> > I do believe you have found something here, and thank you!  So the
> > answer to my earlier question as to whether I was smarter when writing
> > it than now is clearly that I was equally stupid in both cases.  ;-)
> > 
> > Well, it is conference-driven code, but still high time for me to
> > clean it up.
> > 
> >> There can be a better way to keep concurrent add/del/resize, though.
> >> Happy hacking! ;-) 
> > 
> > I do believe that I can preserve concurrency between resizing and
> > deletion, but that is clearly for me to prove.
> 
> There is one more thing I've noticed with "hash_resize --schroedinger".
> *Without* resizing enabled, it says:
> 
>     $ ./hash_resize --schroedinger
>     nlookups: 91373 91373  ncats: 0  nadds: 5  ndels: 6  duration: 10.851
>     ns/read: 118.755  ns/update: 986455
> 
> This means that all the lookups failed. OTOH, hash_bkt_rcu works as expected
> as follows:
> 
>     $ ./hash_bkt_rcu --schroedinger
>     nlookups: 56064 28004  ncats: 0  nadds: 5  ndels: 5  duration: 10.373
>     ns/read: 185.021  ns/update: 1.0373e+06
> 
> (ns/read looks slow because compiler optimization is disabled.)
> 
> There seems to be some mismatch in hash/key handling of hash_resize.c --
> hashtorture.h combination. I've not yet figured out the cause, though.

The short story is that I am working to return the locking state
from hashtab_lock_mod() for use by hashtab_add(), hashtab_del(),
and hashtab_unlock_mod().  Also, the first resize carries out some
"interesting" state changes that might need to be reflected in
initialization.
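
To illustrate the idea of returning the locking state, here is a minimal
single-threaded sketch. It is not the hash_resize.c code: every sketch_*
name, the resize_cur field, and the use of pthread mutexes in place of
spin_lock() are assumptions made for illustration. The lock function
records which buckets it actually locked, and the unlock function releases
exactly those, so resize progress in between cannot cause a mismatch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical types; real code would also need RCU protection. */
struct sketch_bucket {
	pthread_mutex_t lock;
};

struct sketch_table {
	struct sketch_bucket *bkt;
	unsigned long nbuckets;
	struct sketch_table *ht_new;	/* non-NULL while a resize is running */
	long resize_cur;		/* old buckets <= this are already moved */
};

/* The "locking state": which buckets lock_mod really acquired. */
struct sketch_lock_state {
	struct sketch_bucket *held[2];
};

static void sketch_table_init(struct sketch_table *t,
			      struct sketch_bucket *b, unsigned long n)
{
	unsigned long i;

	t->bkt = b;
	t->nbuckets = n;
	t->ht_new = NULL;
	t->resize_cur = -1;
	for (i = 0; i < n; i++)
		pthread_mutex_init(&b[i].lock, NULL);
}

/* Lock the old bucket, plus the new one if this bucket was already moved. */
static void sketch_lock_mod(struct sketch_table *t, unsigned long key,
			    struct sketch_lock_state *s)
{
	unsigned long i = key % t->nbuckets;

	s->held[0] = &t->bkt[i];
	s->held[1] = NULL;
	pthread_mutex_lock(&s->held[0]->lock);
	if (t->ht_new != NULL && (long)i <= t->resize_cur) {
		s->held[1] = &t->ht_new->bkt[key % t->ht_new->nbuckets];
		pthread_mutex_lock(&s->held[1]->lock);
	}
}

/* Release only what the state records, never re-deriving it from the table. */
static void sketch_unlock_mod(struct sketch_lock_state *s)
{
	assert(s->held[0] != NULL);
	if (s->held[1] != NULL)
		pthread_mutex_unlock(&s->held[1]->lock);
	pthread_mutex_unlock(&s->held[0]->lock);
}

/* Return 1 if every bucket lock in the table is currently free. */
static int sketch_all_free(struct sketch_table *t)
{
	unsigned long i;

	for (i = 0; i < t->nbuckets; i++) {
		if (pthread_mutex_trylock(&t->bkt[i].lock) != 0)
			return 0;
		pthread_mutex_unlock(&t->bkt[i].lock);
	}
	return 1;
}

/* Single-threaded walk-through; returns 0 if locks and unlocks pair up. */
int sketch_demo(void)
{
	struct sketch_bucket oldb[4], newb[8];
	struct sketch_table ht_old, ht_new;
	struct sketch_lock_state s;

	sketch_table_init(&ht_old, oldb, 4);
	sketch_table_init(&ht_new, newb, 8);
	ht_old.ht_new = &ht_new;
	ht_old.resize_cur = 1;		/* old buckets 0 and 1 already moved */

	sketch_lock_mod(&ht_old, 1, &s);	/* moved bucket: two locks */
	if (s.held[1] == NULL)
		return 1;
	ht_old.resize_cur = 3;		/* pretend the resize advances meanwhile */
	sketch_unlock_mod(&s);
	if (!sketch_all_free(&ht_old) || !sketch_all_free(&ht_new))
		return 2;

	ht_old.resize_cur = 1;
	sketch_lock_mod(&ht_old, 2, &s);	/* not yet moved: one lock */
	if (s.held[1] != NULL)
		return 3;
	sketch_unlock_mod(&s);
	if (!sketch_all_free(&ht_old) || !sketch_all_free(&ht_new))
		return 4;
	return 0;
}
```

Calling sketch_demo() from a test harness exercises both cases. In this
sketch, add and delete operations would take the same state structure to
learn which bucket or buckets to update.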

But yes, not one of my best efforts...

							Thanx, Paul

>         Thanks, Akira
> 
> > 
> > And thank you again!
> > 
> > 							Thanx, Paul
> > 
> >>         Thanks, Akira
> >>>
> >>>> still fails with SIGSEGV frequently in zoo_del(). GDB says:
> >>>>
> >>>>     (gdb) where
> >>>>     #0  0x0000000000402b27 in cds_list_del_rcu (elem=0x7ff8fc0138f0)
> >>>>         at /usr/include/urcu/rculist.h:71
> >>>>     #1  hashtab_del (htep=0x7ff8fc0138d0, htp_master=<optimized out>)
> >>>>         at hash_resize.c:261
> >>>>     #2  zoo_del (zhep=0x7ff8fc0138d0) at hashtorture.h:1007
> >>>>     #3  zoo_updater (arg=0x1e8b298) at hashtorture.h:1153
> >>>>     #4  0x00007ff9057d16ba in start_thread (arg=0x7ff903fed700)
> >>>>         at pthread_create.c:333
> >>>>     #5  0x00007ff9050f741d in clone ()
> >>>>         at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> >>>>
> >>>> Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
> >>>
> >>> Good catch, queued and pushed, thank you!
> >>>
> >>> With one small modification -- given that liburcu has had rcu_barrier()
> >>> for some years now, I removed the "training wheels" (and unreliable)
> >>> use of the wait and pair of synchronize_rcu() calls.
> >>>
> >>>> ---
> >>>> Hi Paul,
> >>>>
> >>>> This is a partial fix, but it resolves SIGSEGV in "--perftest" of
> >>>> hash_bkt_rcu and hash_resize.
> >>>>
> >>>> "--schroedinger" of hash_resize with resizing enabled still seg faults
> >>>> as mentioned in the commit log.
> >>>>
> >>>> By the way, what version of liburcu are you using?
> >>>
> >>> It is about two years old, but it does have rcu_barrier().
> >>>
> >>> 								Thanx, Paul
> >>>
> >>>>         Thanks, Akira
> >>>> --
> >>>>  CodeSamples/datastruct/hash/hashtorture.h | 24 ++++++++++++++++--------
> >>>>  1 file changed, 16 insertions(+), 8 deletions(-)
> >>>>
> >>>> diff --git a/CodeSamples/datastruct/hash/hashtorture.h b/CodeSamples/datastruct/hash/hashtorture.h
> >>>> index 0e90220..9ae3dfa 100644
> >>>> --- a/CodeSamples/datastruct/hash/hashtorture.h
> >>>> +++ b/CodeSamples/datastruct/hash/hashtorture.h
> >>>> @@ -55,6 +55,15 @@ void (*defer_del_done)(struct ht_elem *htep) = NULL;
> >>>>  #ifndef quiescent_state
> >>>>  #define quiescent_state() do ; while (0)
> >>>>  #define synchronize_rcu() do ; while (0)
> >>>> +#define rcu_barrier() do ; while (0)
> >>>> +#else
> >>>> +#ifndef rcu_barrier
> >>>> +#define rcu_barrier() do { \
> >>>> +		synchronize_rcu(); \
> >>>> +		poll(NULL, 0, 100); \
> >>>> +		synchronize_rcu(); \
> >>>> +	} while (0)
> >>>> +#endif /* #ifndef rcu_barrier */
> >>>>  #endif /* #ifndef quiescent_state */
> >>>>  
> >>>>  /*
> >>>> @@ -765,6 +774,7 @@ void *perftest_reader(void *arg)
> >>>>  		if (i >= ne)
> >>>>  			i = i % ne + offset;
> >>>>  	}
> >>>> +
> >>>>  	pap->nlookups = nlookups;
> >>>>  	pap->nlookupfails = nlookupfails;
> >>>>  	hash_unregister_thread();
> >>>> @@ -839,6 +849,7 @@ void *perftest_updater(void *arg)
> >>>>  			quiescent_state();
> >>>>  	}
> >>>>  
> >>>> +	rcu_barrier();
> >>>>  	/* Test over, so remove all our elements from the hash table. */
> >>>>  	for (i = 0; i < elperupdater; i++) {
> >>>>  		if (thep[i].in_table != 1)
> >>>> @@ -846,10 +857,7 @@ void *perftest_updater(void *arg)
> >>>>  		BUG_ON(!perftest_lookup(thep[i].data));
> >>>>  		perftest_del(&thep[i]);
> >>>>  	}
> >>>> -	/* Really want rcu_barrier(), but missing from old liburcu versions. */
> >>>> -	synchronize_rcu();
> >>>> -	poll(NULL, 0, 100);
> >>>> -	synchronize_rcu();
> >>>> +	rcu_barrier();
> >>>>  
> >>>>  	hash_unregister_thread();
> >>>>  	free(thep);
> >>>> @@ -1048,10 +1056,6 @@ void *zoo_reader(void *arg)
> >>>>  		if (i >= ne)
> >>>>  			i = i % ne + offset;
> >>>>  	}
> >>>> -	/* Really want rcu_barrier(), but missing from old liburcu versions. */
> >>>> -	synchronize_rcu();
> >>>> -	poll(NULL, 0, 100);
> >>>> -	synchronize_rcu();
> >>>>  
> >>>>  	pap->nlookups = nlookups;
> >>>>  	pap->nlookupfails = nlookupfails;
> >>>> @@ -1136,15 +1140,19 @@ void *zoo_updater(void *arg)
> >>>>  			quiescent_state();
> >>>>  	}
> >>>>  
> >>>> +	rcu_barrier();
> >>>>  	/* Test over, so remove all our elements from the hash table. */
> >>>>  	for (i = 0; i < elperupdater; i++) {
> >>>>  		if (!zheplist[i])
> >>>>  			continue;
> >>>>  		zoo_del(zheplist[i]);
> >>>>  	}
> >>>> +	rcu_barrier();
> >>>> +
> >>>>  	hash_unregister_thread();
> >>>>  	pap->nadds = nadds;
> >>>>  	pap->ndels = ndels;
> >>>> +	free(zheplist);
> >>>>  	return NULL;
> >>>>  }
> >>>>  
> >>>> -- 
> >>>> 2.7.4
> >>>>
> >>>>
> >>>
> >>
> > 
> 

