From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: linux-security-module@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] TOMOYO: Add garbage collector support. (v3)
Date: Wed, 17 Jun 2009 09:31:22 -0700 [thread overview]
Message-ID: <20090617163122.GC6767@linux.vnet.ibm.com> (raw)
In-Reply-To: <200906172019.EAE00032.OLOJFQVMtFOFSH@I-love.SAKURA.ne.jp>
On Wed, Jun 17, 2009 at 08:19:07PM +0900, Tetsuo Handa wrote:
> Hello.
>
> This patchset adds garbage collector for TOMOYO.
> This time, I'm using some sort of RCU-like approach instead of cookie-list
> approach.
>
> TOMOYO 1/3: Move sleeping operations to outside the semaphore.
> 6 files changed, 231 insertions(+), 345 deletions(-)
>
> TOMOYO 2/3: Replace tomoyo_save_name() with tomoyo_get_name()/tomoyo_put_name().
> 5 files changed, 70 insertions(+), 23 deletions(-)
>
> TOMOYO 3/3: Add RCU-like garbage collector.
> 7 files changed, 733 insertions(+), 358 deletions(-)
>
> Paul E. McKenney wrote ( http://lkml.org/lkml/2009/5/27/2 ) :
> > I would also recommend the three-part LWN series as a starting point:
> >
> > # http://lwn.net/Articles/262464/ (What is RCU, Fundamentally?)
> > # http://lwn.net/Articles/263130/ (What is RCU's Usage?)
> > # http://lwn.net/Articles/264090/ (What is RCU's API?)
>
> I've read these articles. They are very good.
Glad that they were helpful!!!
> I came up with an idea that we may be able to implement GC while readers are
> permitted to sleep but no read locks are required.
I believe you have a bug in your pseudocode -- please see below.
> The idea is to have two counters which hold the number of readers currently
> reading the list, one is active and the other is inactive. Reader increments
> the currently active counter before starts reading and decrements that counter
> after finished reading. GC swaps active counter and inactive counter and waits
> for previously active counter's count to become 0 before releasing elements
> removed from the list.
> Code is shown below.
>
> atomic_t users_counter[2];
> atomic_t users_counter_idx;
> DEFINE_MUTEX(updator_mutex);
> DEFINE_MUTEX(gc_mutex);
>
> --- reader ---
> {
> /* Get counter index. */
> int idx = atomic_read(&users_counter_idx);
> /* Lock counter. */
> atomic_inc(&users_counter[idx]);
> list_for_each_entry_rcu() {
> ... /* Allowed to sleep. */
> }
> /* Unlock counter. */
> atomic_dec(&users_counter[idx]);
> }
>
> --- writer ---
> {
> bool found = false;
> /* Get lock for writing. */
> mutex_lock(&updater_mutex);
> list_for_each_entry_rcu() {
> if (...)
> continue;
> found = true;
> break;
> }
> if (!found)
> list_add_rcu(element);
> /* Release lock for writing. */
> mutex_unlock(&updater_mutex);
> }
>
> --- garbage collector ---
> {
> bool element_deleted = false;
> /* Protect the counters from concurrent GC threads. */
> mutex_lock(&gc_mutex);
> /* Get lock for writing. */
> mutex_lock(&updater_mutex);
> list_for_each_entry_rcu() {
> if (...)
> continue;
> list_del_rcu(element);
> element_deleted = true;
> break;
> }
> /* Release lock for writing. */
> mutex_unlock(&updater_mutex);
> if (element_deleted) {
> /* Swap active counter. */
> const int idx = atomic_read(&users_counter_idx);
> atomic_set(&users_counter_idx, idx ^ 1);
> /*
> * Wait for readers who are using previously active counter.
> * This is similar to synchronize_rcu() while this code allows
> * readers to do operations which may sleep.
> */
> while (atomic_read(&users_counter[idx]))
> msleep(1000);
> /*
> * Nobody is using previously active counter.
> * Ready to release memory of elements removed before
> * previously active counter became inactive.
> */
> kfree(element);
> }
> mutex_unlock(&gc_mutex);
> }
Consider the following sequence of events:
o CPU 0 picks up users_counter_idx int local variable idx.
Let's assume that the value is zero.
o CPU 0 is now preempted, interrupted, or otherwise delayed.
o CPU 1 starts garbage collection, finding some elements to
delete, thus setting "element_deleted" to true.
o CPU 1 continues garbage collection, inverting the value of
users_counter_idx, so that the value is now one, waiting
for the value-zero readers, and freeing up the old elements.
o CPU 0 continues execution, first atomically incrementing
users_counter[0], then traversing the list, possibly sleeping.
o CPU 2 starts a new round of garbage collection, again finding
some elements to delete, and thus again setting
"elements_deleted" to true. One of the elements deleted
is the one that CPU 0 is currently referencing while asleep.
o CPU 2 continues garbage collection, inverting the value of
users_counter_idx, so that the value is now zero, waiting
for the value-one readers, and freeing up the old elements.
Note that CPU 0 is a value-zero reader, so that CPU 2 will
not wait on it.
CPU 2 therefore kfree()s the element that CPU 0 is currently
referencing.
o CPU 0 wakes up, and suffers possibly fatal disappointment upon
attempting to reference an element that has been freed -- and,
worse yet, possibly re-allocated as some other type of
structure.
Or am I missing something in your pseudocode?
Also, if you have lots of concurrent readers, you can suffer high memory
contention on the users_counter[] array, correct?
I recommend that you look into use of SRCU in this case. There is some
documentation at http://lwn.net/Articles/202847/, with an revised
version incorporating feedback from the LWN comments available at:
http://www.rdrop.com/users/paulmck/RCU/srcu.2007.01.14a.pdf
Well, all but one of the LWN comments -- someone posted one a couple of
months ago that I just now noticed.
Anyway, the general approach would be to make changes to your code
roughly as follows:
1. replace your users_counter and users_counter_idx with a
struct srcu_struct.
2. In the reader, replace the fetch from users_counter_idx and
the atomic_inc() with srcu_read_lock().
3. In the garbage collector, replace the fetch/update of
users_counter_idx and the "while" loop with synchronize_srcu().
> In this idea, GC's kfree() call may be deferred for unknown duration, but
> defer duration will not matter if we use a dedicated kernel thread for GC.
>
> I noticed that there is QRCU in the "RCU has a Family of Wait-to-Finish APIs"
> section. My idea seems to resemble QRCU except grace periods.
> But "Availability" field is empty. Oh, what happened to QRCU?
Last I knew, it was queued up in Jens Axboe's tree, awaiting a first
user. But the approach you have above looks to me like it will do fine
with SRCU.
Or is there some reason why SRCU does not work for you?
Thanx, Paul
next prev parent reply other threads:[~2009-06-17 16:31 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-06-17 11:19 [PATCH] TOMOYO: Add garbage collector support. (v3) Tetsuo Handa
2009-06-17 11:21 ` [PATCH 1/3] TOMOYO: Move sleeping operations to outside the semaphore Tetsuo Handa
2009-06-17 11:22 ` [PATCH 2/3] TOMOYO: Replace tomoyo_save_name() with tomoyo_get_name()/tomoyo_put_name() Tetsuo Handa
2009-06-17 11:23 ` [PATCH 3/3] TOMOYO: Add RCU-like garbage collector Tetsuo Handa
2009-06-17 12:28 ` [PATCH] TOMOYO: Add garbage collector support. (v3) Peter Zijlstra
2009-06-17 16:31 ` Paul E. McKenney [this message]
2009-06-18 5:34 ` Tetsuo Handa
2009-06-18 6:45 ` [PATCH 3/3] TOMOYO: Add SRCU based garbage collector Tetsuo Handa
2009-06-18 16:05 ` Paul E. McKenney
2009-06-18 15:28 ` [PATCH] TOMOYO: Add garbage collector support. (v3) Paul E. McKenney
2009-06-19 4:57 ` Tetsuo Handa
2009-06-20 1:28 ` Paul E. McKenney
2009-06-20 7:04 ` Tetsuo Handa
2009-06-21 4:07 ` Paul E. McKenney
-- strict thread matches above, loose matches on Subject: below --
2009-06-02 1:39 Tetsuo Handa
2009-06-02 1:57 ` Tetsuo Handa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090617163122.GC6767@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-security-module@vger.kernel.org \
--cc=penguin-kernel@I-love.SAKURA.ne.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox