From: Corey Minyard <minyard@acm.org>
To: Sergey Vlasov <vsu@altlinux.ru>
Cc: Greg KH <greg@kroah.com>, lkml <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] New operation for kref to help avoid locks
Date: Sat, 26 Feb 2005 16:23:04 -0600 [thread overview]
Message-ID: <4220F6C8.4020002@acm.org> (raw)
In-Reply-To: <20050226232026.5c12d5b0.vsu@altlinux.ru>
[-- Attachment #1: Type: text/plain, Size: 444 bytes --]
Sergey Vlasov wrote:
>On Sat, 26 Feb 2005 09:55:41 -0600 Corey Minyard wrote:
>
>
>
>>Greg,
>>
>>This is the patch for krefs that we talked about. If you don't like it
>>but like the docs, feel free just to take the documentation and cut out
>>the stuff at the end about the new operation.
>>
>>
>
>See below for comments to the patch.
>
>
>
Sergey, right on all counts. Here's a patch with the fixes you suggest.
Thanks,
-Corey
[-- Attachment #2: kref_checked.diff --]
[-- Type: text/plain, Size: 10637 bytes --]
Add a routine to kref that allows the kref_put() routine to be
unserialized even when the get routine attempts to kref_get()
an object without first holding a valid reference to it. This is
useful in situations where this happens multiple times without
freeing the object, as it will avoid having to do a lock/semaphore
except on the final kref_put().
This also adds some kref documentation to the Documentation
directory.
Signed-off-by: Corey Minyard <minyard@acm.org>
Index: linux-2.6.11-rc4/include/linux/kref.h
===================================================================
--- linux-2.6.11-rc4.orig/include/linux/kref.h
+++ linux-2.6.11-rc4/include/linux/kref.h
@@ -28,5 +28,11 @@
void kref_get(struct kref *kref);
void kref_put(struct kref *kref, void (*release) (struct kref *kref));
+/* Get a kref if the count is not already zero. This can be used to
+ avoid a lock in the routine that calls kref_put(). Returns 1 if
+ successful or zero if the count was already zero. See
+ Documentation/kref.txt for details on how to use this. */
+int kref_get_with_check(struct kref *kref);
+
#endif /* __KERNEL__ */
#endif /* _KREF_H_ */
Index: linux-2.6.11-rc4/lib/kref.c
===================================================================
--- linux-2.6.11-rc4.orig/lib/kref.c
+++ linux-2.6.11-rc4/lib/kref.c
@@ -52,6 +52,20 @@
release(kref);
}
+/**
+ * kref_get_with_check - increment refcount if the refcount is not already 0.
+ * @kref: object.
+ */
+int kref_get_with_check(struct kref *kref)
+{
+ if (atomic_inc_return(&kref->refcount) == 1) {
+ atomic_dec(&kref->refcount);
+ return 0;
+ }
+ return 1;
+}
+
EXPORT_SYMBOL(kref_init);
EXPORT_SYMBOL(kref_get);
EXPORT_SYMBOL(kref_put);
+EXPORT_SYMBOL(kref_get_with_check);
Index: linux-2.6.11-rc4/Documentation/kref.txt
===================================================================
--- /dev/null
+++ linux-2.6.11-rc4/Documentation/kref.txt
@@ -0,0 +1,283 @@
+krefs allow you to add reference counters to your objects. If you
+have objects that are used in multiple places and passed around, and
+you don't have refcounts, your code is almost certainly broken. If
+you want refcounts, krefs are the way to go.
+
+To use a kref, add a one to your data structures like:
+
+struct my_data
+{
+ .
+ .
+ struct kref refcount;
+ .
+ .
+};
+
+The kref can occur anywhere within the data structure.
+
+You must initialize the kref after you allocate it. To do this, call
+kref init as so:
+
+ struct my_data *data;
+
+ data = kmalloc(sizeof(*data), GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+ kref_init(&data->refcount);
+
+This sets the refcount in the kref to 1.
+
+Once you have a refcount, you must follow the following rules:
+
+1) If you make a non-temporary copy of a pointer, especially if
+ it can be passed to another thread of execution, you must
+ increment the refcount with kref_get() before passing it off:
+ kref_get(&data->refcount);
+ If you already have a valid pointer to a kref-ed structure (the
+ refcount cannot go to zero) you may do this without a lock.
+
+2) When you are done with a pointer, you must call kref_put():
+ kref_put(&data->refcount, data_release);
+ If this is the last reference to the pointer, the release
+ routine will be called. If the code never tries to get
+ a valid pointer to a kref-ed structure without already
+ holding a valid pointer, it is safe to do this without
+ a lock.
+
+3) If the code attempts to gain a reference to a kref-ed structure
+ without already holding a valid pointer, it must serialize access
+ where a kref_put() cannot occur during the kref_get(), and the
+ structure must remain valid during the kref_get().
+
+For example, if you allocate some data and then pass it to another
+thread to process:
+
+void data_release(struct kref *ref)
+{
+ struct my_data *data = container_of(ref, struct my_data, refcount);
+ kfree(data);
+}
+
+void more_data_handling(void *cb_data)
+{
+ struct my_data *data = cb_data;
+ .
+ . do stuff with data here
+ .
+ kref_put(data, data_release);
+}
+
+int my_data_handler(void)
+{
+ int rv = 0;
+ struct my_data *data;
+ struct task_struct *task;
+ data = kmalloc(sizeof(*data), GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+ kref_init(&data->refcount);
+
+ kref_get(&data->refcount);
+ task = kthread_run(more_data_handling, data, "more_data_handling");
+ if (task == ERR_PTR(-ENOMEM)) {
+ rv = -ENOMEM;
+ kref_put(&data->refcount);
+ goto out;
+ }
+
+ .
+ . do stuff with data here
+ .
+ out:
+ kref_put(data, data_release);
+ return rv;
+}
+
+This way, it doesn't matter what order the two threads handle the
+data, the put handles knowing when the data is free and releasing it.
+The kref_get() does not require a lock, since we already have a valid
+pointer that we own a refcount for. The put needs no lock because
+nothing tries to get the data without already holding a pointer.
+
+Note that the "before" in rule 1 is very important. You should never
+do something like:
+
+ rv = kthread_run(more_data_handling, data, "more_data_handling");
+ if (rv)
+ goto out;
+ else
+ /* BAD BAD BAD - get is after the handoff */
+ kref_get(&data->refcount);
+
+Don't assume you know what you are doing and use the above construct.
+First of all, you may not know what you are doing. Second, you may
+know what you are doing (there are some situations where locking is
+involved where the above may be legal) but someone else who doesn't
+know what they are doing may change the code or copy the code. It's
+bad style. Don't do it.
+
+There are some situations where you can optimize the gets and puts.
+For instance, if you are done with an object and enqueuing it for
+something else or passing it off to something else, there is no reason
+to do a get then a put:
+
+ /* Silly extra get and put */
+ kref_get(&obj->ref);
+ enqueue(obj);
+ kref_put(&obj->ref, obj_cleanup);
+
+Just do the enqueue. A comment about this is always welcome:
+
+ enqueue(obj);
+ /* We are done with obj, so we pass our refcount off
+ to the queue. DON'T TOUCH obj AFTER HERE! */
+
+The last rule (rule 3) is the nastiest one to handle. Say, for
+instance, you have a list of items that are each kref-ed, and you wish
+to get the first one. You can't just pull the first item off the list
+and kref_get() it. That violates rule 3 because you are not already
+holding a valid pointer. You must add locks or semaphores. For
+instance:
+
+static DECLARE_MUTEX(sem);
+static LIST_HEAD(q);
+struct my_data
+{
+ struct kref refcount;
+ struct list_head link;
+};
+
+static struct my_data *get_entry()
+{
+ struct my_data *entry = NULL;
+ down(&sem);
+ if (!list_empty(&q)) {
+ entry = container_of(q.next, struct my_q_entry, link);
+ kref_get(&entry->refcount);
+ }
+ up(&sem);
+ return entry;
+}
+
+static void release_entry(struct kref *ref)
+{
+ struct my_data *entry = container_of(ref, struct my_data, refcount);
+
+ list_del(&entry->link);
+ kfree(entry);
+}
+
+static void put_entry(struct my_data *entry)
+{
+ down(&sem);
+ kref_put(&entry->refcount, release_entry);
+ up(&sem);
+}
+
+
+There is one exception to rule 3. kref_get_with_check() can be used
+to avoid a lock/semaphore in your routines that call kref_put(), except
+in the case that release is called. It is useful in situations where
+you might be trying to get an object that might have just gone free,
+as in our list example above.
+
+For instance, suppose you have a queue of items to process in your
+driver. The head of the queue is the current item being worked on.
+You driver has timers, interrupts, and user context threads that may
+all attempt to claim the head of the queue.
+
+So you define three routines, get_working_entry(), put_entry(), and
+entry_completed(), one to get the current queue entry that is being
+worked on, one to tell the system you are done working on that
+entry for now, and one to tell that the entry is completed and
+should be dequeued. Here is some example code:
+
+static LIST_HEAD(q);
+DEFINE_SPINLOCK(q_lock);
+
+struct my_q_entry
+{
+ struct list_head link;
+ struct kref refcount;
+ char data[10];
+ void (*done)(struct my_q_entry *);
+};
+
+struct my_q_entry *get_working_entry(void)
+{
+ unsigned long flags;
+ struct my_q_entry *entry = NULL;
+
+ spin_lock_irqsave(&q_lock, flags);
+ retry:
+ if (!list_empty(&q)) {
+ entry = container_of(q.next, struct my_q_entry, link);
+ if (!kref_get_with_check(&entry->refcount)) {
+ /* Entry's refcount has been deleted, but it
+ * has not yet been removed from the list. We
+ * raced with put_entry and are between the
+ * kref_put() decrementing the count and the
+ * release routine claiming the lock. Just
+ * remove it and retry. */
+ list_del_init(&entry->link);
+ entry = NULL;
+ goto retry;
+ }
+ }
+ spin_lock_irqrestore(&q_lock, flags);
+ return entry;
+}
+
+void release_entry(struct kref *ref)
+{
+ unsigned long flags;
+ struct my_q_entry *entry = container_of(ref, struct my_q_entry,
+ refcount);
+
+ spin_lock_irqsave(&q_lock, flags);
+ list_del(&entry->link);
+ spin_lock_irqrestore(&q_lock, flags);
+ entry->done(entry);
+
+ entry = get_working_entry();
+ if (entry) {
+ /* Start next entry here. */
+ put_entry(entry)
+ }
+}
+
+void put_entry(struct my_q_entry *entry)
+{
+ kref_put(&entry->refcount, release_entry);
+}
+
+void entry_completed(struct my_q_entry *entry)
+{
+ kref_put(&entry->refcount, release_entry);
+}
+
+
+Note that there is no lock around kref_put(), which should avoid a
+lock on most put operations, except for the last one. This works
+because if kref_put() sets the refcount to zero, one and only one
+thread of execution can be doing the kref_get(). If the kref_get()
+beats the kref_put(), everything is fine, the get routine has a
+valid entry. If the kref_put() beats the kref_get(), the kref
+value will be zero and kref_get_with_check() will fail, but the
+release_entry() code will block on the spinlock. So the entry will
+be dequeued and everything still works.
+
+This is generally only useful when you expect do have this happen many
+times for a single object. If 99% of the time you have one get and
+one put, it's probably not worth the complexity. If you generally
+have 20 gets and puts, it saves 19 locking operations, which is
+probably worth it.
+
+Other than this, normal kref rules apply. You can do a kref_get()
+without a lock if you are holding a valid kref pointer.
+
+
+Corey Minyard <minyard@acm.org>
+
+A lot of this was lifted from Greg KH's OLS presentation on krefs.
next prev parent reply other threads:[~2005-02-26 22:23 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-02-26 15:55 [PATCH] New operation for kref to help avoid locks Corey Minyard
2005-02-26 20:20 ` Sergey Vlasov
2005-02-26 22:23 ` Corey Minyard [this message]
2005-03-01 20:15 ` Greg KH
2005-03-01 21:02 ` Arjan van de Ven
2005-03-01 21:24 ` Greg KH
2005-03-01 21:54 ` Corey Minyard
2005-03-01 22:14 ` Arjan van de Ven
2005-03-01 23:35 ` Corey Minyard
2005-03-02 0:02 ` Nick Piggin
2005-03-02 0:28 ` Corey Minyard
2005-03-02 0:34 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4220F6C8.4020002@acm.org \
--to=minyard@acm.org \
--cc=greg@kroah.com \
--cc=linux-kernel@vger.kernel.org \
--cc=vsu@altlinux.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox