From: Waiman Long <waiman.long@hpe.com>
To: Jan Kara <jack@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
Jan Kara <jack@suse.com>, Jeff Layton <jlayton@poochiereds.net>,
"J. Bruce Fields" <bfields@fieldses.org>,
Tejun Heo <tj@kernel.org>,
Christoph Lameter <cl@linux-foundation.org>,
<linux-fsdevel@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Andi Kleen <andi@firstfloor.org>,
Dave Chinner <dchinner@redhat.com>,
Scott J Norton <scott.norton@hp.com>,
Douglas Hatch <doug.hatch@hp.com>
Subject: Re: [PATCH v3 1/3] lib/percpu-list: Per-cpu list with associated per-cpu locks
Date: Wed, 24 Feb 2016 14:51:17 -0500 [thread overview]
Message-ID: <56CE09B5.2000602@hpe.com> (raw)
In-Reply-To: <20160224075622.GA10096@quack.suse.cz>
On 02/24/2016 02:56 AM, Jan Kara wrote:
> On Tue 23-02-16 14:04:30, Waiman Long wrote:
>> Linked list is used everywhere in the Linux kernel. However, if many
>> threads are trying to add or delete entries into the same linked list,
>> it can create a performance bottleneck.
>>
>> This patch introduces a new per-cpu list subystem with associated
>> per-cpu locks for protecting each of the lists individually. This
>> allows list entries insertion and deletion operations to happen in
>> parallel instead of being serialized with a global list and lock.
>>
>> List entry insertion is strictly per cpu. List deletion, however, can
>> happen in a cpu other than the one that did the insertion. So we still
>> need lock to protect the list. Because of that, there may still be
>> a small amount of contention when deletion is being done.
>>
>> A new header file include/linux/percpu-list.h will be added with the
>> associated pcpu_list_head and pcpu_list_node structures. The following
>> functions are provided to manage the per-cpu list:
>>
>> 1. int init_pcpu_list_head(struct pcpu_list_head **ppcpu_head)
>> 2. void pcpu_list_add(struct pcpu_list_node *node,
>> struct pcpu_list_head *head)
>> 3. void pcpu_list_del(struct pcpu_list *node)
>>
>> Iteration of all the list entries within a group of per-cpu
>> lists is done by calling either the pcpu_list_iterate() or
>> pcpu_list_iterate_safe() functions in a while loop. They correspond
>> to the list_for_each_entry() and list_for_each_entry_safe() macros
>> respectively. The iteration states are keep in a pcpu_list_state
>> structure that is passed to the iteration functions.
>>
>> Signed-off-by: Waiman Long<Waiman.Long@hpe.com>
> Two comments below.
>
>> +/*
>> + * Helper function to find the first entry of the next per-cpu list
>> + * It works somewhat like for_each_possible_cpu(cpu).
>> + *
>> + * Return: true if the entry is found, false if all the lists exhausted
>> + */
>> +static __always_inline bool
>> +__pcpu_list_next_cpu(struct pcpu_list_head *head, struct pcpu_list_state *state)
>> +{
>> + if (state->lock)
>> + spin_unlock(state->lock);
>> +next_cpu:
>> + /*
>> + * for_each_possible_cpu(cpu)
>> + */
>> + state->cpu = cpumask_next(state->cpu, cpu_possible_mask);
>> + if (state->cpu>= nr_cpu_ids)
>> + return false; /* All the per-cpu lists iterated */
>> +
>> + state->head =&per_cpu_ptr(head, state->cpu)->list;
>> + state->lock =&per_cpu_ptr(head, state->cpu)->lock;
>> + state->curr = list_entry(state->head->next,
>> + struct pcpu_list_node, list);
>> + if (&state->curr->list == state->head)
>> + goto next_cpu;
> This might be more comprehensible as:
>
> if (list_empty(state->head))
> goto next_cpu;
>
> and you can do it just after updating state->head (no need to init
> state->lock& state->curr if the list is empty).
Thank for the suggestion. Will change the code accordingly.
> Another note: Initialization of state->curr is IMO racy - you need to hold
> state->lock to grab state->curr reliably, don't you? Otherwise someone can
> remove the entry while you are working with it. So you need to move that
> down just before returning.
Right. I will move the initialization of state->curr after the spin_lock().
>> +
>> + spin_lock(state->lock);
>> + return true;
>> +}
>> +#endif /* NR_CPUS == 1 */
> ...
>
>> +/*
>> + * Delete a node from a percpu list
>> + *
>> + * We need to check the lock pointer again after taking the lock to guard
>> + * against concurrent delete of the same node. If the lock pointer changes
>> + * (becomes NULL or to a different one), we assume that the deletion was done
>> + * elsewhere.
>> + */
>> +void pcpu_list_del(struct pcpu_list_node *node)
>> +{
>> + spinlock_t *lock = READ_ONCE(node->lockptr);
>> +
>> + if (unlikely(!lock))
>> + return;
>> +
>> + spin_lock(lock);
>> + if (likely(lock == node->lockptr)) {
>> + list_del_init(&node->list);
>> + node->lockptr = NULL;
>> + }
> But someone changing lockptr under your hands would mean that there are
> two processes racing to remove entries and that would generally point to a
> problem (and likely use-after-free) in the caller, won't it? Or do you have
> some particular usecase in mind?
>
> Honza
>
This is just defensive programming to guard against unforeseen case. I
don't have any particular use case in mind that will make that happen.
Maybe I should put a WARN_ON if this really happens.
Cheers,
Longman
next prev parent reply other threads:[~2016-02-24 19:51 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-23 19:04 [PATCH v3 0/3] vfs: Use per-cpu list for SB's s_inodes list Waiman Long
2016-02-23 19:04 ` [PATCH v3 1/3] lib/percpu-list: Per-cpu list with associated per-cpu locks Waiman Long
2016-02-24 2:00 ` Boqun Feng
2016-02-24 4:01 ` Waiman Long
2016-02-24 7:56 ` Jan Kara
2016-02-24 19:51 ` Waiman Long [this message]
2016-02-23 19:04 ` [PATCH v3 2/3] fsnotify: Simplify inode iteration on umount Waiman Long
2016-02-23 19:04 ` [PATCH v3 3/3] vfs: Use per-cpu list for superblock's inode list Waiman Long
2016-02-24 8:28 ` Jan Kara
2016-02-24 8:36 ` Ingo Molnar
2016-02-24 8:58 ` Jan Kara
2016-02-25 8:06 ` Ingo Molnar
2016-02-25 14:43 ` Waiman Long
2016-02-24 20:23 ` Waiman Long
2016-02-25 14:50 ` Waiman Long
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56CE09B5.2000602@hpe.com \
--to=waiman.long@hpe.com \
--cc=andi@firstfloor.org \
--cc=bfields@fieldses.org \
--cc=cl@linux-foundation.org \
--cc=dchinner@redhat.com \
--cc=doug.hatch@hp.com \
--cc=jack@suse.com \
--cc=jack@suse.cz \
--cc=jlayton@poochiereds.net \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=scott.norton@hp.com \
--cc=tj@kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).