linux-fsdevel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Waiman Long <Waiman.Long@hpe.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Jan Kara <jack@suse.com>, Jeff Layton <jlayton@poochiereds.net>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andi Kleen <andi@firstfloor.org>,
	Dave Chinner <dchinner@redhat.com>,
	Scott J Norton <scott.norton@hp.com>,
	Douglas Hatch <doug.hatch@hp.com>
Subject: Re: [PATCH v3 1/3] lib/percpu-list: Per-cpu list with associated per-cpu locks
Date: Wed, 24 Feb 2016 08:56:22 +0100
Message-ID: <20160224075622.GA10096@quack.suse.cz>
In-Reply-To: <1456254272-42313-2-git-send-email-Waiman.Long@hpe.com>

On Tue 23-02-16 14:04:30, Waiman Long wrote:
> Linked list is used everywhere in the Linux kernel. However, if many
> threads are trying to add or delete entries into the same linked list,
> it can create a performance bottleneck.
> 
> This patch introduces a new per-cpu list subsystem with associated
> per-cpu locks for protecting each of the lists individually. This
> allows list entries insertion and deletion operations to happen in
> parallel instead of being serialized with a global list and lock.
> 
> List entry insertion is strictly per cpu. List deletion, however, can
> happen in a cpu other than the one that did the insertion. So we still
> need lock to protect the list. Because of that, there may still be
> a small amount of contention when deletion is being done.
> 
> A new header file include/linux/percpu-list.h will be added with the
> associated pcpu_list_head and pcpu_list_node structures. The following
> functions are provided to manage the per-cpu list:
> 
>  1. int init_pcpu_list_head(struct pcpu_list_head **ppcpu_head)
>  2. void pcpu_list_add(struct pcpu_list_node *node,
> 		       struct pcpu_list_head *head)
>  3. void pcpu_list_del(struct pcpu_list_node *node)
> 
> Iteration of all the list entries within a group of per-cpu
> lists is done by calling either the pcpu_list_iterate() or
> pcpu_list_iterate_safe() functions in a while loop. They correspond
> to the list_for_each_entry() and list_for_each_entry_safe() macros
> respectively. The iteration states are kept in a pcpu_list_state
> structure that is passed to the iteration functions.
> 
> Signed-off-by: Waiman Long <Waiman.Long@hpe.com>

Two comments below.

> +/*
> + * Helper function to find the first entry of the next per-cpu list
> + * It works somewhat like for_each_possible_cpu(cpu).
> + *
> + * Return: true if the entry is found, false if all the lists exhausted
> + */
> +static __always_inline bool
> +__pcpu_list_next_cpu(struct pcpu_list_head *head, struct pcpu_list_state *state)
> +{
> +	if (state->lock)
> +		spin_unlock(state->lock);
> +next_cpu:
> +	/*
> +	 * for_each_possible_cpu(cpu)
> +	 */
> +	state->cpu = cpumask_next(state->cpu, cpu_possible_mask);
> +	if (state->cpu >= nr_cpu_ids)
> +		return false;	/* All the per-cpu lists iterated */
> +
> +	state->head = &per_cpu_ptr(head, state->cpu)->list;
> +	state->lock = &per_cpu_ptr(head, state->cpu)->lock;
> +	state->curr = list_entry(state->head->next,
> +				 struct pcpu_list_node, list);
> +	if (&state->curr->list == state->head)
> +		goto next_cpu;

This might be more comprehensible as:

	if (list_empty(state->head))
		goto next_cpu;

and you can do it just after updating state->head (no need to init
state->lock & state->curr if the list is empty).

Another note: Initialization of state->curr is IMO racy - you need to hold
state->lock to grab state->curr reliably, don't you? Otherwise someone can
remove the entry while you are working with it. So you need to move that
down, after spin_lock(state->lock) and just before returning.
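
To illustrate the ordering I mean, here is a rough userspace sketch. It is
not the patch's code: pthread mutexes stand in for spinlocks, a fixed
NR_CPUS array stands in for per-cpu data, and all sketch_* names are made up
for this example. One deliberate difference from the list_empty() hint
above: the sketch tests for emptiness after taking the lock, since an
unlocked read would be a data race in portable C. What it shares with the
suggestion is that curr is only read with the lock held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* assumption: fixed CPU count for the sketch */

struct list_node { struct list_node *next, *prev; };

struct sketch_head {			/* one instance per "CPU" */
	struct list_node list;
	pthread_mutex_t lock;
};

struct sketch_state {
	int cpu;			/* -1 before the first call */
	pthread_mutex_t *lock;		/* lock currently held, or NULL */
	struct list_node *head;
	struct list_node *curr;
};

static void sketch_init(struct sketch_head *heads)
{
	for (int i = 0; i < NR_CPUS; i++) {
		heads[i].list.next = heads[i].list.prev = &heads[i].list;
		pthread_mutex_init(&heads[i].lock, NULL);
	}
}

/* Append a node to one CPU's list under that list's lock. */
static void sketch_add(struct list_node *n, struct sketch_head *h)
{
	pthread_mutex_lock(&h->lock);
	n->prev = h->list.prev;
	n->next = &h->list;
	h->list.prev->next = n;
	h->list.prev = n;
	pthread_mutex_unlock(&h->lock);
}

/*
 * Advance to the next non-empty list. On success the list's lock is held
 * and st->curr was read under that lock. Returns false, with no lock held,
 * once all lists are exhausted.
 */
static bool sketch_next_cpu(struct sketch_head *heads, struct sketch_state *st)
{
	assert(heads && st);

	if (st->lock) {			/* drop the previous list's lock */
		pthread_mutex_unlock(st->lock);
		st->lock = NULL;
	}
	while (++st->cpu < NR_CPUS) {
		struct sketch_head *h = &heads[st->cpu];

		pthread_mutex_lock(&h->lock);
		if (h->list.next != &h->list) {	/* not empty */
			st->head = &h->list;
			st->lock = &h->lock;
			st->curr = h->list.next; /* read with lock held */
			return true;
		}
		pthread_mutex_unlock(&h->lock);	/* empty, try next CPU */
	}
	return false;
}
```

A caller would start with st.cpu = -1 and st.lock = NULL and call
sketch_next_cpu() until it returns false.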

> +
> +	spin_lock(state->lock);
> +	return true;
> +}
> +#endif /* NR_CPUS == 1 */

...

> +/*
> + * Delete a node from a percpu list
> + *
> + * We need to check the lock pointer again after taking the lock to guard
> + * against concurrent delete of the same node. If the lock pointer changes
> + * (becomes NULL or to a different one), we assume that the deletion was done
> + * elsewhere.
> + */
> +void pcpu_list_del(struct pcpu_list_node *node)
> +{
> +	spinlock_t *lock = READ_ONCE(node->lockptr);
> +
> +	if (unlikely(!lock))
> +		return;
> +
> +	spin_lock(lock);
> +	if (likely(lock == node->lockptr)) {
> +		list_del_init(&node->list);
> +		node->lockptr = NULL;
> +	}

But someone changing lockptr under your hands would mean that there are
two processes racing to remove the same entry, and that would generally
point to a problem (and likely use-after-free) in the caller, wouldn't it?
Or do you have some particular use case in mind?
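
For reference, this is roughly how I read the recheck, as a userspace
sketch with made-up names (sketch_node, sketch_add, sketch_del), a pthread
mutex in place of the spinlock and C11 atomics in place of READ_ONCE(). It
only shows that a second delete of the same node becomes a no-op. It does
nothing about the node's lifetime, which is my concern above: if the first
deleter frees the node, the second one touches freed memory before it ever
gets to the check.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct list_node { struct list_node *next, *prev; };

struct sketch_node {
	struct list_node list;
	/* lock of the list the node is on, NULL when it is on no list */
	_Atomic(pthread_mutex_t *) lockptr;
};

/* Insert a node at the front of a list and remember that list's lock. */
static void sketch_add(struct sketch_node *node, struct list_node *head,
		       pthread_mutex_t *lock)
{
	pthread_mutex_lock(lock);
	node->list.next = head->next;
	node->list.prev = head;
	head->next->prev = &node->list;
	head->next = &node->list;
	atomic_store_explicit(&node->lockptr, lock, memory_order_relaxed);
	pthread_mutex_unlock(lock);
}

/*
 * Delete a node. Returns 1 if this call unlinked it, 0 if it was already
 * off the list (lockptr was NULL or changed before we got the lock).
 */
static int sketch_del(struct sketch_node *node)
{
	pthread_mutex_t *lock;
	int removed = 0;

	assert(node);
	lock = atomic_load_explicit(&node->lockptr, memory_order_relaxed);
	if (!lock)
		return 0;

	pthread_mutex_lock(lock);
	/* Recheck under the lock: someone may have deleted it meanwhile. */
	if (lock == atomic_load_explicit(&node->lockptr,
					 memory_order_relaxed)) {
		node->list.prev->next = node->list.next;
		node->list.next->prev = node->list.prev;
		node->list.next = node->list.prev = &node->list;
		atomic_store_explicit(&node->lockptr,
				      (pthread_mutex_t *)NULL,
				      memory_order_relaxed);
		removed = 1;
	}
	pthread_mutex_unlock(lock);
	return removed;
}
```

Calling sketch_del() twice on the same live node unlinks it once and then
returns 0, which is the only case the recheck can make safe.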

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
