Re: [PATCH v3 1/4] lib/dlock-list: Distributed and lock-protected lists

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Waiman Long <waiman.long@hpe.com>
To: Tejun Heo <tj@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Jan Kara <jack@suse.com>, Jeff Layton <jlayton@poochiereds.net>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Christoph Lameter <cl@linux-foundation.org>,
	<linux-fsdevel@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andi Kleen <andi@firstfloor.org>,
	Dave Chinner <dchinner@redhat.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Scott J Norton <scott.norton@hpe.com>,
	Douglas Hatch <doug.hatch@hpe.com>
Subject: Re: [PATCH v3 1/4] lib/dlock-list: Distributed and lock-protected lists
Date: Wed, 20 Jul 2016 15:53:56 -0400	[thread overview]
Message-ID: <578FD6D4.4010005@hpe.com> (raw)
In-Reply-To: <20160719192333.GP3078@mtj.duckdns.org>

On 07/19/2016 03:23 PM, Tejun Heo wrote:
> Hello,
>
> On Tue, Jul 19, 2016 at 02:42:31PM -0400, Waiman Long wrote:
>> On 07/18/2016 07:38 PM, Tejun Heo wrote:
>>>> +struct dlock_list_node {
>>>> +	struct list_head list;
>>>> +	spinlock_t *lockptr;
>>>> +};
>>> Wouldn't it be better to point to dlock_list_percpu?
>> I could. However, the only thing that matter is the spinlock that protects
>> the list entry.
> Yeah, we can get back to this when it's actually necessary.  It just
> looked a bit weird to me.
>
>>>> +/*
>>>> + * The dlock list iteration functions which return true if iteration has
>>>> + * to be continued.
>>>> + */
>>>> +extern bool dlock_list_next(struct dlock_list_head *dlist,
>>>> +			    struct dlock_list_iter *iter);
>>>> +extern bool dlock_list_next_safe(struct dlock_list_head *dlist,
>>>> +				 struct dlock_list_iter *iter);
>>> Why not return dlock_list_node * for the current node?  That'd more
>>> conventional and allows dlock_list_iter to be opaque.
>> Yes, I can make it return dlock_list_node *.
>>
>> However, to make dlock_list_iter opaque, I will have to dynamically allocate
>> the structure. That will add an extra memory allocation and free calls as
>> well as handling the error case of running out of memory. I don't think that
>> is worth doing at this point.
> Sure, keep it defined in the header file.  Just don't require users to
> reach into it and add a comment saying that the struct is opaque to
> its users.

Will do so.

>>>> +int alloc_dlock_list_head(struct dlock_list_head *dlist)
>>>> +{
>>>> +	struct dlock_list_head dlist_tmp;
>>>> +	int cpu;
>>>> +
>>>> +	dlist_tmp.head = alloc_percpu(struct dlock_list_head_percpu);
>>>> +	if (!dlist_tmp.head)
>>>> +		return -ENOMEM;
>>>> +
>>>> +	for_each_possible_cpu(cpu) {
>>>> +		struct dlock_list_head_percpu *head;
>>>> +
>>>> +		head = per_cpu_ptr(dlist_tmp.head, cpu);
>>>> +		INIT_LIST_HEAD(&head->list);
>>>> +		head->lock = __SPIN_LOCK_UNLOCKED(&head->lock);
>>>> +		lockdep_set_class(&head->lock,&dlock_list_key);
>>>> +	}
>>>> +
>>>> +	dlist->head = dlist_tmp.head;
>>> Just use dlist->head directly or use local __perpcu head pointer?
>> I just don't want to expose the structure to world until it is fully
>> initialized. If you think I am over-cautious, I can use dlist->head as
>> suggested.
> I don't think it makes any actual difference.  No strong opinion
> either way.  Just use local __percpu head pointer then?
>
>>>> +	return 0;
>>>> +}
>>>> +EXPORT_SYMBOL(alloc_dlock_list_head);
>>> Does this actually need to be exported?  If so, it might be a better
>>> idea to start with EXPORT_SYMBOL_GPL().
>> For the current use case, we probably don't need to export the symbols.
>> Other use cases may require that. I will change it to use the version
>> instead.
> If it's not immediately necessary, it's best to not export at all.

I will remove the EXPORT statements.

>>>> +void dlock_list_del(struct dlock_list_node *node)
>>>> +{
>>>> +	spinlock_t *lock = READ_ONCE(node->lockptr);
>>>> +
>>>> +	if (unlikely(!lock)) {
>>>> +		WARN_ONCE(1,
>>>> +			"dlock_list_del: node 0x%lx has no associated lock\n",
>>>> +			(unsigned long)node);
>>> Maybe "if (WARN_ONCE(!lock...)"?  WARN_ONCE implies unlikely.
>> OK, will do that.
>>
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	spin_lock(lock);
>>>> +	if (likely(lock == node->lockptr)) {
>>>> +		list_del_init(&node->list);
>>>> +		node->lockptr = NULL;
>>>> +	} else {
>>>> +		/*
>>>> +		 * This path should never be executed.
>>>> +		 */
>>>> +		WARN_ON_ONCE(1);
>>>> +	}
>>> This still kinda bothers me because this pretty much requires the
>>> users to have strong synchronization around the operations and makes
>>> it unusable in situations where opportunistic behaviors are
>>> acceptable.  It negates the usefulness quite a bit.
>> I understand your concern. I will make it retry again with the new lock.
> It doesn't necessarily have to retry but shouldn't break down when
> used in an opportunistic racy way - e.g. if adds and removes race, the
> order of operations isn't clearly defined as such any outcome is fine
> as long as the list maintains its integrity.

I am going to retry it if the lock changes to different one and ignore 
it if it becomes NULL.

>>>> +/**
>>>> + * dlock_list_next_safe - Removal-safe iterator of dlock list
>>>> + * @dlist: Pointer to the dlock_list_head structure
>>>> + * @iter : Pointer to the dlock list iterator structure
>>>> + * Return: true if the next entry is found, false if all the entries iterated
>>>> + *
>>>> + * The iterator has to be properly initialized before calling this function.
>>>> + * This iteration function is safe with respect to list entry removal.
>>>> + * However, it cannot correctly iterate newly added entries right after the
>>>> + * current one.
>>>> + */
>>> This still looks wrong to me.  If you want to provide the two variants
>>> of iterations, can't you just implement one next function and build
>>> the two types of iterations on top of it?
>> I have been thinking about making dlock_list_next_cpu()  the real external
>> function and have 2 inline functions that implement dlock_list_next() and
>> dlock_list_next_safe(). That may strike a better balance between performance
>> and code abstraction. I will do so if you have no objection to that.
> Yeah, please give it a try.  As mentioned in another reply, it'd
> probably be best to provide an iteration macro which encapsulates the
> whole thing.
>
> Thanks.
>

Yes, I am go to make the dlist_for_each_entry macro as suggested by Al.

Cheers,
Longman

next prev parent reply	other threads:[~2016-07-20 19:54 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-15 17:39 [PATCH v3 0/4] vfs: Use dlock list for SB's s_inodes list Waiman Long
2016-07-15 17:39 ` [PATCH v3 1/4] lib/dlock-list: Distributed and lock-protected lists Waiman Long
2016-07-18 23:38   ` Tejun Heo
2016-07-19 18:42     ` Waiman Long
2016-07-19 19:23       ` Tejun Heo
2016-07-20 19:53         ` Waiman Long [this message]
2016-07-20 22:02         ` Waiman Long
2016-07-20 22:15       ` Waiman Long
2016-07-21  0:48         ` Christoph Lameter
2016-07-21  1:36           ` Waiman Long
2016-07-21  1:49           ` Dave Chinner
2016-07-22 20:43       ` Waiman Long
2016-07-19  5:00   ` Al Viro
2016-07-19 19:01     ` Waiman Long
2016-07-15 17:39 ` [PATCH v3 2/4] fsnotify: Simplify inode iteration on umount Waiman Long
2016-07-15 17:39 ` [PATCH v3 3/4] vfs: Remove unnecessary list_for_each_entry_safe() variants Waiman Long
2016-07-15 17:39 ` [PATCH v3 4/4] vfs: Use dlock list for superblock's inode list Waiman Long
2016-07-19  5:23   ` Al Viro
2016-07-19 18:35     ` Tejun Heo
2016-07-19 19:07     ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=578FD6D4.4010005@hpe.com \
    --to=waiman.long@hpe.com \
    --cc=andi@firstfloor.org \
    --cc=bfields@fieldses.org \
    --cc=boqun.feng@gmail.com \
    --cc=cl@linux-foundation.org \
    --cc=dchinner@redhat.com \
    --cc=doug.hatch@hpe.com \
    --cc=jack@suse.com \
    --cc=jlayton@poochiereds.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=scott.norton@hpe.com \
    --cc=tj@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.