Date: Wed, 20 Jul 2016 15:53:56 -0400
From: Waiman Long
To: Tejun Heo
CC: Alexander Viro, Jan Kara, Jeff Layton, "J. Bruce Fields",
 Christoph Lameter, Ingo Molnar, Peter Zijlstra, Andi Kleen,
 Dave Chinner, Boqun Feng, Scott J Norton, Douglas Hatch
Subject: Re: [PATCH v3 1/4] lib/dlock-list: Distributed and lock-protected lists
Message-ID: <578FD6D4.4010005@hpe.com>
In-Reply-To: <20160719192333.GP3078@mtj.duckdns.org>
References: <1468604383-40362-1-git-send-email-Waiman.Long@hpe.com>
 <1468604383-40362-2-git-send-email-Waiman.Long@hpe.com>
 <20160718233803.GN3078@mtj.duckdns.org>
 <578E7497.30602@hpe.com>
 <20160719192333.GP3078@mtj.duckdns.org>

On 07/19/2016 03:23 PM, Tejun Heo wrote:
> Hello,
>
> On Tue, Jul 19, 2016 at 02:42:31PM -0400, Waiman Long wrote:
>> On 07/18/2016 07:38 PM, Tejun Heo wrote:
>>>> +struct dlock_list_node {
>>>> +	struct list_head list;
>>>> +	spinlock_t *lockptr;
>>>> +};
>>> Wouldn't it be better to point to dlock_list_percpu?
>> I could. However, the only thing that matters is the spinlock that
>> protects the list entry.
> Yeah, we can get back to this when it's actually necessary. It just
> looked a bit weird to me.
>
>>>> +/*
>>>> + * The dlock list iteration functions which return true if iteration has
>>>> + * to be continued.
>>>> + */
>>>> +extern bool dlock_list_next(struct dlock_list_head *dlist,
>>>> +			    struct dlock_list_iter *iter);
>>>> +extern bool dlock_list_next_safe(struct dlock_list_head *dlist,
>>>> +				 struct dlock_list_iter *iter);
>>> Why not return dlock_list_node * for the current node? That'd be more
>>> conventional and would allow dlock_list_iter to be opaque.
>> Yes, I can make it return dlock_list_node *.
>>
>> However, to make dlock_list_iter opaque, I will have to dynamically
>> allocate the structure. That will add an extra memory allocation and
>> free call, as well as handling of the error case of running out of
>> memory. I don't think that is worth doing at this point.
> Sure, keep it defined in the header file. Just don't require users to
> reach into it and add a comment saying that the struct is opaque to
> its users.

Will do so.

>>>> +int alloc_dlock_list_head(struct dlock_list_head *dlist)
>>>> +{
>>>> +	struct dlock_list_head dlist_tmp;
>>>> +	int cpu;
>>>> +
>>>> +	dlist_tmp.head = alloc_percpu(struct dlock_list_head_percpu);
>>>> +	if (!dlist_tmp.head)
>>>> +		return -ENOMEM;
>>>> +
>>>> +	for_each_possible_cpu(cpu) {
>>>> +		struct dlock_list_head_percpu *head;
>>>> +
>>>> +		head = per_cpu_ptr(dlist_tmp.head, cpu);
>>>> +		INIT_LIST_HEAD(&head->list);
>>>> +		head->lock = __SPIN_LOCK_UNLOCKED(&head->lock);
>>>> +		lockdep_set_class(&head->lock, &dlock_list_key);
>>>> +	}
>>>> +
>>>> +	dlist->head = dlist_tmp.head;
>>> Just use dlist->head directly or use a local __percpu head pointer?
>> I just don't want to expose the structure to the world until it is
>> fully initialized. If you think I am over-cautious, I can use
>> dlist->head as suggested.
> I don't think it makes any actual difference. No strong opinion
> either way.

Just use a local __percpu head pointer then?
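Something like this, perhaps (untested sketch; the temporary structure
is replaced by a local __percpu pointer):

int alloc_dlock_list_head(struct dlock_list_head *dlist)
{
	struct dlock_list_head_percpu __percpu *phead;
	int cpu;

	phead = alloc_percpu(struct dlock_list_head_percpu);
	if (!phead)
		return -ENOMEM;

	for_each_possible_cpu(cpu) {
		struct dlock_list_head_percpu *head = per_cpu_ptr(phead, cpu);

		INIT_LIST_HEAD(&head->list);
		head->lock = __SPIN_LOCK_UNLOCKED(&head->lock);
		lockdep_set_class(&head->lock, &dlock_list_key);
	}

	/* Publish the pointer only after all per-cpu lists are initialized */
	dlist->head = phead;
	return 0;
}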
>>>> +	return 0;
>>>> +}
>>>> +EXPORT_SYMBOL(alloc_dlock_list_head);
>>> Does this actually need to be exported? If so, it might be a better
>>> idea to start with EXPORT_SYMBOL_GPL().
>> For the current use case, we probably don't need to export the symbols.
>> Other use cases may require that. I will change it to use the GPL
>> version instead.
> If it's not immediately necessary, it's best to not export at all.

I will remove the EXPORT statements.

>>>> +void dlock_list_del(struct dlock_list_node *node)
>>>> +{
>>>> +	spinlock_t *lock = READ_ONCE(node->lockptr);
>>>> +
>>>> +	if (unlikely(!lock)) {
>>>> +		WARN_ONCE(1,
>>>> +		    "dlock_list_del: node 0x%lx has no associated lock\n",
>>>> +		    (unsigned long)node);
>>> Maybe "if (WARN_ONCE(!lock...)"? WARN_ONCE implies unlikely.
>> OK, will do that.
>>
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	spin_lock(lock);
>>>> +	if (likely(lock == node->lockptr)) {
>>>> +		list_del_init(&node->list);
>>>> +		node->lockptr = NULL;
>>>> +	} else {
>>>> +		/*
>>>> +		 * This path should never be executed.
>>>> +		 */
>>>> +		WARN_ON_ONCE(1);
>>>> +	}
>>> This still kinda bothers me because this pretty much requires the
>>> users to have strong synchronization around the operations and makes
>>> it unusable in situations where opportunistic behaviors are
>>> acceptable. It negates the usefulness quite a bit.
>> I understand your concern. I will make it retry again with the new lock.
> It doesn't necessarily have to retry but shouldn't break down when
> used in an opportunistic racy way - e.g. if adds and removes race, the
> order of operations isn't clearly defined; as such, any outcome is fine
> as long as the list maintains its integrity.

I am going to retry if the lock changes to a different one, and ignore
it if it becomes NULL.

>>>> +/**
>>>> + * dlock_list_next_safe - Removal-safe iterator of dlock list
>>>> + * @dlist: Pointer to the dlock_list_head structure
>>>> + * @iter : Pointer to the dlock list iterator structure
>>>> + * Return: true if the next entry is found, false if all the entries iterated
>>>> + *
>>>> + * The iterator has to be properly initialized before calling this function.
>>>> + * This iteration function is safe with respect to list entry removal.
>>>> + * However, it cannot correctly iterate newly added entries right after the
>>>> + * current one.
>>>> + */
>>> This still looks wrong to me. If you want to provide the two variants
>>> of iterations, can't you just implement one next function and build
>>> the two types of iterations on top of it?
>> I have been thinking about making dlock_list_next_cpu() the real
>> external function and having two inline functions that implement
>> dlock_list_next() and dlock_list_next_safe(). That may strike a better
>> balance between performance and code abstraction. I will do so if you
>> have no objection to that.
> Yeah, please give it a try. As mentioned in another reply, it'd
> probably be best to provide an iteration macro which encapsulates the
> whole thing.
>
> Thanks.

Yes, I am going to make the dlist_for_each_entry macro as suggested by Al.

Cheers,
Longman
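P.S. For reference, the retry logic I have in mind for dlock_list_del()
looks roughly like this (untested sketch):

void dlock_list_del(struct dlock_list_node *node)
{
	spinlock_t *lock = READ_ONCE(node->lockptr);

	while (lock) {
		spin_lock(lock);
		if (likely(lock == node->lockptr)) {
			list_del_init(&node->list);
			WRITE_ONCE(node->lockptr, NULL);
			spin_unlock(lock);
			break;
		}
		/*
		 * The node was moved to another list or deleted while
		 * we were waiting for the lock. Retry with the new
		 * lock, or give up if lockptr has become NULL.
		 */
		spin_unlock(lock);
		lock = READ_ONCE(node->lockptr);
	}
}

And the iteration macro could be something along these lines, assuming
dlock_list_next() is changed to return the next dlock_list_node (or NULL
at the end of the iteration) and the iterator caches the current node in
iter->curr (the names here are tentative):

/*
 * Iterate over all entries of a dlock list.
 * @pos   : the type * to use as a loop cursor
 * @iter  : pointer to an initialized dlock_list_iter
 * @dlist : pointer to the dlock_list_head
 * @member: name of the dlock_list_node member within the entry type
 */
#define dlist_for_each_entry(pos, iter, dlist, member)			\
	for ((iter)->curr = dlock_list_next(dlist, iter);		\
	     (iter)->curr &&						\
	     ((pos) = list_entry((iter)->curr, typeof(*(pos)), member), 1); \
	     (iter)->curr = dlock_list_next(dlist, iter))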