From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <578E7497.30602@hpe.com>
Date: Tue, 19 Jul 2016 14:42:31 -0400
From: Waiman Long <waiman.long@hpe.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130109 Thunderbird/10.0.12
MIME-Version: 1.0
To: Tejun Heo
CC: Alexander Viro, Jan Kara, Jeff Layton, "J. Bruce Fields",
 Christoph Lameter, Ingo Molnar, Peter Zijlstra, Andi Kleen,
 Dave Chinner, Boqun Feng, Scott J Norton, Douglas Hatch
Subject: Re: [PATCH v3 1/4] lib/dlock-list: Distributed and lock-protected lists
References: <1468604383-40362-1-git-send-email-Waiman.Long@hpe.com>
 <1468604383-40362-2-git-send-email-Waiman.Long@hpe.com>
 <20160718233803.GN3078@mtj.duckdns.org>
In-Reply-To: <20160718233803.GN3078@mtj.duckdns.org>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/18/2016 07:38 PM, Tejun Heo wrote:
> Hello, Waiman.
>
> On Fri, Jul 15, 2016 at 01:39:40PM -0400, Waiman Long wrote:
>> Suggested-by: Tejun Heo
> Not sure I should be on suggested-by given that this wasn't my idea at
> all.

I put the tag there because of your suggestion to restructure the data
structure, which did make the patch better. I can remove that tag if you
think it is not appropriate.

>> +/*
>> + * include/linux/dlock-list.h
>> + *
>> + * A distributed (per-cpu) set of lists each of which is protected by its
>> + * own spinlock, but acts like a single consolidated list to the callers.
>> + *
>> + * The dlock_list_head_percpu structure contains the spinlock, the other
>> + * dlock_list_node structures only contains a pointer to the spinlock in
>> + * dlock_list_head_percpu.
>> + */
> The more I think about it, the more bothered I'm about the dlock_list
> name. For the most part, this isn't different from other percpu data
> structures in the kernel. Sure, it might benefit from doing Nth cpu,
> but so are other percpu data structures and it's not just "distributed
> lock" list either. The list itself is percpu, not just locking. Can
> we please go back to percpu_list? Christoph, what do you think?
>

As I said before, I don't mind reverting the name back to percpu_list. I
am just waiting for a final agreement.

>> +struct dlock_list_node {
>> +	struct list_head list;
>> +	spinlock_t *lockptr;
>> +};
> Wouldn't it be better to point to dlock_list_percpu?

I could. However, the only thing that matters is the spinlock that
protects the list entry.

>> +#define DLOCK_LIST_HEAD_PERCPU_INIT(name)		\
>> +	{						\
>> +		.list.prev = &name.list,		\
>> +		.list.next = &name.list,		\
> Use LIST_HEAD_INIT()? Also, why do we even need the initializers if
> the data structure can only be dynamically allocated. In fact, do
> the definitions even need to be exposed in the header?

I put it there for completeness sake. You are right. That macro isn't
currently used. I will remove it in the next iteration of the patch.

>> +		.list.lock = __SPIN_LOCK_UNLOCKED(name),	\
>> +	}
>> +
>> +/*
>> + * dlock list iteration state
>> + */
>> +struct dlock_list_iter {
>> +	int		cpu;
>			^
> I'm not sure lining up with space here is common in kernel.

OK, I will remove the extra spaces to make it more conformant to the
kernel style.

>> +	spinlock_t	*lock;
>> +	struct list_head	*head;	/* List head of current per-cpu list */
>> +	struct dlock_list_node	*curr;
>> +	struct dlock_list_node	*next;
>> +};
>> +
>> +#define DLOCK_LIST_ITER_INIT()			\
>			      ^
> Don't we usually omit () in these cases?

Good point.
Will remove the unneeded ().

>> +	{					\
>> +		.cpu = -1,			\
>> +	}
>> +
>> +#define DEFINE_DLOCK_LIST_ITER(s)		\
>> +	struct dlock_list_iter s = DLOCK_LIST_ITER_INIT()
>> +
>> +static inline void init_dlock_list_iter(struct dlock_list_iter *iter)
>> +{
>> +	*iter = (struct dlock_list_iter)DLOCK_LIST_ITER_INIT();
>> +}
>> +
>> +#define DLOCK_LIST_NODE_INIT(name)		\
>> +	{					\
>> +		.list.prev = &name.list,	\
>> +		.list.next = &name.list,	\
>			^
> LIST_HEAD_INIT()?

Will make the change.

>> +	}
>> +
>> +static inline void init_dlock_list_node(struct dlock_list_node *node)
>> +{
>> +	INIT_LIST_HEAD(&node->list);
>> +	node->lockptr = NULL;
> Why not use DLOCK_LIST_NODE_INIT()?
>

Yes, I can make the change.

>> +}
>> +
>> +/*
>> + * Check if all the dlock lists are empty
>> + *
>> + * This can be a pretty expensive function call. If this function is required
>> + * in a performance critical path, we may have to maintain a global count
>> + * of the list entries in the global dlock_list_head structure instead.
>> + */
> /** function comment please.

Sure.

>> +static inline bool dlock_list_empty(struct dlock_list_head *dlist)
>> +{
>> +	int cpu;
>> +
>> +	for_each_possible_cpu(cpu)
>> +		if (!list_empty(&per_cpu_ptr(dlist->head, cpu)->list))
>> +			return false;
>> +	return true;
>> +}
>> +
>> +/*
>> + * Allocation and freeing of dlock list
>> + */
>> +extern int  alloc_dlock_list_head(struct dlock_list_head *dlist);
>	    ^
> ditto with alignment

Will remove the extra space.

>> +extern void free_dlock_list_head(struct dlock_list_head *dlist);
>> +
>> +/*
>> + * The dlock list iteration functions which return true if iteration has
>> + * to be continued.
>> + */
>> +extern bool dlock_list_next(struct dlock_list_head *dlist,
>> +			    struct dlock_list_iter *iter);
>> +extern bool dlock_list_next_safe(struct dlock_list_head *dlist,
>> +				 struct dlock_list_iter *iter);
> Why not return dlock_list_node * for the current node? That'd more
> conventional and allows dlock_list_iter to be opaque.
Yes, I can make it return dlock_list_node *. However, to make
dlock_list_iter opaque, I will have to dynamically allocate the
structure. That will add an extra memory allocation and free call, as
well as handling of the error case of running out of memory. I don't
think that is worth doing at this point.

>> diff --git a/lib/dlock-list.c b/lib/dlock-list.c
>> new file mode 100644
>> index 0000000..af4a9f3
>> --- /dev/null
>> +++ b/lib/dlock-list.c
> ...
>> +int alloc_dlock_list_head(struct dlock_list_head *dlist)
>> +{
>> +	struct dlock_list_head dlist_tmp;
>> +	int cpu;
>> +
>> +	dlist_tmp.head = alloc_percpu(struct dlock_list_head_percpu);
>> +	if (!dlist_tmp.head)
>> +		return -ENOMEM;
>> +
>> +	for_each_possible_cpu(cpu) {
>> +		struct dlock_list_head_percpu *head;
>> +
>> +		head = per_cpu_ptr(dlist_tmp.head, cpu);
>> +		INIT_LIST_HEAD(&head->list);
>> +		head->lock = __SPIN_LOCK_UNLOCKED(&head->lock);
>> +		lockdep_set_class(&head->lock, &dlock_list_key);
>> +	}
>> +
>> +	dlist->head = dlist_tmp.head;
> Just use dlist->head directly or use local __percpu head pointer?

I just don't want to expose the structure to the world until it is fully
initialized. If you think I am over-cautious, I can use dlist->head as
suggested.

>> +	return 0;
>> +}
>> +EXPORT_SYMBOL(alloc_dlock_list_head);
> Does this actually need to be exported? If so, it might be a better
> idea to start with EXPORT_SYMBOL_GPL().

For the current use case, we probably don't need to export the symbols.
Other use cases may require that. I will change it to use the _GPL
version instead.

>> +void dlock_list_add(struct dlock_list_node *node,
>> +		    struct dlock_list_head *dlist)
>> +{
>> +	struct dlock_list_head_percpu *head;
>				      ^
> This probably requires __percpu annotation. Have you run it through
> sparse and checked for address space warnings?

You are right. I probably missed the __percpu annotation. I haven't run
it through sparse. I will do that to fix any warnings found.
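(For reference, the initialize-into-a-temporary-then-publish pattern
discussed above can be sketched in plain userspace C. All names below
are made up for illustration; this is an analog of the idea, not the
kernel code itself:)

```c
#include <stdlib.h>

/* Hypothetical userspace analog of alloc_dlock_list_head(): build the
 * structure in a local temporary and publish the pointer only after
 * every slot is fully initialized, so a concurrent reader can never
 * observe a half-initialized structure. */
struct fake_pcpu_head { int nr_entries; };
struct fake_pcpu_list { struct fake_pcpu_head *head; };

static int fake_pcpu_list_alloc(struct fake_pcpu_list *list, int ncpus)
{
	struct fake_pcpu_list tmp;
	int cpu;

	tmp.head = malloc(sizeof(*tmp.head) * (size_t)ncpus);
	if (!tmp.head)
		return -1;

	for (cpu = 0; cpu < ncpus; cpu++)
		tmp.head[cpu].nr_entries = 0;	/* per-slot init */

	list->head = tmp.head;			/* publish last */
	return 0;
}
```

(In the kernel the "publish" step would additionally need the right
memory-ordering guarantees if readers can race with allocation.)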
>> +
>> +	/*
>> +	 * Disable preemption to make sure that CPU won't gets changed.
>> +	 */
>> +	head = get_cpu_ptr(dlist->head);
>> +	spin_lock(&head->lock);
>> +	node->lockptr = &head->lock;
>> +	list_add(&node->list, &head->list);
>> +	spin_unlock(&head->lock);
>> +	put_cpu_ptr(dlist->head);
>> +}
>> +EXPORT_SYMBOL(dlock_list_add);
>> +
>> +/**
>> + * dlock_list_del - Delete a node from a dlock list
>> + * @node : The node to be deleted
>> + *
>> + * We need to check the lock pointer again after taking the lock to guard
>> + * against concurrent deletion of the same node. If the lock pointer changes
>> + * (becomes NULL or to a different one), we assume that the deletion was done
>> + * elsewhere. A warning will be printed if this happens as it is likely to be
>> + * a bug.
>> + */
>> +void dlock_list_del(struct dlock_list_node *node)
>> +{
>> +	spinlock_t *lock = READ_ONCE(node->lockptr);
>> +
>> +	if (unlikely(!lock)) {
>> +		WARN_ONCE(1,
>> +			"dlock_list_del: node 0x%lx has no associated lock\n",
>> +			(unsigned long)node);
> Maybe "if (WARN_ONCE(!lock...)"? WARN_ONCE implies unlikely.

OK, will do that.

>> +		return;
>> +	}
>> +
>> +	spin_lock(lock);
>> +	if (likely(lock == node->lockptr)) {
>> +		list_del_init(&node->list);
>> +		node->lockptr = NULL;
>> +	} else {
>> +		/*
>> +		 * This path should never be executed.
>> +		 */
>> +		WARN_ON_ONCE(1);
>> +	}
> This still kinda bothers me because this pretty much requires the
> users to have strong synchronization around the operations and makes
> it unusable in situations where opportunistic behaviors are
> acceptable. It negates the usefulness quite a bit.

I understand your concern. I will make it retry again with the new lock.

>> +	spin_unlock(lock);
>> +}
>> +EXPORT_SYMBOL(dlock_list_del);
>> +
>> +/*
>> + * Helper function to find the first entry of the next per-cpu list.
>> + * It works somewhat like for_each_possible_cpu(cpu).
>> + *
>> + * Return: true if the entry is found, false if all the lists exhausted
>> + *
>> + */
>> +static inline bool dlock_list_next_cpu(struct dlock_list_head *dlist,
>		   ^
> Just let the compiler figure it out.

Sure. Will remove the inline tag.

>> +				       struct dlock_list_iter *iter)
>> +{
>> +	if (iter->lock)
>> +		spin_unlock(iter->lock);
>> +next_cpu:
>> +	/*
>> +	 * for_each_possible_cpu(cpu)
>> +	 */
>> +	iter->cpu = cpumask_next(iter->cpu, cpu_possible_mask);
>> +	if (iter->cpu >= nr_cpu_ids)
>> +		return false;	/* All the per-cpu lists iterated */
>> +
>> +	iter->head = &per_cpu_ptr(dlist->head, iter->cpu)->list;
>> +	if (list_empty(iter->head))
>> +		goto next_cpu;
>> +
>> +	iter->lock = &per_cpu_ptr(dlist->head, iter->cpu)->lock;
>> +	spin_lock(iter->lock);
>> +	/*
>> +	 * There is a slight chance that the list may become empty just
>> +	 * before the lock is acquired. So an additional check is
>> +	 * needed to make sure that iter->curr points to a valid entry.
>> +	 */
>> +	if (list_empty(iter->head)) {
>> +		spin_unlock(iter->lock);
>> +		goto next_cpu;
>> +	}
>> +	iter->curr = list_entry(iter->head->next,
>> +				struct dlock_list_node, list);
>> +	return true;
>> +}
> ...
>> +/**
>> + * dlock_list_next_safe - Removal-safe iterator of dlock list
>> + * @dlist: Pointer to the dlock_list_head structure
>> + * @iter : Pointer to the dlock list iterator structure
>> + * Return: true if the next entry is found, false if all the entries iterated
>> + *
>> + * The iterator has to be properly initialized before calling this function.
>> + * This iteration function is safe with respect to list entry removal.
>> + * However, it cannot correctly iterate newly added entries right after the
>> + * current one.
>> + */
> This still looks wrong to me. If you want to provide the two variants
> of iterations, can't you just implement one next function and build
> the two types of iterations on top of it?
I have been thinking about making dlock_list_next_cpu() the real
external function, with two inline functions implementing
dlock_list_next() and dlock_list_next_safe() on top of it. That may
strike a better balance between performance and code abstraction. I
will do so if you have no objection.

Cheers,
Longman
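P.S. A rough userspace sketch of that split, in case it helps the
discussion (simplified to singly-linked sublists with no locking; all
names here are made up for illustration, not the actual patch code):

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_SUBLISTS 4

struct node { struct node *next; };
struct multilist { struct node *heads[NR_SUBLISTS]; };

struct iter {
	int idx;		/* current sublist, -1 before the first call */
	struct node *curr;	/* current entry */
	struct node *next;	/* prefetched entry (removal-safe walk) */
};

/* The one "real" next function, analogous to dlock_list_next_cpu():
 * advance to the head of the next non-empty sublist. */
static bool iter_next_sublist(struct multilist *ml, struct iter *it)
{
	while (++it->idx < NR_SUBLISTS) {
		if (ml->heads[it->idx]) {
			it->curr = ml->heads[it->idx];
			return true;
		}
	}
	return false;
}

/* Plain iterator: step within the current sublist, else move on. */
static bool ml_next(struct multilist *ml, struct iter *it)
{
	if (it->curr && it->curr->next) {
		it->curr = it->curr->next;
		return true;
	}
	return iter_next_sublist(ml, it);
}

/* Removal-safe iterator: prefetch the following entry before handing
 * the current one to the caller, so the caller may delete it->curr. */
static bool ml_next_safe(struct multilist *ml, struct iter *it)
{
	if (it->next) {
		it->curr = it->next;
	} else {
		it->curr = NULL;	/* never touch a possibly-freed node */
		if (!iter_next_sublist(ml, it))
			return false;
	}
	it->next = it->curr->next;
	return true;
}
```

The real dlock_list_next()/dlock_list_next_safe() would of course also
have to take and drop the per-cpu spinlock as they cross sublists, which
this sketch deliberately leaves out.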